allenai/scibert
A BERT model for scientific text.
https://arxiv.org/abs/1903.10676
Apache License 2.0

Is it expected the same sentence gives different features? #115


towr commented 3 years ago

I'm a bit puzzled by something I encountered trying to encode sentences as embeddings. When I ran the sentences through the model one at a time, I got slightly different results from when I ran batches of sentences.

I've reduced an example down to:

from transformers import pipeline
import numpy as np

# The feature-extraction pipeline returns one embedding per token.
p = pipeline('feature-extraction', model='allenai/scibert_scivocab_uncased')
s = 'the scurvy dog walked home alone'.split()

for l in range(1, len(s) + 1):
    txt = ' '.join(s[:l])  # the first l words of the sentence

    res1 = p(txt)           # single sentence, first run
    res2 = p(txt)           # single sentence, second run
    res1_2 = p([txt, txt])  # the same sentence twice in one batch
    print(l, txt, len(res1[0]))
    # Token-by-token comparison: run 1 vs run 2, run 2 vs the first
    # batch entry, and the two identical batch entries against each other.
    print(all(np.allclose(i, j) for i, j in zip(res1[0], res2[0])),
          all(np.allclose(i, j) for i, j in zip(res2[0], res1_2[0])),
          all(np.allclose(i, j) for i, j in zip(res1_2[0], res1_2[1])))

The output I get is:

1 the 3
True False True
2 the scurvy 6
True True True
3 the scurvy dog 7
True False False
4 the scurvy dog walked 9
True False True
5 the scurvy dog walked home 10
True True False
6 the scurvy dog walked home alone 11
True True True

So running a single sentence through the model seems to give the same output each time, but if I run a batch containing the same sentence twice, the results are sometimes different, both between the two outputs within the batch and compared to the single-sentence case.

Is this expected/explainable?
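
One way to take the pipeline out of the equation would be to run the same comparison through the model directly; a minimal sketch of that idea (assuming torch and the same checkpoint, with variable names of my own invention):

import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained('allenai/scibert_scivocab_uncased')
model = AutoModel.from_pretrained('allenai/scibert_scivocab_uncased').eval()

txt = 'the scurvy dog walked home'

# Both batch entries tokenize to the same length, so padding adds nothing.
enc1 = tok(txt, return_tensors='pt')
enc2 = tok([txt, txt], return_tensors='pt', padding=True)

with torch.no_grad():
    out1 = model(**enc1).last_hidden_state  # shape (1, seq_len, 768)
    out2 = model(**enc2).last_hidden_state  # shape (2, seq_len, 768)

print((out1[0] - out2[0]).abs().max().item())  # single run vs first batch row
print((out2[0] - out2[1]).abs().max().item())  # the two identical batch rows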

Further context: I'm running this on CPU (laptop), Python 3.8.9, in a freshly installed venv. The difference usually shows up in only a few indices of the embedding and can be as large as 1e-3. It is negligible when comparing the embeddings by cosine distance, but I'd like to understand where it comes from before dismissing it.
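
For concreteness, here is a small sketch of how I'd quantify the per-token differences and cosine similarities (it reuses the pipeline p from the script above; the compare helper is hypothetical):

import numpy as np

def compare(a, b):
    # Max absolute elementwise difference and minimum per-token cosine
    # similarity between two equal-length lists of token embeddings.
    a, b = np.asarray(a), np.asarray(b)
    max_abs = np.abs(a - b).max()
    cos = (a * b).sum(axis=1) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))
    return max_abs, cos.min()

txt = 'the scurvy dog walked home'
single = p(txt)[0]       # token embeddings from a single-sentence call
batched = p([txt, txt])  # the same sentence twice in one batch

print(compare(single, batched[0]))      # abs diffs up to ~1e-3, cosine ~1.0
print(compare(batched[0], batched[1]))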