SleepingSkipper opened this issue 1 year ago
This is a known issue with HF transformers: depending on the padding, the values can change slightly. It has no impact on downstream applications, as the differences are too small.
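For illustration, a minimal sketch of what this looks like in practice (the model name and sentences here are assumptions, not taken from this issue): encoding a sentence alone versus inside a batch with longer sentences changes the padding, and the resulting embeddings differ only within a tiny tolerance.

```python
from sentence_transformers import SentenceTransformer
import numpy as np

# Model and sentences are assumptions for illustration only.
model = SentenceTransformer("all-MiniLM-L6-v2")
sentences = ["a short sentence", "a considerably longer example sentence here"]

# Encode the short sentence alone (no padding needed) ...
alone = model.encode([sentences[0]])[0]
# ... and inside a batch, where it gets padded to the longer sentence.
batched = model.encode(sentences)[0]

print(np.array_equal(alone, batched))          # may be False bitwise
print(np.allclose(alone, batched, atol=1e-5))  # True: numerically equal
```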
I am linking some related issues: #1356, #1883
@nreimers In the issue with @SleepingSkipper's input, all the sentences have the same length. Would it still be possible to get different embeddings? According to my understanding, all the sentences should be padded to the length of the longest sentence, right?
Also, any comments on https://github.com/UKPLab/sentence-transformers/issues/1883#issuecomment-1561963647 ?
Thanks.
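For reference, a minimal sketch of the padding behaviour being described (the checkpoint is an assumption, not confirmed by this issue): with dynamic padding, every sequence in a batch is padded to the longest member of that batch, so sentences of equal length receive identical padding.

```python
from transformers import AutoTokenizer

# Assumed checkpoint for illustration.
tok = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

# padding=True pads every sequence to the longest one in the batch.
batch = tok(["one two three four", "one two"], padding=True, return_tensors="pt")
print(batch["input_ids"].shape)    # both rows share the longest length
print(batch["attention_mask"][1])  # trailing zeros mark the padding
```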
It is expected that `res[0] == res[3] == res[4]`, but in fact `res[0] == res[3]` and `res[0] != res[4]`.
It seems that when I set `batch_size=1`, `res[0] == res[3] == res[4]`. What makes the difference? Is there any workaround to get exactly the same results with a batch size other than 1?
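Not an official answer, but two workarounds that follow from the padding explanation above (the model name and texts are assumptions): encode with `batch_size=1` so no padding is ever added, or keep batching and compare embeddings within a small tolerance instead of testing exact equality.

```python
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model
texts = ["sentence a", "sentence b", "sentence a"]

# Workaround 1: batch_size=1 avoids padding entirely, so identical
# inputs yield bitwise-identical embeddings.
res = model.encode(texts, batch_size=1)
print(np.array_equal(res[0], res[2]))            # True

# Workaround 2: keep batching, but treat embeddings as equal when
# they match within a small tolerance.
res = model.encode(texts, batch_size=32)
print(np.allclose(res[0], res[2], atol=1e-5))    # True
```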