tensorflow / hub

A library for transfer learning by reusing parts of TensorFlow models.
https://tensorflow.org/hub
Apache License 2.0

Inconsistent outputs from the CMLM (en-base) model. #834

Closed Nilabhra closed 2 years ago

Nilabhra commented 2 years ago

After loading the CMLM model (available here: https://tfhub.dev/google/universal-sentence-encoder-cmlm/en-base/1), computing the pooled_output repeatedly for the same input sequences yields slightly different embeddings each time.

This colab notebook replicates this issue: https://colab.research.google.com/drive/1iUUwNBQWaWJRgZ1ExMuzNhJyEfWCGyRJ?usp=sharing
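For reference, the repro boils down to embedding the same batch twice and measuring the largest elementwise difference between the two pooled outputs; with a deterministic model that difference would be exactly 0.0. A minimal sketch of the check (the `max_abs_diff` helper is hypothetical, and the two vectors below are illustrative stand-ins for repeated pooled_output runs, not real model output):

```python
def max_abs_diff(a, b):
    """Largest absolute elementwise difference between two embedding vectors."""
    return max(abs(x - y) for x, y in zip(a, b))

# Stand-in for two pooled_output vectors obtained from back-to-back
# runs of the model on the same sentence (values are illustrative only).
run1 = [0.1031, -0.2210, 0.5479]
run2 = [0.1031, -0.2210, 0.5480]  # tiny drift, as reported in the issue

drift = max_abs_diff(run1, run2)
print(f"max abs diff: {drift:.6f}")  # non-zero => outputs are not deterministic
```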

sayakpaul commented 2 years ago

I can confirm that this behavior is not present in other sentence encoders. Here's a Colab that verifies that: https://colab.research.google.com/gist/sayakpaul/c59d855e14a98a93362d3735ea67e6d2/scratchpad.ipynb.

sayakpaul commented 2 years ago

Ccing @WGierke

maringeo commented 2 years ago

Thank you @Nilabhra for filing this issue and @sayakpaul for the Colab! I'm not sure what is causing this so I reached out to the model authors. I'll update the thread once they respond.

maringeo commented 2 years ago

The model authors replied that the difference in embeddings is very small and likely due to some numerical instability of some underlying op. The difference should not affect any downstream usages, so there is no plan to fix this.

I'm going to mark this issue as closed, but feel free to reopen or comment if you have any follow-up questions.

sayakpaul commented 2 years ago

That explanation seems unlikely. As mentioned in my earlier comment, this is not a problem with the other sentence encoders.

maringeo commented 2 years ago

The sentence encoders in that Colab are using PyTorch (at least, that is what the output of pip install -U sentence-transformers suggests), while https://tfhub.dev/google/universal-sentence-encoder-cmlm/en-base/1 uses TensorFlow. That is a significant difference, which might explain why the other models appear numerically stable.

sayakpaul commented 2 years ago

Yeah, you are absolutely right. But the numerical instability is still a point of concern. We will run a few more experiments with https://tfhub.dev/google/universal-sentence-encoder-cmlm/en-base/1 to verify whether it leads to any performance degradation.
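One simple way to gauge whether the drift matters downstream is to check the cosine similarity between embeddings of the same sentence from repeated runs; if it stays very close to 1.0, the instability is unlikely to affect retrieval or clustering results. A sketch with stand-in vectors (the embeddings and the 1e-4 perturbation below are illustrative, not real CMLM output):

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    return dot / (norm_u * norm_v)

# Stand-in embeddings from two runs on the same sentence; a tiny
# additive perturbation models the reported drift.
run1 = [0.12, -0.48, 0.33, 0.71]
run2 = [x + 1e-4 for x in run1]

sim = cosine_similarity(run1, run2)
print(f"cosine similarity: {sim:.6f}")  # stays very close to 1.0
```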

Will you be able to communicate if something concerning comes up?

maringeo commented 2 years ago

Absolutely. If you spot any performance degradation, please report it and we'll forward the concerns to the model authors.