Hi @Cumberbatch08, sadly the USE papers (at least the ones I know) are extremely high-level and don't really go into the details. So it is unclear which architecture they used exactly and how the training was done (exact datasets, exact loss function, etc.).
Differences:
USE and SBERT both use transformer networks. For USE, it is sadly not clear how many layers they use (most technical details are not provided). USE was trained from scratch (as far as I can tell from the paper), while SBERT uses the BERT / RoBERTa pre-trained weights and just fine-tunes them to produce sentence embeddings.
I think the main difference is in the pre-training. USE uses a wide variety of datasets (exact details not provided), specifically targeted at generating sentence embeddings. BERT was pre-trained on a book corpus and on Wikipedia to produce a language model (see the BERT paper). SBERT then fine-tunes BERT to produce sensible sentence embeddings.
USE is in TensorFlow and tuning it for your use-case is not straightforward (the source code is not available, you only get the compiled model from tensorflow-hub). SBERT is based on PyTorch, and the goal of this repository is that fine-tuning for your use-case is as simple as possible (see the sketch below).
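To make that last point concrete, here is a minimal sketch of fine-tuning pre-trained BERT weights into a sentence embedder with sentence-transformers. The base model `bert-base-uncased`, the toy sentence pairs, and the hyperparameters are illustrative assumptions, not the exact setup from the paper:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, InputExample, losses

# Start from pre-trained BERT weights and add a mean-pooling layer on top
word_embedding_model = models.Transformer('bert-base-uncased', max_seq_length=128)
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# Toy sentence pairs with similarity labels in [0, 1] (replace with your own data)
train_examples = [
    InputExample(texts=['A man is eating food.', 'A man is eating a meal.'], label=0.9),
    InputExample(texts=['A man is eating food.', 'The girl is playing guitar.'], label=0.1),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)
train_loss = losses.CosineSimilarityLoss(model)

# Fine-tune the BERT weights so the cosine similarity of the embeddings matches the labels
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=10)
```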
haha, yes, I absolutely agree with what you said. USE doesn't publish many details, such as the layers, dataset, loss, etc. I found some information about the architecture: just as you said, the pre-training is probably what matters most.
Which would give better semantic search results: USE (https://tfhub.dev/google/universal-sentence-encoder/4) or the SBERT models (https://huggingface.co/sentence-transformers)?
@Gurutva SBERT works much better: https://arxiv.org/pdf/2104.08663v1.pdf
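For reference, a small semantic search sketch with sentence-transformers. The model name `all-MiniLM-L6-v2`, the corpus sentences, and `top_k` are placeholder assumptions; any pre-trained SBERT model from the hub works the same way:

```python
from sentence_transformers import SentenceTransformer, util

# Any pre-trained SBERT model from https://huggingface.co/sentence-transformers can be used here
model = SentenceTransformer('all-MiniLM-L6-v2')

corpus = [
    'A man is eating food.',
    'A woman is playing the violin.',
    'The new movie is awesome.',
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

query = 'Someone is having a meal.'
query_embedding = model.encode(query, convert_to_tensor=True)

# Cosine-similarity search over the corpus embeddings, returning the best matches
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(corpus[hit['corpus_id']], round(hit['score'], 3))
```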
First, many thanks for your paper and code. I read the Universal Sentence Encoder (USE) paper; its architecture is also Siamese-like and it also uses the SNLI dataset, yet your results are much better. So I'm very interested in your work.
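As a side note, the Siamese NLI training mentioned above can be reproduced in this repository with the `losses.SoftmaxLoss` objective (the same encoder is applied to both sentences, with a classification head over the combined embeddings). The toy premise/hypothesis pairs and the base model name below are illustrative assumptions only:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, InputExample, losses

# Siamese setup: the same BERT encoder + mean pooling is applied to both sentences
word_embedding_model = models.Transformer('bert-base-uncased')
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# Toy NLI-style pairs; labels: 0 = contradiction, 1 = entailment, 2 = neutral
train_examples = [
    InputExample(texts=['A man inspects a uniform.', 'The man is sleeping.'], label=0),
    InputExample(texts=['A soccer game with males playing.', 'Some men are playing a sport.'], label=1),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)

# Softmax classifier over the concatenated sentence embeddings (u, v, |u - v|)
train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=3,
)
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1)
```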