UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Selecting layer with SBERT model #906

Closed thak123 closed 3 years ago

thak123 commented 3 years ago

Hi

I am using the SentenceTransformer(NAMEOFMODEL) function to load the sentence transformer model.

How can I choose a specific layer's embedding? I know AutoModel with a Pooling layer and a WeightedLayerPooling layer can be used, but can SentenceTransformer be used directly?

nreimers commented 3 years ago

This is not so easy to achieve.

Option 1: You create a new network from scratch: https://www.sbert.net/docs/training/overview.html#creating-networks-from-scratch

And use the WeightedLayerPooling and specify which layer you want.
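A rough sketch of Option 1, following the linked docs (the model name bert-base-uncased and the layer_start value are only illustrative and assume a 12-layer encoder):

from sentence_transformers import SentenceTransformer, models

# Request all hidden states so the pooling modules see 'all_layer_embeddings'
word_embedding_model = models.Transformer(
    "bert-base-uncased", model_args={"output_hidden_states": True})
dim = word_embedding_model.get_word_embedding_dimension()

# Weighted average over layers 4..12 (index 0 is the embedding layer)
weighted_pooling = models.WeightedLayerPooling(dim, num_hidden_layers=12, layer_start=4)

# Mean pooling over tokens to produce the sentence embedding
pooling_model = models.Pooling(dim)

model = SentenceTransformer(modules=[word_embedding_model, weighted_pooling, pooling_model])
embedding = model.encode("An example sentence")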

Option 2: You configure the Transformer model to return all layer embeddings. These are then stored under all_layer_embeddings and could potentially be returned by the encode method when you set output_value to 'all_layer_embeddings'

thak123 commented 3 years ago

Does the encode method accept output_value='all_layer_embeddings'? I did not find any documentation on this part.

thak123 commented 3 years ago

word_embedding_model = models.Transformer("xlm-roberta-large", max_seq_length=512)
word_embedding_model.auto_model = zs.model.roberta
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
layers = models.WeightedLayerPooling(word_embedding_model.get_word_embedding_dimension(), num_hidden_layers=24, layer_start=12)
model = SentenceTransformer(modules=[word_embedding_model, layers, pooling_model], device=0)
model.encode("An example")

I saw this code in another post. Is WeightedLayerPooling required to be paired with pooling_model?

embed_model = "LaBSE"
word_embedding_model = models.Transformer(
    "sentence-transformers/{}".format(embed_model), model_args={"output_hidden_states": True})
# word_embedding_model.auto_model = zs.model.roberta
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension())
print(word_embedding_model.get_word_embedding_dimension())
layers = models.WeightedLayerPooling(
    word_embedding_model.get_word_embedding_dimension(), num_hidden_layers=12)
embedder = SentenceTransformer(
    modules=[word_embedding_model, pooling_model], device=config.device)
a = embedder.encode("An example")
embedder2 = SentenceTransformer(
    modules=[word_embedding_model, pooling_model, layers], device=config.device)
b = embedder2.encode("An example")
print(a[0], b[0])

I found that the vectors a and b are the same even though the modules differ.

thak123 commented 3 years ago

My fault, I placed the pooling model in the middle when it has to be at the end. I am now getting different representations. What should be specified in WeightedLayerPooling to select just a particular layer for either the sentence or the token representation?

For example, if I only want the sentence/token representation from layer 8 of the model.
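
A rough sketch of the corrected module order, plus one way a single layer (e.g. the 8th) could be selected by fixing the layer weights. This assumes a 12-layer encoder and the current WeightedLayerPooling implementation; the model name, layer index, and weights are only illustrative:

import torch
from sentence_transformers import SentenceTransformer, models

word_embedding_model = models.Transformer(
    "bert-base-uncased", model_args={"output_hidden_states": True})
dim = word_embedding_model.get_word_embedding_dimension()

# all_layer_embeddings holds num_hidden_layers + 1 tensors (index 0 is the
# embedding layer). With layer_start=8 on a 12-layer model, 5 layers remain;
# putting all the weight on the first of them keeps only layer 8.
layers = models.WeightedLayerPooling(
    dim, num_hidden_layers=12, layer_start=8,
    layer_weights=torch.nn.Parameter(torch.tensor([1.0, 0.0, 0.0, 0.0, 0.0])))

# Pooling must come last, after WeightedLayerPooling has rewritten token_embeddings
pooling_model = models.Pooling(dim)

model = SentenceTransformer(modules=[word_embedding_model, layers, pooling_model])
sentence_vec = model.encode("An example")  # sentence embedding built from layer 8 only
token_vecs = model.encode("An example", output_value="token_embeddings")  # per-token vectors from layer 8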

thak123 commented 3 years ago

@nreimers https://github.com/UKPLab/sentence-transformers/blob/9433e9b3d4d88ab286f6108f1cd9ee966a263a35/sentence_transformers/models/WeightedLayerPooling.py#L25 would changing this line to

all_layer_embedding = all_layer_embedding[self.layer_start:self.layer_end, :, :, :]  # start from the 4th layer's output

help me get the average over only the specified layers?

nreimers commented 3 years ago

Yes, this would be an option
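
For reference, a sketch of that modification as a small subclass; the class name and the layer_end parameter are illustrative, not part of the library:

import torch
from torch import nn
from sentence_transformers.models import WeightedLayerPooling

class BoundedLayerPooling(WeightedLayerPooling):
    """WeightedLayerPooling restricted to the layer slice [layer_start, layer_end)."""

    def __init__(self, word_embedding_dimension, num_hidden_layers=12,
                 layer_start=4, layer_end=None):
        layer_end = layer_end if layer_end is not None else num_hidden_layers + 1
        # One learnable weight per selected layer
        layer_weights = nn.Parameter(torch.ones(layer_end - layer_start, dtype=torch.float))
        super().__init__(word_embedding_dimension, num_hidden_layers, layer_start, layer_weights)
        self.layer_end = layer_end

    def forward(self, features):
        all_layer_embedding = torch.stack(features["all_layer_embeddings"])
        # Keep only the requested slice of layers
        all_layer_embedding = all_layer_embedding[self.layer_start:self.layer_end, :, :, :]

        weight_factor = self.layer_weights.unsqueeze(-1).unsqueeze(-1).unsqueeze(-1).expand(
            all_layer_embedding.size())
        weighted_average = (weight_factor * all_layer_embedding).sum(dim=0) / self.layer_weights.sum()

        features.update({"token_embeddings": weighted_average})
        return features

With layer_end = layer_start + 1 this reduces to a single layer, which would also cover the earlier question about using only the 8th layer's representation.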