huggingface / optimum-neuron

Easy, fast and very cheap training and inference on AWS Trainium and Inferentia chips.
Apache License 2.0

[Inference] Add `SentenceTransformers` support to `pipeline` for `feature-extraction` #583

Closed philschmid closed 1 month ago

philschmid commented 2 months ago

What does this PR do?

This PR adds a new, slightly modified `FeatureExtractionPipeline` from Transformers that allows us to use it with sentence-transformers models. When the `pipeline` object from Optimum is used, the library checks whether the model requested for feature-extraction is a sentence-transformers model, and if so, it returns the sentence embeddings instead of the first hidden state.

This is done by adding a new `is_sentence_transformer_model` check that determines whether the requested model is a transformers or a sentence-transformers model. If it is a sentence-transformers model, the pipeline uses `NeuronModelForSentenceTransformers`, and `FeatureExtractionPipeline` returns `model_outputs.sentence_embedding[0]` instead of `model_outputs[0]`.
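As a rough illustration of how such a check can work (a minimal sketch, not the PR's actual implementation: the helper name and the `modules.json` heuristic are assumptions based on the fact that sentence-transformers checkpoints ship a `modules.json` describing their pooling/normalization modules, while plain transformers checkpoints do not):

```python
import json
import os
import tempfile


def is_sentence_transformer_model(model_dir: str) -> bool:
    """Hypothetical sketch: treat a checkpoint as a sentence-transformers
    model if it contains the `modules.json` file that the
    sentence-transformers save format always writes."""
    return os.path.isfile(os.path.join(model_dir, "modules.json"))


# Usage sketch: a directory without modules.json is a plain transformers
# checkpoint; adding the file marks it as a sentence-transformers one.
with tempfile.TemporaryDirectory() as model_dir:
    print(is_sentence_transformer_model(model_dir))   # False
    with open(os.path.join(model_dir, "modules.json"), "w") as f:
        json.dump([], f)
    print(is_sentence_transformer_model(model_dir))   # True
```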

Example:

```python
from optimum.neuron import pipeline

input_shapes = {"batch_size": 1, "sequence_length": 64}
p = pipeline("feature-extraction", "sentence-transformers/all-MiniLM-L6-v2", export=True, input_shapes=input_shapes)
# Using Sentence Transformers compatible Feature extraction pipeline

p("test")
# [0.06765521317720413,
#  0.06349243223667145,
#  0.04871273413300514,
#  0.0793028473854065,
```

Validated with `torch.allclose`.
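The kind of parity check this implies can be sketched without the torch dependency (a minimal illustration, assuming the usual `torch.allclose` semantics of `|a - b| <= atol + rtol * |b|`; the reference values below are made up for demonstration, not taken from the PR):

```python
def allclose(a, b, rtol=1e-5, atol=1e-5):
    """Element-wise closeness check mirroring torch.allclose semantics:
    every pair must satisfy |x - y| <= atol + rtol * |y|."""
    return all(abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b))


# Compare the first values of the Neuron pipeline output against a
# hypothetical reference embedding (e.g. from sentence-transformers).
neuron_out = [0.06765521317720413, 0.06349243223667145, 0.04871273413300514]
reference = [0.06765522, 0.06349243, 0.04871273]
assert allclose(neuron_out, reference)
```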

Implications:

Alternative options:

philschmid commented 2 months ago

@tomaarsen can you also do a review?

HuggingFaceDocBuilderDev commented 2 months ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.