UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Not able to export sentence-transformers model to PyTorch. #820

Open sivers2021 opened 3 years ago

sivers2021 commented 3 years ago

Hi,

I would like to export a sentence-transformers model to TorchScript for use in plain PyTorch. However, I am not able to jit trace the stsb-distilbert-base model. Any help is much appreciated. Thanks, -s

Environment: sentence-transformers 0.4.1.2, torch 1.8.0, Python 3.6.7

from sentence_transformers import SentenceTransformer
import torch

model = SentenceTransformer('stsb-distilbert-base', device='cpu')
model.eval()

batch_size = 1
max_seq_length = 128
device = torch.device("cpu")
model.to(device)

input_ids = torch.zeros(batch_size, max_seq_length, dtype=torch.long).to(device)
input_type_ids = torch.zeros(batch_size, max_seq_length, dtype=torch.long).to(device)
input_mask = torch.zeros(batch_size, max_seq_length, dtype=torch.long).to(device)
input_features = {'input_ids': input_ids, 'token_type_ids': input_type_ids, 'attention_mask': input_mask}

traced_model = torch.jit.trace(model, example_inputs=(input_ids, input_type_ids, input_mask, input_features))

Traceback (most recent call last):
  File "/Users/sivers/xformers/githubissue.py", line 16, in <module>
    traced_model = torch.jit.trace(model, example_inputs=(input_ids, input_type_ids, input_mask, input_features))
  File "/Users/sivers/xformers/nlpenv/lib/python3.6/site-packages/torch/jit/_trace.py", line 742, in trace
    _module_class,
  File "/Users/sivers/xformers/nlpenv/lib/python3.6/site-packages/torch/jit/_trace.py", line 940, in trace_module
    _force_outplace,
  File "/Users/sivers/xformers/nlpenv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 887, in _call_impl
    result = self._slow_forward(*input, **kwargs)
  File "/Users/sivers/xformers/nlpenv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 860, in _slow_forward
    result = self.forward(*input, **kwargs)
TypeError: forward() takes 2 positional arguments but 5 were given

nreimers commented 3 years ago

I have not yet worked with jit trace.

The forward function only takes a dict as features, i.e. your input_features. The input_ids, input_type_ids, and input_mask in your example_inputs are not needed and cause the error.
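
To illustrate the dict-only interface, here is a minimal sketch; it assumes a recent sentence-transformers version where model.tokenize returns the features dict directly:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer('stsb-distilbert-base', device='cpu')
model.eval()

# tokenize() builds the single features dict that forward() expects
features = model.tokenize(["An example sentence"])

# the model is called with exactly one argument: the features dict
out = model(features)
print(out['sentence_embedding'].shape)  # e.g. torch.Size([1, 768])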

sivers2021 commented 3 years ago

@nreimers Thank you for looking into it. torch.jit.trace takes a tuple or list of tensors. It throws errors regardless:

from sentence_transformers import SentenceTransformer
import torch

model = SentenceTransformer('stsb-distilbert-base', device='cpu')
model.eval()

batch_size = 1
max_seq_length = 128
device = torch.device("cpu")
model.to(device)

input_ids = torch.zeros(batch_size, max_seq_length, dtype=torch.long).to(device)
input_type_ids = torch.zeros(batch_size, max_seq_length, dtype=torch.long).to(device)
input_mask = torch.zeros(batch_size, max_seq_length, dtype=torch.long).to(device)
input_features = (input_ids, input_type_ids, input_mask)

traced_model = torch.jit.trace(model, example_inputs=input_features)

/Users/sivers/xformers/nlpenv/bin/python3 /Users/sivers/xformers/githubissue.py
Traceback (most recent call last):
  File "/Users/sivers/xformers/githubissue.py", line 17, in <module>
    traced_model = torch.jit.trace(model, example_inputs=input_features)
  File "/Users/sivers/xformers/nlpenv/lib/python3.6/site-packages/torch/jit/_trace.py", line 742, in trace
    _module_class,
  File "/Users/sivers/xformers/nlpenv/lib/python3.6/site-packages/torch/jit/_trace.py", line 940, in trace_module
    _force_outplace,
  File "/Users/sivers/xformers/nlpenv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 887, in _call_impl
    result = self._slow_forward(*input, **kwargs)
  File "/Users/sivers/xformers/nlpenv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 860, in _slow_forward
    result = self.forward(*input, **kwargs)
TypeError: forward() takes 2 positional arguments but 4 were given

nreimers commented 3 years ago

According to https://pytorch.org/docs/stable/generated/torch.jit.trace.html, it also takes dictionaries.

As mentioned, forward accepts only dictionaries. So you must pass a dict.

joshdevins commented 3 years ago

@sivers2021 You will want to pass a dictionary inside a tuple. This satisfies both interfaces.

torch.jit.trace(model, ({'input_ids': input_ids, 'attention_mask': attention_mask},), strict=False)

See this line, which is what I believe @nreimers is referring to: https://github.com/UKPLab/sentence-transformers/blob/f7b4af5db2edcd0ad9cd878d6b8442abd776dd5c/sentence_transformers/models/Transformer.py#L47
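
Putting that together, a minimal sketch of this approach (the model name and sequence length are illustrative; DistilBERT takes no token_type_ids, so the dict only needs input_ids and attention_mask):

import torch
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('stsb-distilbert-base', device='cpu')
model.eval()

input_ids = torch.zeros(1, 128, dtype=torch.long)
attention_mask = torch.ones(1, 128, dtype=torch.long)

# one positional argument (the features dict), wrapped in a one-element
# tuple; strict=False because forward returns a dict
traced_model = torch.jit.trace(
    model,
    ({'input_ids': input_ids, 'attention_mask': attention_mask},),
    strict=False,
)

The traced module is then called the same way, with a single features dict, and returns a dict that includes 'sentence_embedding'.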

dhrubo-os commented 1 year ago

We are tracing SentenceTransformer models using this save_as_pt method.

But we recently found that our traced model won't accept a document whose token length exceeds 512.

We are seeing errors like: RuntimeError: The size of tensor a (650) must match the size of tensor b (512) at non-singleton dimension 1

Can I get any direction on what I need to do to make sure the traced model also supports truncation?

@nreimers @joshdevins

Thanks
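
Tracing captures only forward, not the tokenization and truncation that encode performs (see kalaracey's note below), so documents have to be truncated before they reach the traced model. A minimal sketch, assuming a Hugging Face tokenizer and a 512-token model limit (the model name and max_length here are illustrative, not from this thread):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/all-MiniLM-L6-v2')

docs = ["some very long document ..."]

# truncation=True caps each document at max_length tokens, mimicking
# what SentenceTransformer.encode does before calling forward
features = tokenizer(docs, padding=True, truncation=True,
                     max_length=512, return_tensors='pt')

# a traced model (produced as elsewhere in this thread) then only
# ever sees sequences it can handle:
# embeddings = traced_model(dict(features))['sentence_embedding']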

kalaracey commented 4 months ago

For anyone else who is trying this, I managed to get basic tracing to work like so:

import numpy as np
import torch
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Our sentences to encode
sentences = [
    "This framework generates embeddings for each input sentence",
    "Sentences are passed as a list of string.",
    "The quick brown fox jumps over the lazy dog."
]

# Sentences are encoded by calling model.encode()
embeddings = model.encode(sentences)

tokens = model.tokenize(sentences)
tokens = {k: v.to('mps') for k, v in tokens.items()}

# strict=False is necessary to avoid some warnings about returning a dict from forward.
traced_encode = torch.jit.trace_module(model, {'forward': tokens}, strict=False)
traced_embeddings = traced_encode(tokens)['sentence_embedding'].cpu().detach().numpy()

for sentence, embedding, traced_embedding in zip(sentences, embeddings, traced_embeddings):
    print("Sentence:", sentence)
    print("Max diff between embedding and traced_embedding: ", np.max(embedding - traced_embedding))
    print("")

which outputs

/Users/kal/Code/sentence-transformers/.venv/lib/python3.9/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
`SentenceTransformer._target_device` has been removed, please use `SentenceTransformer.device` instead.
`SentenceTransformer._target_device` has been removed, please use `SentenceTransformer.device` instead.
/Users/kal/Code/sentence-transformers/.venv/lib/python3.9/site-packages/transformers/modeling_utils.py:4371: FutureWarning: `_is_quantized_training_enabled` is going to be deprecated in transformers 4.39.0. Please use `model.hf_quantizer.is_trainable` instead
  warnings.warn(
`SentenceTransformer._target_device` has been removed, please use `SentenceTransformer.device` instead.
`SentenceTransformer._target_device` has been removed, please use `SentenceTransformer.device` instead.
Sentence: This framework generates embeddings for each input sentence
Max diff between embedding and traced_embedding:  0.0

Sentence: Sentences are passed as a list of string.
Max diff between embedding and traced_embedding:  4.0978193e-08

Sentence: The quick brown fox jumps over the lazy dog.
Max diff between embedding and traced_embedding:  5.2154064e-08

Note that I'm on a Mac, so if you're not, you might have to change the mps business (see the sketch below).
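
For example, a portable device pick might look like this (a sketch; it reuses the tokens dict from the snippet above):

import torch

# prefer Apple's MPS backend, then CUDA, then fall back to CPU
device = (
    "mps" if torch.backends.mps.is_available()
    else "cuda" if torch.cuda.is_available()
    else "cpu"
)
tokens = {k: v.to(device) for k, v in tokens.items()}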

For the curious, here's what the trace looks like.

Note that the traced forward does neither truncation nor tokenization; both happen in encode before the call to forward.

rogergheser commented 3 months ago

I'm also getting the same warning when calling encode.

FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True. warnings.warn(

I've searched for a force_download option but that is nowhere to be found.

This is the code causing the warning:

model = SentenceTransformer('sentence-transformers/stsb-xlm-r-multilingual')

embeddings1 = model.encode(sentences[0], convert_to_tensor=True, show_progress_bar=False)
embeddings2 = model.encode(sentences[1], convert_to_tensor=True, show_progress_bar=False)

tomaarsen commented 3 months ago

Hello!

This is caused by a recent huggingface_hub update, which deprecated the resume_download option somewhere in their code. That option was being used in transformers, but is no longer as of https://github.com/huggingface/transformers/pull/30620. However, that change has not yet been released, so loading any model with the most recent transformers and huggingface_hub versions now annoyingly shows

FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.

The tl;dr is that this warning will go away with the next transformers release, that you can avoid it by slightly downgrading your huggingface_hub version, and that you don't have to worry about it. Everything still works as intended.
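
For instance, pinning huggingface_hub below the release that introduced the deprecation should silence it, e.g. pip install "huggingface_hub<0.23.0" (assuming the warning was introduced in huggingface_hub 0.23.0).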

In other words, your code looks good and it should work correctly. Apologies for the confusion.