The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
Some weights of LSGXLMRobertaModel were not initialized from the model checkpoint at T-Systems-onsite/cross-en-es-roberta-sentence-transformer and are newly initialized: ['embeddings.global_embeddings.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
<class 'lsg_converter.xlm_roberta.modeling_lsg_xlm_roberta.LSGXLMRobertaModel'>
but when I use the model with a long text (it's an embedding model), I get:
{
"name": "RuntimeError",
"message": "The expanded size of the tensor (1193) must match the existing size (514) at non-singleton dimension 1. Target sizes: [2, 1193]. Tensor sizes: [1, 514]",
"stack": "---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
/home/dario/src/lsg_embeddings.ipynb Cell 3 line 31
     29 # Compute token embeddings
     30 with torch.no_grad():
---> 31     model_output = model(**encoded_input)
     33 # Perform pooling. In this case, max pooling.
     34 sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
File /usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:1518, in Module._wrapped_call_impl(self, *args, **kwargs)
1516 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1517 else:
-> 1518 return self._call_impl(*args, **kwargs)
File /usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:1527, in Module._call_impl(self, *args, **kwargs)
1522 # If we don't have any hooks, we want to skip the rest of the logic in
1523 # this function, and just call forward.
1524 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1525 or _global_backward_pre_hooks or _global_backward_hooks
1526 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527 return forward_call(*args, **kwargs)
1529 try:
1530 result = None
File ~/.local/lib/python3.10/site-packages/transformers/models/roberta/modeling_roberta.py:801, in RobertaModel.forward(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, encoder_hidden_states, encoder_attention_mask, past_key_values, use_cache, output_attentions, output_hidden_states, return_dict)
799 if hasattr(self.embeddings, \"token_type_ids\"):
800 buffered_token_type_ids = self.embeddings.token_type_ids[:, :seq_length]
--> 801 buffered_token_type_ids_expanded = buffered_token_type_ids.expand(batch_size, seq_length)
802 token_type_ids = buffered_token_type_ids_expanded
803 else:
RuntimeError: The expanded size of the tensor (1193) must match the existing size (514) at non-singleton dimension 1. Target sizes: [2, 1193]. Tensor sizes: [1, 514]"
}
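If I read the traceback right, the failure is in `RobertaModel.forward`: it slices the `token_type_ids` buffer registered at model creation (length 514, the original `max_position_embeddings`) and expands it to the new sequence length, but the converted model's buffer was apparently never resized. A minimal sketch reproducing the tensor error in plain PyTorch:

```python
import torch

# What transformers does internally (modeling_roberta.py, around line 801):
# slice the registered token_type_ids buffer to seq_length, then expand it
# to (batch_size, seq_length). The buffer is only 514 long, so slicing to
# 1193 still yields 514 columns and the expand fails.
buffered = torch.zeros(1, 514, dtype=torch.long)   # the model's registered buffer
batch_size, seq_length = 2, 1193                   # shapes from the long input
sliced = buffered[:, :seq_length]                  # shape stays (1, 514)
try:
    sliced.expand(batch_size, seq_length)
    print("expand succeeded")
except RuntimeError as e:
    print("expand failed:", e)
```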
Hi, I'm trying to convert this model (T-Systems-onsite/cross-en-es-roberta-sentence-transformer), and the conversion itself seems to work: the logs above show the converted LSGXLMRobertaModel class. But as soon as I run it on a long text I hit the RuntimeError above. With BERT models it works like a charm.
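A workaround that might help (untested against lsg_converter, and assuming the usual tokenizer dict output): pass `token_type_ids` explicitly, so `RobertaModel.forward` never touches the stale 514-long buffer. RoBERTa-style models only ever use token type 0, so an all-zeros tensor is safe:

```python
import torch

def with_token_type_ids(encoded_input):
    """Add an all-zeros token_type_ids tensor matching input_ids, so
    forward() skips the buffered token_type_ids expand path entirely."""
    encoded_input["token_type_ids"] = torch.zeros_like(encoded_input["input_ids"])
    return encoded_input

# Stand-in for tokenizer(...) output on a long batch; real usage would be
# encoded_input = tokenizer(texts, return_tensors="pt", padding=True)
encoded_input = {
    "input_ids": torch.ones(2, 1193, dtype=torch.long),
    "attention_mask": torch.ones(2, 1193, dtype=torch.long),
}
encoded_input = with_token_type_ids(encoded_input)
print(encoded_input["token_type_ids"].shape)  # torch.Size([2, 1193])
```

The model call `model(**encoded_input)` then receives a non-None `token_type_ids` and should not try to expand the registered buffer.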