facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

mBART CC25: extract_features doesn't work #2178

Open shonenkov opened 4 years ago

shonenkov commented 4 years ago

## 🐛 Bug

### To Reproduce

Steps to reproduce the behavior (always include the command you ran):

  1. Download https://dl.fbaipublicfiles.com/fairseq/models/mbart/mbart.CC25.tar.gz
  2. Extract the tar.gz archive
  3. Run the following Python:

        from fairseq.models.bart import BARTModel

        model = BARTModel.from_pretrained(model_name_or_path='./mbart.CC25', layernorm_embedding=True)
        model.eval()

        tokens = model.encode('Hello world!')
        tokens
        >>> tensor([     0, 250029, 152261,   2389,      2])

        features = model.extract_features(tokens.unsqueeze(0))


    ---------------------------------------------------------------------------
    RuntimeError                              Traceback (most recent call last)
    <ipython-input> in <module>
    ----> 1 features = model.extract_features(tokens.unsqueeze(0))

    /usr/local/lib/python3.6/site-packages/fairseq/models/bart/hub_interface.py in extract_features(self, tokens, return_all_hiddens)
        157             prev_output_tokens=prev_output_tokens,
        158             features_only=True,
    --> 159             return_all_hiddens=return_all_hiddens,
        160         )
        161         if return_all_hiddens:

    /usr/local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
        530             result = self._slow_forward(*input, **kwargs)
        531         else:
    --> 532             result = self.forward(*input, **kwargs)
        533         for hook in self._forward_hooks.values():
        534             hook_result = hook(self, input, result)

    /usr/local/lib/python3.6/site-packages/fairseq/models/bart/model.py in forward(self, src_tokens, src_lengths, prev_output_tokens, features_only, classification_head_name, **kwargs)
         74             src_tokens,
         75             src_lengths=src_lengths,
    ---> 76             **kwargs,
         77         )
         78         x, extra = self.decoder(

    /usr/local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
        530             result = self._slow_forward(*input, **kwargs)
        531         else:
    --> 532             result = self.forward(*input, **kwargs)
        533         for hook in self._forward_hooks.values():
        534             hook_result = hook(self, input, result)

    /usr/local/lib/python3.6/site-packages/fairseq/models/transformer.py in forward(self, src_tokens, src_lengths, return_all_hiddens)
        404                 Only populated if *return_all_hiddens* is True.
        405         """
    --> 406         x, encoder_embedding = self.forward_embedding(src_tokens)
        407
        408         # B x T x C -> T x B x C

    /usr/local/lib/python3.6/site-packages/fairseq/models/transformer.py in forward_embedding(self, src_tokens)
        367     def forward_embedding(self, src_tokens):
        368         # embed tokens and positions
    --> 369         x = embed = self.embed_scale * self.embed_tokens(src_tokens)
        370         if self.embed_positions is not None:
        371             x = embed + self.embed_positions(src_tokens)

    /usr/local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
        530             result = self._slow_forward(*input, **kwargs)
        531         else:
    --> 532             result = self.forward(*input, **kwargs)
        533         for hook in self._forward_hooks.values():
        534             hook_result = hook(self, input, result)

    /usr/local/lib/python3.6/site-packages/torch/nn/modules/sparse.py in forward(self, input)
        112         return F.embedding(
        113             input, self.weight, self.padding_idx, self.max_norm,
    --> 114             self.norm_type, self.scale_grad_by_freq, self.sparse)
        115
        116     def extra_repr(self):

    /usr/local/lib/python3.6/site-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
       1482         # remove once script supports set_grad_enabled
       1483         _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
    -> 1484     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
       1485
       1486

    RuntimeError: index out of range: Tried to access index 250029 out of table with 250026 rows. at /pytorch/aten/src/TH/generic/THTensorEvenMoreMath.cpp:418
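
The numbers in the error describe the mismatch directly: `encode` produced token ID 250029, while the loaded embedding table has only 250026 rows. Continuing from the snippet above, this can be checked explicitly (a diagnostic sketch; the `task` and `model` attributes follow fairseq's `BARTHubInterface` and may differ across versions):

```python
# Compare the dictionary used by encode() against the rows in the
# checkpoint's embedding table. Any token ID >= the table size raises
# the "index out of range" RuntimeError inside nn.Embedding.
print(len(model.task.source_dictionary))                # vocab seen by encode()
print(model.model.encoder.embed_tokens.num_embeddings)  # rows in the embed table
print(tokens.max().item())                              # 250029 in this run
```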


### Expected behavior

Getting features without an exception.

### Environment

 - fairseq Version (e.g., 1.0 or master): master
 - PyTorch Version (e.g., 1.0): 1.4.0
 - OS (e.g., Linux): Ubuntu
 - How you installed fairseq (`pip`, source): source
 - Build command you used (if compiling from source):
```python
!git clone https://github.com/pytorch/fairseq
!pip install --no-deps './fairseq'
```

### Additional context

renqingcolin commented 4 years ago

You need to use SentencePiece:

    model = BARTModel.from_pretrained(
        model_name_or_path='pretrained_model/mbart.cc25',
        bpe='sentencepiece',
        sentencepiece_model='pretrained_model/mbart.cc25/sentence.bpe.model',
        layernorm_embedding=True,
    )
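
Putting that fix together with the original reproduction (a sketch, not verified here; it assumes the extracted `./mbart.CC25` directory from the steps above, which ships with `sentence.bpe.model`):

```python
from fairseq.models.bart import BARTModel

# Point the hub interface at mBART's own SentencePiece model so that
# encode() produces token IDs from the checkpoint's dictionary rather
# than IDs that overflow the embedding table.
model = BARTModel.from_pretrained(
    model_name_or_path='./mbart.CC25',
    bpe='sentencepiece',
    sentencepiece_model='./mbart.CC25/sentence.bpe.model',
    layernorm_embedding=True,
)
model.eval()

tokens = model.encode('Hello world!')
features = model.extract_features(tokens.unsqueeze(0))
print(features.shape)  # (1, number_of_tokens, hidden_dim)
```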