The problem arises when using:
[x] the official example scripts: (give details below)
[ ] my own modified scripts: (give details below)
The task I am working on is:
[ ] an official GLUE/SQUaD task: (give the name)
[x] my own task or dataset: (give details below)
To reproduce
```python
from transformers import AutoTokenizer, SingleSentenceClassificationProcessor

processor = SingleSentenceClassificationProcessor()
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
processor.add_examples(["Thanks for cool stuff!"])
processor.get_features(tokenizer, max_length=3)
```

This prints the warning:

```
Truncation was not explicitely activated but `max_length` is provided a specific value, please use `truncation=True` to explicitely truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
```

and returns:

```
[InputFeatures(input_ids=[101, 4283, 102], attention_mask=[1, 1, 1], token_type_ids=None, label=0)]
```
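As a stopgap, the warning can be silenced at the logging level. A sketch, assuming the warning is emitted through the `transformers.tokenization_utils_base` logger (which appears to be where `_get_padding_truncation_strategies` logs it in 3.1.0):

```python
import logging

# Workaround sketch: raise the threshold of the logger that emits the
# truncation warning so logger.warning(...) calls are dropped.
# Assumption: the logger name matches the emitting module's __name__.
logging.getLogger("transformers.tokenization_utils_base").setLevel(logging.ERROR)
```

This silences every warning from that module, not just this one, so it is a blunt instrument compared to a proper `truncation` argument.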
Expected behavior
The warning itself is expected, but there is currently no way to suppress it: `truncation=True` cannot be passed through, because `tokenizer.encode` is called internally by `processor.get_features`. `get_features` should probably accept a `truncation` argument and forward it to `tokenizer.encode`.
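The suggested fix can be sketched with stand-in classes (hypothetical and simplified: the real `get_features` takes more parameters and builds `InputFeatures` objects, but the point is just threading `truncation` through to `encode`):

```python
class StubTokenizer:
    """Minimal stand-in for a transformers tokenizer (not the real API)."""

    def encode(self, text, max_length=None, truncation=False):
        # Fake ids: [CLS]-like 101, one id per whitespace token, [SEP]-like 102.
        ids = [101] + list(range(1000, 1000 + len(text.split()))) + [102]
        if truncation and max_length is not None:
            ids = ids[:max_length]
        return ids


class StubProcessor:
    """Stand-in for SingleSentenceClassificationProcessor (simplified)."""

    def __init__(self):
        self.examples = []

    def add_examples(self, texts):
        self.examples.extend(texts)

    def get_features(self, tokenizer, max_length=None, truncation=False):
        # The proposed change: accept `truncation` here and forward it, so
        # encode() is never called with a max_length but no truncation
        # strategy -- which is exactly what triggers the warning.
        return [
            tokenizer.encode(text, max_length=max_length, truncation=truncation)
            for text in self.examples
        ]


processor = StubProcessor()
processor.add_examples(["Thanks for cool stuff!"])
features = processor.get_features(StubTokenizer(), max_length=3, truncation=True)
# features == [[101, 1000, 1001]] -- each example truncated to max_length=3.
```

A backward-compatible default of `truncation=False` would preserve the current behavior (and the warning) for existing callers.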
Environment info

`transformers` version: 3.1.0

Who can help

@LysandreJik, @thomwolf
Information
Model I am using: BERT.