Token indices sequence length is longer than the specified maximum sequence length for this model

timoschick / pet

This repository contains the code for "Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference"

https://arxiv.org/abs/2001.07676

Apache License 2.0

1.62k stars, 282 forks

Token indices sequence length is longer than the specified maximum sequence length for this model #87

Open Claymore715 opened 2 years ago

Claymore715 commented 2 years ago

Below are my parameters. How could I fix this?

    !python "/content/drive/MyDrive/pet-master/cli.py" \
      --method pet \
      --pattern_ids 0 1 2 3 4 \
      --data_dir "/content/drive/MyDrive/pet-master/data/yahoo_answers_csv" \
      --model_type roberta \
      --model_name_or_path roberta-large \
      --task_name yahoo \
      --output_dir "/content/drive/MyDrive/pet-master/output_dir" \
      --do_train \
      --do_eval \
      --pet_max_seq_length 1500 \
      --sc_max_seq_length 1500

1jiahe commented 1 year ago

Hi, I got the same problem. May I ask whether you resolved this issue? And how? Thanks a lot

thatmee commented 1 year ago

Hi, I had the same problem and solved it by adding truncation=True and max_length=[max_len] to the two encode calls: [screenshot: Snipaste_2023-03-07_09-30-56]

As far as I can see, this is only a warning that the input text is longer than the tokenizer's maximum length. It doesn't affect the training result; it just clutters the output.
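
For anyone who cannot see the screenshot, here is a rough sketch of what that kind of change might look like. It is not the actual PET code: `encode_truncated` is a hypothetical helper, and the `add_special_tokens=False` default is an assumption about how the caller assembles the final sequence. It relies only on the `encode` method and `model_max_length` attribute of Hugging Face tokenizers. Note that the tokenizer for roberta-large reports a limit of 512 tokens, so the 1500 passed in the original command is above what the model accepts.

```python
def encode_truncated(tokenizer, text, max_len, add_special_tokens=False):
    """Encode `text` and cap the result at `max_len` token ids.

    Hypothetical helper, not part of the PET code base. `tokenizer` is
    assumed to be a Hugging Face tokenizer, e.g. the one loaded for
    roberta-large.
    """
    # Never ask for more tokens than the tokenizer says the model supports.
    # Falls back to `max_len` if the attribute is missing.
    limit = min(max_len, getattr(tokenizer, "model_max_length", max_len))
    return tokenizer.encode(
        text,
        add_special_tokens=add_special_tokens,  # assumption: caller adds them later
        truncation=True,   # the keyword is lowercase
        max_length=limit,
    )


# Example usage (needs the `transformers` package and downloads the tokenizer):
#
#   from transformers import AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("roberta-large")
#   ids = encode_truncated(tok, some_long_text, max_len=1500)
#   # len(ids) is now capped at tok.model_max_length
```

With truncation done inside the encode call, the tokenizer should no longer see an over-long sequence, so the warning should go away. Lowering --pet_max_seq_length and --sc_max_seq_length to a value the model supports is probably worth doing as well.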