Open Bachstelze opened 3 months ago
Hey, I'm hitting this error: IndexError: index out of range in self
This usually happens when the tokenizer's maximum encoded token id exceeds the size of the embedding matrix. The CUDA error might be reported asynchronously. Print the input ids.
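For reference, a minimal torch-only sketch of that failure mode (the vocabulary size below is xlm-roberta-base's 250002; the embedding width is an arbitrary placeholder):

```python
import torch
import torch.nn as nn

vocab_size = 250002  # xlm-roberta-base vocabulary size
embedding = nn.Embedding(vocab_size, 8)  # width 8 is arbitrary for the demo

# Ids inside [0, vocab_size) embed fine.
ok_ids = torch.tensor([[0, 5, vocab_size - 1]])
print(embedding(ok_ids).shape)  # torch.Size([1, 3, 8])

# Any id >= vocab_size reproduces "IndexError: index out of range in self".
try:
    embedding(torch.tensor([[vocab_size]]))
except IndexError as exc:
    print("IndexError:", exc)
```

So printing the maximum id in the batch and comparing it against the embedding matrix size pinpoints the mismatch.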
Yes, the encoded tokens don't match the size of the embedding matrix. The training was completed by reducing the filtered dataset to e.g. 510 tokens. Inferencing such a trained model throws the stated RuntimeError. Which input ids should I print?
Just whatever gets passed to the model. You might not have resized correctly. model.resize_token_embeddings(len(tokenizer)) should work.
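For what it's worth, the Transformers method is resize_token_embeddings. A sketch with a tiny untrained BERT config (all sizes are arbitrary assumptions, chosen only to avoid downloading weights):

```python
from transformers import BertConfig, BertModel

# Tiny untrained model; the sizes are arbitrary assumptions for the demo.
config = BertConfig(vocab_size=100, hidden_size=16, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=32)
model = BertModel(config)
print(model.get_input_embeddings().num_embeddings)  # 100

# Grow the embedding matrix so every tokenizer id has a row; in real code
# the target size would be len(tokenizer).
model.resize_token_embeddings(120)
print(model.get_input_embeddings().num_embeddings)  # 120
```

The same call works on a pretrained model loaded with from_pretrained; it copies the existing rows and initializes only the new ones.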
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Why do I have to resize the model if the original size is used? The smaller input size is just for the dataset filtering.
Is there documentation anywhere for model.resize_token_embeddings()?
Thanks for the hint @ArthurZucker ! I will try it in the next development iteration.
The stale-bot is annoying. Couldn't it provide useful information and suggestions? And only close the issue if those aren't addressed?
System Info

Private setup: transformers version 4.35.2
Google Colab setup: transformers version 4.38.2

Who can help?

@ArthurZucker @patrickvonplaten

Information

Tasks

examples folder (such as GLUE/SQuAD, ...)

Reproduction
Those are the training steps for an instruction-tuned shared EncoderDecoder xlm-r model:
python3 instructionbert/load_multilingual_dataset.py "FacebookAI/xlm-roberta-base" 512 False
python3 test_train_multilingual.py "FacebookAI/xlm-roberta-base" 512 True "adamw_bnb_8bit" 1 8 50000 0.0001 1 1
Those ordered training parameters are configurable: model_name, maximal_length, bool_fp16, optimizer, dataloader_workers, batch_size, warmup, lr, accumulation, epochs

Expected behavior
The XLM-R model is supported as EncoderDecoderModel, so it should train just like other smaller mBERT or monolingual RoBERTa models.
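As a sanity check of that wiring, a sketch building an untrained EncoderDecoderModel from tiny XLM-R configs (the sizes are placeholders; real usage would load the pretrained "FacebookAI/xlm-roberta-base" checkpoint instead):

```python
from transformers import (EncoderDecoderConfig, EncoderDecoderModel,
                          XLMRobertaConfig)

# Placeholder sizes; real usage would call
# EncoderDecoderModel.from_encoder_decoder_pretrained(
#     "FacebookAI/xlm-roberta-base", "FacebookAI/xlm-roberta-base")
enc = XLMRobertaConfig(vocab_size=100, hidden_size=16, num_hidden_layers=1,
                       num_attention_heads=2, intermediate_size=32)
dec = XLMRobertaConfig(vocab_size=100, hidden_size=16, num_hidden_layers=1,
                       num_attention_heads=2, intermediate_size=32,
                       is_decoder=True, add_cross_attention=True)
config = EncoderDecoderConfig.from_encoder_decoder_configs(enc, dec)
model = EncoderDecoderModel(config=config)
print(type(model.encoder).__name__, type(model.decoder).__name__)
```

Note that resizing such a shared model must cover both halves: the encoder's and the decoder's embedding matrices each need to match the tokenizer.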
This "IndexError: index out of range in self" is thrown when training it with the context size of the filtered dataset matching the model:

In Google Colab, this error is thrown instead:

CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasLtMatmul

The training completes when the filtered dataset is reduced to e.g. 510 tokens. Running inference with such a model then throws this RuntimeError:
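One debugging assumption worth testing: the Colab CUBLAS failure may be the same out-of-range index surfacing asynchronously on the GPU. CUDA reports kernel errors lazily, so the stack trace can point at the wrong operation; forcing synchronous launches (or rerunning on CPU) usually names the real failing op:

```shell
# CUDA reports kernel errors asynchronously; synchronous launches make the
# stack trace point at the actual failing operation.
export CUDA_LAUNCH_BLOCKING=1
python3 -c "import os; print(os.environ['CUDA_LAUNCH_BLOCKING'])"  # prints 1
```

With the variable exported, rerun the test_train_multilingual.py command from the reproduction steps and the trace should land on the embedding lookup if the index really is out of range.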