huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

[Error] with Trainer: TypeError: Unsupported types (<class 'NoneType'>) passed to `_gpu_broadcast_one`. #32090

Closed halixness closed 1 month ago

halixness commented 3 months ago

System Info

Who can help?

@muellerzr @SunMarc @ArthurZucker

Information

Tasks

Reproduction

https://gist.github.com/halixness/eadd6d1d89ae48597f70cb09f2b44139

Expected behavior

Hello, I have written a simple training script to train a GPT-2-like model from scratch on a large dataset of strings (molecules in SMILES format). After about 2k steps (batch_size=128, ~1.5M samples), I encounter the following error:

TypeError: Unsupported types (<class 'NoneType'>) passed to `_gpu_broadcast_one`. Only nested list/tuple/dicts of objects that are valid for `is_torch_tensor` should be passed.
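
Going by the wording of the message, a `None` showed up somewhere inside the nested list/tuple/dict that was being broadcast, where a tensor was expected. A minimal sketch for locating such a value in a batch before it is handed to the Trainer could look like the following. The helper name `find_none_leaves` and the example batch are hypothetical and are not part of transformers or accelerate; the sketch only walks plain Python containers and treats everything else (tensors included) as a leaf.

```python
from typing import Any, List


def find_none_leaves(obj: Any, path: str = "batch") -> List[str]:
    """Return the path of every None found inside nested dicts/lists/tuples."""
    if obj is None:
        return [path]
    found: List[str] = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            found.extend(find_none_leaves(value, f"{path}[{key!r}]"))
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            found.extend(find_none_leaves(value, f"{path}[{index}]"))
    # Anything else (ints, tensors, strings) is a leaf and is left alone.
    return found


# Hypothetical batch with one bad token id and a missing labels entry.
example_batch = {"input_ids": [[5, 8, 2], [5, None, 2]], "labels": None}
print(find_none_leaves(example_batch))
```

Running a check like this on the data collator's output (or on a few tokenized samples around the failing step) should narrow down which field and which sample carry the `None`.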

I tried already:

I'm not sure what could cause this error. Any suggestion is much appreciated!

ArthurZucker commented 3 months ago

Hey! One thing to check is whether you correctly set `tokenizer.unk_token` (from the gist it does not seem like it). If it is unset, the tokenizer could produce `None` inputs when it finds something outside your vocabulary. As the rest of the training seems to work as expected, I really think it's related to tokenization here!
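
To illustrate the mechanism suggested here, below is a toy sketch in plain Python. It is not the transformers tokenizer: `encode_chars` and `toy_vocab` are hypothetical names, and whether a real tokenizer behaves this way depends on its class and configuration. It only shows how a vocabulary lookup with no unknown-token fallback can yield `None` ids for out-of-vocabulary symbols.

```python
from typing import Dict, List, Optional


def encode_chars(
    text: str, vocab: Dict[str, int], unk_token: Optional[str] = None
) -> List[Optional[int]]:
    """Map each character to its id; fall back to the unk id if one is set."""
    # With no unk_token configured, the fallback id is None.
    unk_id = vocab.get(unk_token) if unk_token is not None else None
    return [vocab.get(ch, unk_id) for ch in text]


# Hypothetical character-level vocabulary for SMILES-like strings.
toy_vocab = {"<unk>": 0, "C": 1, "O": 2, "=": 3}

print(encode_chars("C=O", toy_vocab))                     # all symbols known
print(encode_chars("C=N", toy_vocab))                     # "N" is missing, no fallback
print(encode_chars("C=N", toy_vocab, unk_token="<unk>"))  # "N" maps to the unk id
```

With a real tokenizer, the equivalent first check would be whether `tokenizer.unk_token` is `None`, followed by scanning the encoded `input_ids` of the dataset for `None` entries. That would also be consistent with the error appearing only after ~2k steps, if the first out-of-vocabulary symbol shows up that late in the data.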

github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.