huggingface / trl

Train transformer language models with reinforcement learning.
http://hf.co/docs/trl
Apache License 2.0
10.14k stars, 1.28k forks

RuntimeError: probability tensor contains either `inf`, `nan` or element < 0 #2205

Open himanshushukla12 opened 1 month ago

himanshushukla12 commented 1 month ago

System Info

Environment Details:

Information

Tasks

: what is AI? : : what is AI? : ?',?', ?
himanshushukla12 commented 1 month ago

The issue occurs on systems with multiple GPUs. If we set CUDA_VISIBLE_DEVICES=0 so that only one GPU is visible, it works fine.
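
For anyone who wants to apply the same workaround from inside a script rather than on the command line, here is a minimal sketch. It assumes the variable is set before CUDA is initialized (safest is before importing torch). The helper name `restrict_to_single_gpu` is hypothetical and not part of trl or torch. This only hides the other GPUs; it does not address the underlying multi-GPU problem.

```python
import os


def restrict_to_single_gpu(index: int = 0) -> str:
    """Expose only one GPU to CUDA (hypothetical helper, not a trl API).

    CUDA_VISIBLE_DEVICES is read when CUDA initializes, so call this
    before importing torch or any library that touches the GPU.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


# Hide every GPU except device 0 from this process.
visible = restrict_to_single_gpu(0)

# Import torch / trl only after this point, e.g.:
# import torch
```

The shell equivalent is to prefix the launch command with `CUDA_VISIBLE_DEVICES=0`.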