RuntimeError: probability tensor contains either `inf`, `nan` or element < 0

X-PLUG / mPLUG-Owl

mPLUG-Owl: The Powerful Multi-modal Large Language Model Family

https://www.modelscope.cn/studios/damo/mPLUG-Owl

MIT License


RuntimeError: probability tensor contains either `inf`, `nan` or element < 0 #98

Closed: nullnameno closed this issue 1 year ago

nullnameno commented 1 year ago

Sorry, I ran into an error while performing video inference: RuntimeError: probability tensor contains either `inf`, `nan` or element < 0. I am very curious about the cause of this error.

MAGAer13 commented 1 year ago

Did you run this demo under bfloat16?

MAGAer13 commented 1 year ago

The model was trained under bfloat16, but the DepthwiseConv3D only supports half. So we convert from bf16 to half during the Conv and then convert back, which leads to unstable results.
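To illustrate one way such a bf16 to half round-trip could end in this error, here is a minimal standard-library sketch. It rests on an assumption the thread does not confirm: that some activation exceeds the half-precision maximum (65504), which bfloat16 can still represent because it shares float32's exponent range. The helper names `to_half`, `softmax`, and `is_valid_distribution` are hypothetical and are not part of mPLUG-Owl or PyTorch. The last one only mimics the kind of validity check that a sampling call such as `torch.multinomial` performs before raising this error.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_half(x: float) -> float:
    """Round-trip a Python float through IEEE half precision (struct format 'e')."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except (OverflowError, struct.error):
        # Out-of-range values become +/-inf, as a hardware cast to half would do.
        return math.copysign(math.inf, x)


def softmax(logits):
    """Numerically stabilised softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]  # inf - inf yields nan
    total = sum(exps)
    return [e / total for e in exps]


def is_valid_distribution(probs):
    """Hypothetical stand-in for the check a sampler applies to its input."""
    return all(math.isfinite(p) and p >= 0 for p in probs)


# Illustrative values only: 1e5 fits in bfloat16's range but not in half.
activations = [1.0e5, 2.0, -3.0]
as_half = [to_half(v) for v in activations]   # first element overflows to inf
probs = softmax(as_half)                      # every entry becomes nan
print(as_half, probs, is_valid_distribution(probs))
```

Once a single `inf` reaches the logits, the softmax turns the whole distribution into `nan`, which matches the symptom reported here. Whether overflow is the actual mechanism in the video branch is not established in this thread.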

LinB203 commented 1 year ago

> Sorry, I encountered an issue while performing video inference, which is RuntimeError: probability tensor contains either `inf`, `nan` or element < 0. I am very curious about the reason for this error.

Same problem here, and I just used the demo code.

nullnameno commented 1 year ago

> Did you run this demo under bfloat16?

Yes, I strictly followed the example you provided in README.md for video reasoning. I don't know why the model reported such an error.

ff1Zzd commented 1 year ago

Same error here as well. I followed the demo code strictly.

ff1Zzd commented 1 year ago

Hi, I also tried using the model (with the video weights) to run inference on an image input, but the output consists entirely of special tokens. I believe it might be caused by the same bug as the one raised by nullnameno. Here is the screenshot of the output.

sdjhshbswp commented 1 year ago

> Hi I have also tried to use the model (with video weights) to inference an image input, however I am getting an output with all special tokens. I believe it might be caused by the same bug as the one raised by nullnameno. Here is the screenshot of the output.

I hit the same problem when running inference_video.py. I also get a res tensor of all zeros.

MAGAer13 commented 1 year ago

See #101