Open unmo opened 1 year ago
+1
@unmo @crazycth Hello, any update? I am also encountering this problem.
Sorry, I have not been able to resolve it yet either.
Hi, --conv-mode llava_v1?
I'm having the same issue: when I run run_llava.py with the model I trained, I also get this error. I found that the output is all NaN, but I don't know why it's happening.
Facing the same issue here. The output is NaN although my W&B loss looks fine.
Update: I was able to resolve the issue by changing the base model from Hugging Face's "llava-hf/llava-1.5-7b-hf" to "liuhaotian/llava-v1.5-7b". That fixed the NaN issue, and training performance got much better.
Describe the issue
Issue: I ran the CLI script with the following command and hit the error "RuntimeError: probability tensor contains either `inf`, `nan` or element < 0". The model I am using was trained on custom data. What is the problem?
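For context, this error is typically raised at sampling time, when the probabilities computed from the model's logits contain invalid entries. The sketch below is plain Python, not LLaVA or PyTorch code, and the helper names `softmax` and `invalid_entries` are made up for illustration. It shows how a single NaN logit turns every probability into NaN, which may be why an all-NaN model output surfaces as this sampling error.

```python
import math


def softmax(logits):
    """Plain-Python softmax over a list of floats (max-subtracted for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def invalid_entries(probs):
    """Return indices of entries that are nan, inf, or negative.

    Hypothetical helper: it mirrors the three conditions named in the
    error message, not the actual library check.
    """
    return [
        i for i, p in enumerate(probs)
        if math.isnan(p) or math.isinf(p) or p < 0
    ]


# One NaN logit makes the normalizing sum NaN, so every probability is NaN.
poisoned = softmax([1.0, float("nan"), 2.0])
print(invalid_entries(poisoned))

# Finite logits give a valid distribution with no flagged entries.
clean = softmax([1.0, 2.0, 3.0])
print(invalid_entries(clean))
```

If the logits are already NaN before sampling, changing sampling settings is unlikely to help on its own; the checkpoint, base model, or dtype used at load time would be the places to look, as the replies in this thread suggest.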
Command:
Log: