hangzeli08 closed this issue 1 year ago.
Would you mind sharing the data file you used for training? OwlEval is only used for evaluation, not for training.
I used: (1) OwlEval converted into your required format, (2) llava_instruct_150k.json converted into your required format, and (3) detail_23k.json converted into your required format. If I use only one of these datasets (e.g., just the OwlEval data), the loss is not nan; as soon as I merge two of them, the loss immediately becomes nan.
{"image": ["/data/data3/hangzeli/work_lab/workspace/multimodal/mPLUG-Owl/OwlEval/cases/1.jpg"], "text": "The following is a conversation between a curious human and AI assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.\nHuman:
detail_23k_instruct_image82.txt
llava_instruct_150k_image2_struct5b.txt
minigpt4_13b_answer_mplug11.txt
These are all data files for which the loss becomes nan. They should end in .jsonl, but GitHub does not support uploading that file format, so I changed the extension to .txt.
Your data is incorrect in two ways. First, you did not include the <image> token as the image-input placeholder.
Second, you should use the following prompt template:
The following is a conversation between a curious human and AI assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.\nHuman: <image>\nHuman: {YOUR PROMPT}\nAI: {YOUR RESPONSE}
You should add a space between "AI:" and your response!
Please refer to https://github.com/X-PLUG/mPLUG-Owl/blob/main/pipeline/data_utils/xgpt3_dataset.py#L141
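For anyone hitting the same problem, here is a minimal sketch of how a correctly formatted training sample might be assembled. The image path, instruction, and answer are placeholders taken from this thread, not real training data; the only points being illustrated are the prompt template above and the space after "AI:".

```python
import json

# Prompt template quoted from the comment above; note the space after "AI:".
TEMPLATE = (
    "The following is a conversation between a curious human and AI assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
    "\nHuman: <image>"
    "\nHuman: {prompt}"
    "\nAI: {response}"
)

# Placeholder content; substitute your own image path, instruction, and answer.
sample = {
    "image": ["OwlEval/cases/1.jpg"],
    "text": TEMPLATE.format(prompt="Describe the image.",
                            response="The image shows ..."),
}

# One JSON object per line (.jsonl), as mentioned elsewhere in this thread.
with open("train.jsonl", "w", encoding="utf-8") as f:
    f.write(json.dumps(sample, ensure_ascii=False) + "\n")
```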
It is not a problem related to the code.
Please read the dataset code to understand how your data will be processed before fine-tuning the model. @hangzeli08
OK, I should not have jumped to conclusions so early. Let me give it a try.
It works! The loss is no longer nan; it starts at 1.99 and gradually decreases. It really was the missing space after "AI:". I had actually printed out the data variables in xgpt3_dataset.py, but at the time I did not notice anything wrong. I spent several days debugging this bug; I should not have jumped to conclusions. It finally works. Thank you.
After reading this issue, I have an interesting question: why does the loss become nan when there is no space between "AI:" and the response?
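I don't have an authoritative answer either, but one way to start investigating is to compare how the two variants tokenize. This is only a sketch under the assumption that the data pipeline (the xgpt3_dataset.py code linked above) matches on the literal "AI: " string when deciding which tokens count as labels; the tokenizer path below is a placeholder.

```python
from transformers import AutoTokenizer

# "path/to/your/tokenizer" is a placeholder; use the tokenizer that ships with
# your mPLUG-Owl checkpoint.
tokenizer = AutoTokenizer.from_pretrained("path/to/your/tokenizer")

with_space = "AI: The image shows a cat."
without_space = "AI:The image shows a cat."

# If the two variants tokenize differently around "AI:", code that searches for
# the literal "AI: " marker to mask prompt tokens may mask the wrong span; if
# every label in a sample ends up masked, the averaged loss can degenerate to
# 0/0, i.e. nan.
print(tokenizer.tokenize(with_space))
print(tokenizer.tokenize(without_space))
```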
The OwlEval dataset contains 82 samples. Fine-tuning on just those 82, the loss is not nan. If I duplicate them to get 164 samples, the loss becomes nan. If I start from the 82 samples and add only a few short samples, it immediately becomes nan as well. Are you still going to say this is a length problem? It is clearly unrelated to length. Please fix your model; the problem really does exist, and it is a problem with the model.
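For anyone trying to reproduce this, here is a minimal sketch of how one might capture the batch that first drives the loss to nan. It assumes a standard PyTorch loop with an HF-style model that returns .loss; the function and file names are illustrative and not taken from the mPLUG-Owl training code.

```python
import torch

# Illustrative guard for a generic PyTorch training step: stop at the first nan
# loss and save the offending batch so it can be inspected offline.
def training_step(model, optimizer, batch, step):
    loss = model(**batch).loss  # assumes an HF-style model that returns .loss
    if torch.isnan(loss):
        torch.save(batch, f"nan_batch_step{step}.pt")
        raise RuntimeError(f"loss became nan at step {step}; batch saved for inspection")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```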