Closed CrazyBrick closed 10 months ago
Had the same issue today with English data; the problems were `\n` and also `"\` in some cases.
Check those.
Thank you. I checked before, but there is no `\` except `\n`. I printed some vars:
for i in range(num_images + 1):
    cur_new_input_embeds.append(cur_input_embeds_no_im[i])
    cur_new_labels.append(cur_labels_noim[i])
    if i < num_images:
        print(f"Batch Index: {batch_idx}\n, Current Image Index: {cur_image_idx}\n, Num Images: {num_images}")
in file LLaVA/llava/model/llava_arch.py, line 178.
When it runs normally, Num Images: 1.
When it fails, Num Images: 4.
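For context, the printed Num Images comes from counting image placeholder tokens in one sample's input_ids. Here is a simplified, list-based sketch of that counting (assuming LLaVA's IMAGE_TOKEN_INDEX placeholder id of -200; the real code does the equivalent on a tensor):

```python
IMAGE_TOKEN_INDEX = -200  # LLaVA's image placeholder token id

def count_image_tokens(cur_input_ids):
    """Count image placeholder tokens in one sample's token-id sequence."""
    return sum(1 for t in cur_input_ids if t == IMAGE_TOKEN_INDEX)
```

So "Num Images: 4" means four `<image>` placeholders survived preprocessing for a single sample.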
I suggest making sure one sample is processed correctly at https://github.com/haotian-liu/LLaVA/blob/main/llava/train/train.py#L402 (for example, depending on which conversation template you choose). If the input_ids and labels/targets are as expected, there will be no error in the later stages. @CrazyBrick
@CrazyBrick Based on your message, I think you have a similar problem to mine, and it is because the data preprocessing is wrong. You should check that for every instance in your data list, especially in multi-turn conversations, the tag `<image>` appears only once.
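That check can be scripted against a LLaVA-style data list. A minimal sketch, assuming the usual `conversations`/`value` JSON layout (the function name is illustrative):

```python
def find_multi_image_instances(samples):
    """Return (index, tag_count) for every sample whose conversation turns
    contain more than one <image> tag in total."""
    bad = []
    for idx, sample in enumerate(samples):
        n = sum(turn["value"].count("<image>")
                for turn in sample.get("conversations", []))
        if n > 1:
            bad.append((idx, n))
    return bad
```

Running it over the loaded training json should list exactly the instances that trigger the index error.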
Thx, your guess is correct: I do have multi-turn dialogues with multiple `<image>` tags, and I get `IndexError: index 16 is out of bounds for dimension 0 with size 16`. I have no clue...
@CrazyBrick That is weird; maybe you should find a way to print out the actual instance that causes this error. The images and texts are loaded separately, which means the length of image_features is fixed (e.g. 16 in your case) before the texts are processed. Within one batch, cur_image_idx should only advance in two cases: either there is no image in the instance, or there is exactly one `<image>` tag in the instance, which makes cur_image_idx increase only once per instance. So if there are two or more tags in an instance, cur_image_idx increases twice or more and then exceeds the bound of the already fixed image_features.
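The mismatch described above can be reproduced in isolation. This is a toy sketch (an assumed simplification of the llava_arch.py loop, not the actual code): image_features gets one entry per loaded image, while the consumer index advances once per `<image>` token found in the text.

```python
IMAGE_TOKEN = -200  # LLaVA's IMAGE_TOKEN_INDEX placeholder id

def consume_images(batch_token_ids, num_loaded_images):
    """Walk each sample's token ids, consuming one feature per image token.
    Raises IndexError when the batch contains more <image> tokens than
    loaded images, mirroring the reported error."""
    image_features = list(range(num_loaded_images))  # stand-in features
    cur_image_idx = 0
    for token_ids in batch_token_ids:
        num_images = sum(1 for t in token_ids if t == IMAGE_TOKEN)
        for _ in range(num_images):
            _ = image_features[cur_image_idx]  # out of bounds if tags > images
            cur_image_idx += 1
    return cur_image_idx
```

With 16 samples of one tag each and 16 loaded images this finishes cleanly; give one sample a second tag and the 17th lookup fails with index 16 out of bounds for size 16, exactly the error above.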
Thanks @henrycjh. Quite weird: I debugged for a while but didn't find anything strange. But after regenerating the custom dataset so that each multi-turn dialogue has only 1 tag, there are no errors. I don't understand why, but in the end it runs.
However, I have to admit that my fine-tuning did not achieve the expected effect; it seems to have had no effect at all. I don't know if it's due to too little data or other parameter reasons.
@CrazyBrick Glad that you made it work. I guess the main reason fine-tuning has no effect is that the LLM is pretrained mainly on English data, so it may still have no effect even if you have a lot of Chinese data during the fine-tuning stage.
@henrycjh thank you for your help. I selected a portion of the dataset to generate English data, but the predictions are still quite poor. I am confused about how to achieve an effective finetune, what constraints are needed, and what kind of results can be expected. I've opened a new issue: #884
Did you do the two-stage training from scratch? In that case, introducing Chinese data in stage 1 and mixing some into stage 2 might work much better. That's what I did myself, and the model's Chinese ability still improved visibly.
I'm assuming you're translating the Chinese data from English? That's what I did, so I ran into a situation where multiple
Describe the issue
Issue: There is no problem fine-tuning with the provided dataset, but using local Chinese data produces this error. It is the same as issue #134, but that issue has been closed.
Command: finetune_lora.sh
Log:
I've tried to delete all explicit `\n` occurrences in the json file. For example:
"value": "<image>\n图中的人是xxx,请以这个人为主体描述一下图中内容"
-->"value": "<image>图中的人是xxx,请以这个人为主体描述一下图中内容"
but it doesn't work.
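For reference, the substitution above can be applied across a whole LLaVA-style data list like this (a sketch assuming the usual `conversations`/`value` layout; note that, as stated, this alone did not fix the error):

```python
def strip_newline_after_image(samples):
    """Replace '<image>\n' with '<image>' in every conversation turn,
    in place, and return the modified list."""
    for sample in samples:
        for turn in sample.get("conversations", []):
            turn["value"] = turn["value"].replace("<image>\n", "<image>")
    return samples
```

After loading the json with `json.load`, the escaped `\n` in the file is a real newline character, which is what this replacement targets.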