stjuliet commented 7 months ago

Following the provided fine-tuning guide, we ran the whole fine-tuning process successfully, but the error below was reported during data loading:

RecursionError: Caught RecursionError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/codes/LLaMA2-Accessory/accessory/data/conversation/dataset.py", line 277, in __getitem__
    return self.get_item_func(index)
  File "/codes/LLaMA2-Accessory/accessory/data/conversation/dataset.py", line 268, in get_item_func
    raise LabelAllZeroError()
accessory.data.conversation.dataset.LabelAllZeroError: LabelAllZeroError: None

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/codes/LLaMA2-Accessory/accessory/data/conversation/dataset.py", line 277, in __getitem__
    return self.get_item_func(index)
  File "/codes/LLaMA2-Accessory/accessory/data/conversation/dataset.py", line 268, in get_item_func
    raise LabelAllZeroError()
accessory.data.conversation.dataset.LabelAllZeroError: LabelAllZeroError: None

ChrisLiu6 commented 7 months ago

Generally speaking, this error means that for this piece of data there is nothing for the model to predict. For example, if your max_seq_len is set to 256 but the length of (system prompt + first question) is already 300 tokens, then after truncation no token is left to compute the loss on.
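To make the truncation case concrete, here is a minimal sketch. It is not the repository's code: `build_labels` is a hypothetical helper, and it assumes that prompt tokens get label 0 (no loss) and answer tokens get a non-zero label, which is what the error name suggests.

```python
def build_labels(prompt_len, answer_len, max_seq_len):
    """Hypothetical helper: per-token labels after truncation.

    Assumption: 0 marks a masked position (prompt, no loss),
    1 marks a supervised position (answer, loss is computed).
    """
    labels = [0] * prompt_len + [1] * answer_len
    # Truncation keeps only the first max_seq_len positions.
    return labels[:max_seq_len]


# The prompt alone (300 tokens) exceeds the 256-token limit,
# so every surviving label is 0 and there is nothing to train on.
truncated = build_labels(prompt_len=300, answer_len=50, max_seq_len=256)
print(any(truncated))  # False

# With a larger limit the answer tokens survive truncation.
roomy = build_labels(prompt_len=300, answer_len=50, max_seq_len=512)
print(sum(roomy))  # 50
```

An item whose labels look like `truncated` is the kind of sample that would trigger LabelAllZeroError.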

Errors raised within get_item_func will cause the current item to be skipped. If only a few items raise the error, there is no need to worry about it.
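The skip-on-error behaviour can be sketched roughly as follows. This is an illustration only, not the actual implementation in dataset.py: `SkippingDataset`, `max_retries`, and the "try the next index" policy are all assumptions. A bounded loop is used so that a dataset where many consecutive items fail ends with a clear error rather than a long chain of nested exceptions.

```python
class LabelAllZeroError(Exception):
    """Stand-in for the error in the traceback: no token to compute loss on."""


class SkippingDataset:
    """Hypothetical dataset that skips items whose labels are all zero."""

    def __init__(self, items, max_retries=10):
        self.items = items
        self.max_retries = max_retries  # assumed retry limit, not a real option

    def __len__(self):
        return len(self.items)

    def get_item_func(self, index):
        item = self.items[index]
        if not any(item["labels"]):
            raise LabelAllZeroError()
        return item

    def __getitem__(self, index):
        for _ in range(self.max_retries):
            try:
                return self.get_item_func(index)
            except LabelAllZeroError:
                # Skip the bad item and try the next one instead.
                index = (index + 1) % len(self.items)
        raise RuntimeError("too many consecutive items with all-zero labels")


data = SkippingDataset([{"labels": [0, 0, 0]}, {"labels": [0, 1, 1]}])
print(data[0]["labels"])  # item 0 is skipped, item 1 is returned: [0, 1, 1]
```

If most of the dataset fails this check, skipping stops being harmless, and the sequence length limit should be raised instead.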

PoTsui99 commented 7 months ago

Just increase the --max_words argument in the training script.

Alpha-VLLM / LLaMA2-Accessory

accessory.data.conversation.dataset.LabelAllZeroError: LabelAllZeroError: None #188
