[X] I have read the README and searched the existing issues.
System Info
Generating train split: 0 examples [00:00, ? examples/s]Failed to convert pandas DataFrame to Arrow Table from file '/data/zhaopengfeng/LLaMA-Factory/data/kddcup/openai_track4_0702.json' with error <class 'pyarrow.lib.ArrowInvalid'>: ('cannot mix list and non-list, non-null values', 'Conversion failed for column messages with type object')
Generating train split: 0 examples [00:00, ? examples/s]
[rank3]: Traceback (most recent call last):
[rank3]: File "/home/zhaopengfeng/anaconda3/envs/llama_factory/lib/python3.10/site-packages/datasets/builder.py", line 1997, in _prepare_split_single
[rank3]: for _, table in generator:
[rank3]: File "/home/zhaopengfeng/anaconda3/envs/llama_factory/lib/python3.10/site-packages/datasets/packaged_modules/json/json.py", line 165, in _generate_tables
[rank3]: raise ValueError(
[rank3]: ValueError: Failed to convert pandas DataFrame to Arrow Table from file /data/zhaopengfeng/LLaMA-Factory/data/kddcup/openai_track4_0702.json.
[rank3]: The above exception was the direct cause of the following exception:
[rank3]: Traceback (most recent call last):
[rank3]: File "/data/zhaopengfeng/LLaMA-Factory/src/llamafactory/launcher.py", line 23, in <module>
[rank3]: launch()
[rank3]: File "/data/zhaopengfeng/LLaMA-Factory/src/llamafactory/launcher.py", line 19, in launch
[rank3]: run_exp()
[rank3]: File "/data/zhaopengfeng/LLaMA-Factory/src/llamafactory/train/tuner.py", line 50, in run_exp
[rank3]: run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
[rank3]: File "/data/zhaopengfeng/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 48, in run_sft
[rank3]: dataset = get_dataset(model_args, data_args, training_args, stage="sft", **tokenizer_module)
[rank3]: File "/data/zhaopengfeng/LLaMA-Factory/src/llamafactory/data/loader.py", line 174, in get_dataset
[rank3]: all_datasets.append(load_single_dataset(dataset_attr, model_args, data_args, training_args))
[rank3]: File "/data/zhaopengfeng/LLaMA-Factory/src/llamafactory/data/loader.py", line 109, in load_single_dataset
[rank3]: dataset = load_dataset(
[rank3]: File "/home/zhaopengfeng/anaconda3/envs/llama_factory/lib/python3.10/site-packages/datasets/load.py", line 2616, in load_dataset
[rank3]: builder_instance.download_and_prepare(
[rank3]: File "/home/zhaopengfeng/anaconda3/envs/llama_factory/lib/python3.10/site-packages/datasets/builder.py", line 1029, in download_and_prepare
[rank3]: self._download_and_prepare(
[rank3]: File "/home/zhaopengfeng/anaconda3/envs/llama_factory/lib/python3.10/site-packages/datasets/builder.py", line 1124, in _download_and_prepare
[rank3]: self._prepare_split(split_generator, **prepare_split_kwargs)
[rank3]: File "/home/zhaopengfeng/anaconda3/envs/llama_factory/lib/python3.10/site-packages/datasets/builder.py", line 1884, in _prepare_split
[rank3]: for job_id, done, content in self._prepare_split_single(
[rank3]: File "/home/zhaopengfeng/anaconda3/envs/llama_factory/lib/python3.10/site-packages/datasets/builder.py", line 2040, in _prepare_split_single
[rank3]: raise DatasetGenerationError("An error occurred while generating the dataset") from e...
[
  {
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful online shopping assistant.Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n"
      },
      {
        "role": "user",
        "content": "Instructions: Explain the product category name\nInput: iPhone\n"
      },
      {
        "role": "assistant",
        "content": "An iPhone is a smartphone designed and manufactured by Apple Inc. that combines mobile phone capabilities with features such as internet access, multimedia playback, and a touch screen interface."
      }
    ]
  },
  {
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful online shopping assistant.Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n"
      },
      {
        "role": "user",
        "content": "Input: Explain the product type Serving Bowl\n"
      },
      {
        "role": "assistant",
        "content": "A serving bowl is a large bowl used for presenting and serving food, such as salads, vegetables, or side dishes, at the dining table.\n"
      },
      {
        "role": "user",
        "content": "Input: Explain the product type Apron\n"
      },
      {
        "role": "assistant",
        "content": "An apron is a protective garment worn over clothing to shield the wearer from spills, stains, and other kitchen or work-related messes. Aprons are commonly used in cooking, baking, and various crafts or occupations."
      }
    ]
  },...
]
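One way to narrow this down: the `ArrowInvalid` message ('cannot mix list and non-list, non-null values' for column `messages`) suggests that `messages` is a list in some records and something else (a string, a dict, a number) in others, even though the two records shown here look fine. Below is a minimal sketch for locating such records, assuming the file is a top-level JSON array shaped like the sample. `find_bad_records` and `check_file` are hypothetical helper names, not part of LLaMA-Factory or `datasets`, and the check on `role` and `content` being strings is an extra assumption about the intended schema.

```python
import json
from typing import Any, List, Tuple


def find_bad_records(records: List[Any]) -> List[Tuple[int, str]]:
    """Return (record index, reason) pairs for records that break the assumed schema."""
    problems: List[Tuple[int, str]] = []
    for i, record in enumerate(records):
        if not isinstance(record, dict):
            problems.append((i, "record is not an object"))
            continue
        messages = record.get("messages")
        # A non-list value here is the kind of mix the Arrow error complains about.
        if not isinstance(messages, list):
            problems.append((i, f"'messages' is {type(messages).__name__}, not list"))
            continue
        for j, message in enumerate(messages):
            if not isinstance(message, dict):
                problems.append((i, f"messages[{j}] is not an object"))
            elif not isinstance(message.get("role"), str):
                problems.append((i, f"messages[{j}].role is not a string"))
            elif not isinstance(message.get("content"), str):
                problems.append((i, f"messages[{j}].content is not a string"))
    return problems


def check_file(path: str) -> List[Tuple[int, str]]:
    """Load a JSON array from disk and report the records that look malformed."""
    with open(path, encoding="utf-8") as f:
        return find_bad_records(json.load(f))
```

Calling `check_file` on the training file and printing the returned pairs should point at the first records to inspect. An empty result would mean the cause lies elsewhere, for example in a field this sketch does not check.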
Reminder
Reproduction
Dataset info
Data example
Train script
My environment
BTW, the alpaca template works well.
Expected behavior
LoRA SFT should start without the dataset loading error.
Others
No response