I ran convert_sqa_to_llava.py with this command:
python convert_sqa_to_llava.py --task convert_to_jsonl --base_dir /data/LM/datasets/ScienceQA/ --split train
and got this file: /data/LM/datasets/scienceqa_train_QCM-LEPA.jsonl.
However, when I replace "llava_train_QCM-LEPA.json" with "/data/LM/datasets/scienceqa_train_QCM-LEPA.jsonl" in the finetune.sh script, training fails with the following error:
Traceback (most recent call last):
  File "/data/zhanghao/github/LLaVA-main/llava/train/train_mem.py", line 5, in <module>
    train(attn_implementation="eager")
  File "/data/zhanghao/github/LLaVA-main/llava/train/train.py", line 959, in train
    data_module = make_supervised_data_module(tokenizer=tokenizer,
  File "/data/zhanghao/github/LLaVA-main/llava/train/train.py", line 779, in make_supervised_data_module
    train_dataset = LazySupervisedDataset(tokenizer=tokenizer,
  File "/data/zhanghao/github/LLaVA-main/llava/train/train.py", line 665, in __init__
    list_data_dict = json.load(open(data_path, "r"))
  File "/login_home/zhanghao/anaconda3/envs/tx8quant/lib/python3.10/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/login_home/zhanghao/anaconda3/envs/tx8quant/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/login_home/zhanghao/anaconda3/envs/tx8quant/lib/python3.10/json/decoder.py", line 340, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 710)
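For context on the error itself: the traceback shows train.py calling json.load on the whole file, which expects a single JSON document. A .jsonl file holds one JSON document per line, so the parser stops after the first record and reports "Extra data" at line 2 column 1. The snippet below is a minimal sketch that reproduces this on a tiny made-up file and shows one way to read the lines and rewrite them as a single JSON array. The helper names read_jsonl and jsonl_to_json_array and the sample records are hypothetical, not part of LLaVA.

```python
import json
import tempfile
from pathlib import Path


def read_jsonl(path):
    """Parse a JSON Lines file: one JSON document per non-empty line."""
    records = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records


def jsonl_to_json_array(src, dst):
    """Rewrite a JSON Lines file as one JSON array; return the record count."""
    records = read_jsonl(src)
    with open(dst, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False)
    return len(records)


# Tiny made-up file standing in for scienceqa_train_QCM-LEPA.jsonl.
tmp_dir = Path(tempfile.mkdtemp())
sample = tmp_dir / "sample.jsonl"
sample.write_text('{"id": "1"}\n{"id": "2"}\n', encoding="utf-8")

# json.load on the whole file fails the same way as in the traceback.
try:
    with open(sample, "r", encoding="utf-8") as f:
        json.load(f)
    whole_file_error = None
except json.JSONDecodeError as exc:
    whole_file_error = (exc.msg, exc.lineno, exc.colno)

# Reading line by line works, and the result can be saved as a JSON array.
converted = tmp_dir / "sample.json"
count = jsonl_to_json_array(sample, converted)
print(whole_file_error, count)
```

Note that this only changes the container format. Whether the records written by the convert_to_jsonl task carry the fields that LazySupervisedDataset expects is not checked here, so the converted file may still be rejected later in training. If the conversion script offers a task that writes llava_train_QCM-LEPA.json directly, that file is more likely the intended input for finetune.sh.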