THUDM / VisualGLM-6B

Chinese and English multimodal conversational language model | 多模态中英双语对话语言模型
Apache License 2.0
4.1k stars 421 forks

When fine-tuning with QLoRA, I get "TypeError: string indices must be integers". How can I fix this? #358

Open GG6Bond opened 6 months ago

GG6Bond commented 6 months ago

[2024-05-18 02:42:21,849] [INFO] [RANK 0] > successfully loaded /root/.sat_models/visualglm-6b/1/mp_rank_00_model_states.pt
/opt/conda/lib/python3.8/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
  warnings.warn(
Traceback (most recent call last):
  File "finetune_XrayGLM.py", line 206, in <module>
    training_main(args, model_cls=model, forward_step_function=forward_step, create_dataset_function=create_dataset_function, collate_fn=data_collator)
  File "/opt/conda/lib/python3.8/site-packages/sat/training/deepspeed_training.py", line 67, in training_main
    train_data, val_data, test_data = make_loaders(args, hooks['create_dataset_function'], collate_fn=collate_fn)
  File "/opt/conda/lib/python3.8/site-packages/sat/data_utils/configure_data.py", line 201, in make_loaders
    train = make_dataset(**data_set_args, args=args, dataset_weights=args.train_data_weights, is_train_data=True)
  File "/opt/conda/lib/python3.8/site-packages/sat/data_utils/configure_data.py", line 127, in make_dataset_full
    d = create_dataset_function(p, args)
  File "finetune_XrayGLM.py", line 172, in create_dataset_function
    dataset = FewShotDataset(path, image_processor, tokenizer, args)
  File "finetune_XrayGLM.py", line 129, in __init__
    image = processor(Image.open(item['img']).convert('RGB'))
TypeError: string indices must be integers

steveyoung30 commented 4 months ago

This is a problem with the dataset's JSON format: the records are wrapped in an extra outer mapping. Inside FewShotDataset, change the loop to: for item in data['annotations']
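
The cause described above can be reproduced in isolation. The sketch below is illustrative only: it assumes the dataset JSON is a dict whose 'annotations' key holds the list of records, as the reply suggests. The sample records, the 'prompt' and 'label' fields, and the load_items helper are hypothetical and are not taken from finetune_XrayGLM.py. Iterating over a dict yields its keys, which are strings, so item['img'] indexes a string with a string and raises the TypeError seen in the traceback.

```python
import json

# Hypothetical dataset content. Only the 'annotations' and 'img' keys come
# from the thread; every other field and value is an assumption.
raw = json.dumps({
    "annotations": [
        {"img": "images/0001.png", "prompt": "Describe the image.", "label": "Sample A"},
        {"img": "images/0002.png", "prompt": "Describe the image.", "label": "Sample B"},
    ]
})


def load_items(data):
    """Return the list of records, unwrapping the outer dict if present.

    Hypothetical helper: accepts either a bare list of records or a dict
    that stores the list under the 'annotations' key.
    """
    if isinstance(data, dict):
        return data["annotations"]
    return data


data = json.loads(raw)

# Iterating over the dict itself yields the key 'annotations' (a string),
# so item["img"] tries to index a string with a string and fails.
try:
    for item in data:
        item["img"]
    failed = False
except TypeError:
    failed = True

# Iterating over the unwrapped list yields the record dicts as intended.
image_paths = [item["img"] for item in load_items(data)]
```

Unwrapping once, before the loop, lets the same loading code accept both a wrapped file and a bare list of records. If the dataset always uses the wrapped form, the one-line change from the reply is enough.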