bug: importing vaq_ocr data

franztao commented 1 year ago

Flamingo model initialized with 23461888 trainable parameters

╭───────────────────── Traceback (most recent call last) ──────────────────────╮

│ /home/jovyan/taoheng/work/Multimodal-GPT/mmgpt/train/instruction_finetune.py │

│ :460 in <module> │

│ │

│ 457 │

│ 458 │

│ 459 if __name__ == "__main__": │

│ ❱ 460 │ main() │

│ 461 │

│ │

│ /home/jovyan/taoheng/work/Multimodal-GPT/mmgpt/train/instruction_finetune.py │

│ :178 in main │

│ │

│ 175 │ dataset = build_dataset( │

│ 176 │ │ dataset_config=dataset_config.visual_datasets, │

│ 177 │ │ vis_processor=image_processor, │

│ ❱ 178 │ │ tokenizer=tokenizer, │

│ 179 │ ) │

│ 180 │ train_dataloader = DataLoader( │

│ 181 │ │ dataset, │

│ │

│ /home/jovyan/taoheng/work/Multimodal-GPT/mmgpt/datasets/builder.py:23 in │

│ build_dataset │

│ │

│ 20 def build_dataset(dataset_config, **kwargs): │

│ 21 │ if isinstance(dataset_config, list): │

│ 22 │ │ datasets = [build_dataset(cfg, **kwargs) for cfg in dataset_co │

│ ❱ 23 │ │ return ConcatDataset(datasets) │

│ 24 │ dataset_type = dataset_config.pop("type") │

│ 25 │ sample = dataset_config.pop("sample", -1) │

│ 26 │ if dataset_type == "llava": │

│ │

│ /home/jovyan/taoheng/work/Multimodal-GPT/mmgpt/datasets/vqa_dataset.py:210 │

│ in __init__ │

│ │

│ 207 │

│ 208 class ConcatDataset(ConcatDataset): │

│ 209 │ def __init__(self, datasets: Iterable[Dataset]) -> None: │

│ ❱ 210 │ │ super().__init__(datasets) │

│ 211 │ │

│ 212 │ def collater(self, samples): │

│ 213 │ │ # TODO For now only supports datasets with same underlying col │

│ │

│ /opt/conda/lib/python3.7/site-packages/torch/utils/data/dataset.py:222 in │

│ __init__ │

│ │

│ 219 │ def __init__(self, datasets: Iterable[Dataset]) -> None: │

│ 220 │ │ super(ConcatDataset, self).__init__() │

│ 221 │ │ self.datasets = list(datasets) │

│ ❱ 222 │ │ assert len(self.datasets) > 0, 'datasets should not be an empt │

│ 223 │ │ for d in self.datasets: │

│ 224 │ │ │ assert not isinstance(d, IterableDataset), "ConcatDataset │

│ 225 │ │ self.cumulative_sizes = self.cumsum(self.datasets) │

╰──────────────────────────────────────────────────────────────────────────────╯

AssertionError: datasets should not be an empty iterable

ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 11) of binary: /opt/conda/bin/python

Traceback (most recent call last):

File "/opt/conda/bin/torchrun", line 8, in <module>

sys.exit(main())

File "/opt/conda/lib/python3.7/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper

return f(*args, **kwargs)

File "/opt/conda/lib/python3.7/site-packages/torch/distributed/run.py", line 762, in main

run(args)

File "/opt/conda/lib/python3.7/site-packages/torch/distributed/run.py", line 756, in run

)(*cmd_args)

File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 132, in __call__

return launch_agent(self._config, self._entrypoint, list(args))

File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 248, in launch_agent

failures=result.failures,

torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

============================================================

/home/jovyan/taoheng/work/Multimodal-GPT/mmgpt/train/instruction_finetune.py FAILED

Failures:

------------------------------------------------------------

Root Cause (first observed failure):

[0]:

time : 2023-06-09_06:35:23

host : multimodalgpt-cjl7d

rank : 0 (local_rank: 0)

exitcode : 1 (pid: 11)

error_file:

traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

============================================================

Pod Name: multimodalgpt-cjl7d

Log platform url: https://kibana.stonewise.cn/app/discover#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-15m,to:now))&_a=(columns:!(kubernetes.pod.name,message),filters:!(('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'7aefc5f0-d790-11ed-835f-bbbd6c0470bc',key:kubernetes.pod.name,negate:!f,params:(query:multimodalgpt-cjl7d),type:phrase),query:(match_phrase:(kubernetes.pod.name:multimodalgpt-cjl7d)))),index:'7aefc5f0-d790-11ed-835f-bbbd6c0470bc',interval:auto,query:(language:kuery,query:''),sort:!(!('@timestamp',desc)))

The JSON file in the link does not match the JSON file being imported.

![image](https://github.com/open-mmlab/Multimodal-GPT/assets/6941820/c4f8b667-0c9a-4a87-aaaa-ecdd80318842)

![image](https://github.com/open-mmlab/Multimodal-GPT/assets/6941820/b60a919e-4fab-4070-98ac-49d4228d4d3a)
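
Reading the frames in the traceback above: `build_dataset` builds one dataset per entry of `dataset_config` and hands the resulting list to `ConcatDataset`, so the assertion fires when that list is empty. A minimal sketch of an early check that would report this with a clearer message. `require_nonempty_configs` is a hypothetical helper, not part of Multimodal-GPT, and calling it just before `build_dataset` is an assumption about where it would fit.

```python
from typing import Any, Sequence


def require_nonempty_configs(
    configs: Sequence[Any], name: str = "visual_datasets"
) -> Sequence[Any]:
    """Fail early, with a descriptive message, if a dataset config list is empty.

    Hypothetical helper for illustration only; it is not part of Multimodal-GPT.
    """
    if isinstance(configs, (list, tuple)) and len(configs) == 0:
        raise ValueError(
            f"{name} is an empty list; ConcatDataset would fail with "
            "'datasets should not be an empty iterable'. "
            "Check that the config file defines at least one dataset."
        )
    # Non-empty lists (and single, non-list configs) pass through unchanged.
    return configs
```

Assumed usage: `require_nonempty_configs(dataset_config.visual_datasets)` before the `build_dataset(...)` call in `instruction_finetune.py`. This only makes the failure easier to read; it does not address a mismatch in the JSON files themselves.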

mm-assistant[bot] commented 1 year ago

We recommend using English or English & Chinese for issues so that we can have a broader discussion.

bqcao commented 1 year ago

I got a similar error. The data structure expected in "OCR_VQA/dataset.json" is different from what I downloaded from the provided OCR_VQA link. Could someone explain how to generate the required OCR_VQA/dataset.json from the provided link: https://drive.google.com/drive/folders/1_GYPY5UkUy7HIcR0zq3ZCFgeZN7BAfm_?usp=sharing
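
One way to pin down such a mismatch is to print the top-level shape of the downloaded JSON and compare it with what the loader reads. The sketch below is a generic inspector, assuming nothing about the actual OCR-VQA schema; `describe_json` and `describe_json_file` are hypothetical names and the file path in the usage note is only an example.

```python
import json
from pathlib import Path
from typing import Any


def describe_json(value: Any, depth: int = 2) -> Any:
    """Summarize a JSON value: container type, size, and a small sample."""
    if isinstance(value, dict):
        if depth == 0:
            return f"dict({len(value)} keys)"
        sample_keys = list(value)[:5]  # look at the first few keys only
        return {
            "type": "dict",
            "size": len(value),
            "sample": {k: describe_json(value[k], depth - 1) for k in sample_keys},
        }
    if isinstance(value, list):
        if depth == 0 or not value:
            return f"list({len(value)} items)"
        # Assume list items share a shape and describe the first one.
        return {
            "type": "list",
            "size": len(value),
            "first": describe_json(value[0], depth - 1),
        }
    return type(value).__name__  # scalar: report its Python type name


def describe_json_file(path: str, depth: int = 2) -> Any:
    """Load a JSON file and return its structural summary."""
    with Path(path).open(encoding="utf-8") as f:
        return describe_json(json.load(f), depth)
```

For example, `print(describe_json_file("OCR_VQA/dataset.json"))` shows whether the file is a list of records or a dict keyed by ID, which can then be compared against the fields the dataset class accesses.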

open-mmlab / Multimodal-GPT

bug: importing vaq_ocr data #26