Error during training

vinevix commented 1 year ago

Hi, I got the following error during training with train.py:

Traceback (most recent call last):
  File "/home/v.silvio/diffusion-image-captioning-main/Paper2/DDCap-main/ImageCaptioning.pytorch/tools/train.py", line 296, in <module>
    train(opt)
  File "/home/v.silvio/diffusion-image-captioning-main/Paper2/DDCap-main/ImageCaptioning.pytorch/tools/train.py", line 174, in train
    data = loader.get_batch('train')
  File "/home/v.silvio/diffusion-image-captioning-main/Paper2/DDCap-main/ImageCaptioning.pytorch/captioning/data/dataloader.py", line 334, in get_batch
    data = next(self.iters[split])
  File "/home/v.silvio/.conda/envs/DDCap/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/home/v.silvio/.conda/envs/DDCap/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "/home/v.silvio/.conda/envs/DDCap/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
    data.reraise()
  File "/home/v.silvio/.conda/envs/DDCap/lib/python3.9/site-packages/torch/_utils.py", line 425, in reraise
    raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/v.silvio/.conda/envs/DDCap/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/v.silvio/.conda/envs/DDCap/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/home/v.silvio/diffusion-image-captioning-main/Paper2/DDCap-main/ImageCaptioning.pytorch/captioning/data/dataloader.py", line 236, in collate_func
    data['fc_feats'] = np.stack(fc_batch)
  File "<__array_function__ internals>", line 180, in stack
  File "/home/v.silvio/.conda/envs/DDCap/lib/python3.9/site-packages/numpy/core/shape_base.py", line 426, in stack
    raise ValueError('all input arrays must have the same shape')
ValueError: all input arrays must have the same shape

Any idea how I can fix it?

ruotianluo commented 1 year ago

Are you able to print the size of the tensors in fc_batch?
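One way to do that check is sketched below. It is only an illustration, not code from the repository: report_shapes is a hypothetical helper, and the example fc_batch (two 2048-dimensional vectors plus one empty array) is made-up data standing in for whatever the collate function actually receives.

```python
import numpy as np


def report_shapes(batch):
    """Group the indices of a batch by array shape (hypothetical helper)."""
    shapes = {}
    for i, item in enumerate(batch):
        shapes.setdefault(np.asarray(item).shape, []).append(i)
    return shapes


# Made-up batch: the third entry is empty, so np.stack would reject it.
fc_batch = [
    np.zeros(2048, dtype="float32"),
    np.zeros(2048, dtype="float32"),
    np.zeros((0,), dtype="float32"),
]

shapes = report_shapes(fc_batch)
for shape, indices in shapes.items():
    print(shape, "at batch positions", indices)

if len(shapes) > 1:
    print("Shapes differ, so np.stack(fc_batch) would raise ValueError.")
```

Calling something like this just before np.stack(fc_batch) in collate_func would show which sample has the odd shape, which usually points to a feature file that is missing or was saved differently.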

vinevix commented 1 year ago

I fixed that problem, but now I get "NameError: name 'COCOEvalCap' is not defined" during evaluation. I can't import COCO and COCOEvalCap, even though I downloaded the git submodules and ran the setup described in the README files.

Huanyu2019 commented 1 year ago

@vinevix Hello, did you solve the "NameError: name 'COCOEvalCap' is not defined" error yet? I still get it after reinstalling.

ruotianluo commented 1 year ago

Sync to master and you should get a better warning message.

Huanyu2019 commented 1 year ago

Thank you very much. After I reinstalled coco-caption, evaluation runs normally.
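For anyone hitting the same NameError, a quick way to confirm whether the evaluation classes are importable is sketched below. This is not the repository's own code: try_import is a hypothetical helper, and the module paths pycocotools.coco and pycocoevalcap.eval are assumptions based on the usual coco-caption layout, so adjust them if your install differs.

```python
import importlib
import warnings


def try_import(module_name, attr_name):
    """Return module_name.attr_name, or None with a warning if the import fails."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        warnings.warn(
            f"Could not import {module_name} ({exc}). "
            "Check that coco-caption is installed in this environment."
        )
        return None
    return getattr(module, attr_name, None)


# Assumed module paths; they are not confirmed anywhere in this thread.
COCO = try_import("pycocotools.coco", "COCO")
COCOEvalCap = try_import("pycocoevalcap.eval", "COCOEvalCap")

print("COCO available:", COCO is not None)
print("COCOEvalCap available:", COCOEvalCap is not None)
```

If either line prints False, the import is failing quietly somewhere and the name is never defined, which would match the NameError reported during evaluation.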

ruotianluo / ImageCaptioning.pytorch

Error during training #172