Open bai-24 opened 1 year ago
Thanks for asking. Where does the number 566435 come from in the code? By the way, every image in COCO has 5 captions, so 566435 / 5 = 113287 images, which are the (train + restval) images.
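The arithmetic above can be sanity-checked with a tiny sketch. (The split names and counts here follow the reply: 113287 train + restval images, and the standard 5 captions per COCO image.)

```python
# Sanity check: each COCO image comes with 5 reference captions,
# so the number of training samples is images * captions_per_image.
num_images = 113287          # train + restval images
captions_per_image = 5       # every COCO image has 5 captions
num_pairs = num_images * captions_per_image
print(num_pairs)             # 566435 image-caption pairs
```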
Dear Author, I attached a screenshot of the problem above. The training data volume is 566435, but I don't know what it represents.
Thanks for reporting. Could you upload the screenshot again? I can't see it (it takes a few seconds for a picture to finish uploading on GitHub).
I still have no idea why it shows 566435 iterations. Can you provide more information: your config.yaml, how many GPUs you used, the batch size, etc.?
If you use batch_size = 1, then it may be correct, as there are 566435 (image, caption) pairs.
The number of GPUs I used is 1, batch_size is 4.
Then could you check your dataloader, or hard-code batch_size = 4 in it? I believe that with batch_size = 4 you should see 566435 / 4 iterations, so something may be wrong there. If possible, could you send me your fork/code? I will check it tomorrow after I finish my work.
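The expected iteration count per epoch can be sketched as follows (a rough check only; it assumes the dataloader keeps the final partial batch, which is the usual default):

```python
import math

def iters_per_epoch(num_pairs: int, batch_size: int) -> int:
    # One iteration consumes one mini-batch; the last batch may be partial,
    # so we round up.
    return math.ceil(num_pairs / batch_size)

print(iters_per_epoch(566435, 1))  # 566435 iterations with batch_size = 1
print(iters_per_epoch(566435, 4))  # 141609 iterations with batch_size = 4
```

So seeing the full 566435 as the iteration count strongly suggests the dataloader is effectively running with batch_size = 1.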
I think I found the reason: the batch_size I set in the code is 1. Thank you very much for your help.
> The number of GPUs I used is 1, batch_size is 4.
How long does training take with your settings?
Dear Author, the number of training images in the COCO dataset is 82783, but the number of training samples in the code is 566435, which greatly increases training time. Why is it done this way?