vis-nlp / UniChart

MIT License
59 stars · 9 forks

About batch size #14

Closed dahlian00 closed 5 months ago

dahlian00 commented 5 months ago

Thank you for your excellent research!

I tried your fine-tuning code on 4 A100 GPUs. However, the maximum batch size I can use is 8; anything larger results in a CUDA out-of-memory error.

According to the paper, a batch size of 24 was used with 4 V100 GPUs.

I didn’t change your code; I just want to confirm the batch size you used for fine-tuning.

dahlian00 commented 5 months ago

I noticed it's written as `parser.add_argument('--batch-size', type=int, default=2, help='Batch Size for the model')`, but looking at the code, it seems the batch size is specified per GPU rather than globally.
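If that reading is right, the number reported in the paper would be the effective (global) batch size. A minimal sketch of the relationship, assuming standard data parallelism where each GPU processes its own mini-batch (the `num_gpus` variable here is hypothetical, not a flag from the repo):

```python
import argparse

# Sketch: a per-GPU --batch-size flag, as in the quoted add_argument call.
parser = argparse.ArgumentParser()
parser.add_argument('--batch-size', type=int, default=2,
                    help='Batch Size for the model (per GPU)')
args = parser.parse_args(['--batch-size', '8'])  # 8 fits on an A100 here

# Under data parallelism, each GPU gets its own mini-batch, so the
# effective batch size is the per-GPU size times the number of GPUs.
num_gpus = 4  # hypothetical: 4x A100 as described in this thread
effective_batch_size = args.batch_size * num_gpus
print(effective_batch_size)  # 8 per GPU x 4 GPUs -> 32 effective
```

Under this interpretation, a per-GPU batch size of 8 on 4 GPUs already gives an effective batch size of 32, larger than the 24 reported in the paper.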

Sorry for asking before taking a closer look at the code.