LukeForeverYoung / UReader

Training and benchmark results on V100 #10

Open · yuyq96 opened this issue 7 months ago

yuyq96 commented 7 months ago

Thank you for open-sourcing the data and code for UReader. I trained the model with scripts/train_it_v100.sh, but I was unable to reproduce the benchmark results.

Pretrained checkpoint: MAGAer13/mplug-owl-llama-7b

Training loss curve: (see the attached train_loss plot)

Benchmark results:

|  | DocVQA | InfoVQA | DeepForm | KLC | WTQ | TabFact | ChartQA |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Official | 65.4 | 42.2 | 49.5 | 32.8 | 29.4 | 67.6 | 59.3 |
| Replication on V100 | 50.6 | 32.1 | 21.8 | 28.1 | 19.5 | 64.2 | 46.0 |

I noticed that the micro batch size settings differ between the A100 and V100 scripts, which leads to a different reduced loss and might affect training (see the sketch at the end of this comment). Other differences between the script and paper include:

@LukeForeverYoung Have you tried completing the training on V100? Could you please verify the loss curve and these results? Thanks!
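
For reference, here is a minimal sketch of why the reduced loss can depend on the micro batch size. Everything in it is made up for illustration (the token losses, token counts, and micro batch sizes are not taken from the UReader scripts): when the loss is a token-level mean within each micro batch and gradient accumulation then averages over micro batches, grouping the same samples into different micro batch sizes yields a different effective objective.

```python
import torch

# Hypothetical per-sample summed token losses and token counts for one
# global batch of 8 samples (illustrative numbers only, not UReader data).
tok_loss = torch.tensor([12.0, 3.0, 40.0, 2.0, 21.0, 1.0, 9.0, 14.0])
tok_cnt  = torch.tensor([ 6.0, 3.0, 20.0, 4.0,  7.0, 2.0, 6.0,  7.0])

def reduced_loss(micro_bs):
    # Token-mean loss within each micro batch, then a plain mean over
    # micro batches -- the usual gradient-accumulation reduction.
    losses = tok_loss.split(micro_bs)
    counts = tok_cnt.split(micro_bs)
    per_micro = [l.sum() / c.sum() for l, c in zip(losses, counts)]
    return torch.stack(per_micro).mean()

print(reduced_loss(4))                  # larger micro batches  -> ~1.886
print(reduced_loss(2))                  # smaller micro batches -> ~1.908
print(tok_loss.sum() / tok_cnt.sum())   # single global token mean -> ~1.855
```

So even with identical data and the same global batch size, the logged loss curve (and the effective weighting of samples in the gradient) can drift apart between the A100 and V100 configurations.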

bellos1203 commented 2 months ago

Hey @yuyq96, I got similar results. Have you resolved the issue?

yuyq96 commented 2 months ago

> Hey @yuyq96, I got similar results. Have you resolved the issue?

Unfortunately, I wasn't able to replicate the results on a V100 using the official settings, and I don't have access to an A100 either. We have since been experimenting with quite different training settings for our own model.