youngfly11 closed this issue 3 years ago.
Sorry for the late response due to the holiday season. Yes, basically you can follow the adversarial training code in train_vqa_adv.py to get adversarial pre-training working. We also plan to release the pre-training code. Thanks for the reminder; please stay tuned, we will get this done asap.
Meanwhile, you can also try it yourself. There is nothing specific you need to worry about: basically, follow the pre-training configuration file from the UNITER code base, and then add the adversarial-training-related hyper-parameters. Hope it helps!
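If it is useful, a rough sketch of the config change could look like the snippet below (untested; the config path and the exact key names are only illustrative, so double-check them against train_vqa_adv.py and the released adversarial fine-tuning config in your checkout):

```python
# Rough sketch (untested): start from a UNITER pre-training config and add
# the adversarial-training hyper-parameters. The config file name and the
# exact keys below may differ from the released code -- double-check them
# against train_vqa_adv.py and the adversarial fine-tuning config.
import json

with open("config/pretrain-indomain-base-8gpu.json") as f:  # your UNITER pre-training config
    opts = json.load(f)

opts.update({
    "adv_training": True,
    "adv_modality": ["text", "image"],  # which embeddings to perturb
    "adv_lr_txt": 1e-3,                 # step size for the text perturbation
    "adv_lr_img": 1e-3,                 # step size for the image perturbation
    "adv_steps": 3,                     # inner "free" adversarial steps per batch
    "adv_init_mag": 0,                  # magnitude of the initial random perturbation
    "norm_type": "l2",                  # norm used for the perturbation
    "adv_max_norm": 0,                  # 0 = do not project the perturbation
    "adv_kl_weight": 1.5,               # weight of the KL consistency term
})

with open("config/pretrain-indomain-base-8gpu-adv.json", "w") as f:
    json.dump(opts, f, indent=4)
```

Everything else can follow the standard UNITER pre-training setup.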
Best, Zhe
Hi Zhe, thanks for your response. I have a follow-up question. When I run the pre-training code in this VILLA repo, training is very slow with the default setting (worker=4), and the GPU utilization is very low. When I set worker=8 or higher, it raises the following error. I am wondering whether you observed the same behavior during training. How fast was your pre-training?
Thanks for trying our code. Empirically, we did not run into the problem you mentioned. How low is your GPU utilization?
We ran the pre-training code on our internal Microsoft GPU clusters and did not observe low utilization. It may be caused by your RAM size, disk speed, or other constraints. When you tried the fine-tuning code, did you also see the same low-utilization problem? Thanks.
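One quick way to check whether data loading (rather than the model) is the bottleneck is to measure how many batches per second the DataLoader alone can deliver with different worker counts. The sketch below is plain PyTorch and not tied to our code base, just an illustration:

```python
# Generic sketch: measure DataLoader throughput in isolation. If this number
# is well below the rate at which the GPUs consume batches during a
# forward/backward pass, data loading is the bottleneck.
import time
from torch.utils.data import DataLoader

def loader_throughput(dataset, num_workers, n_batches=100):
    """Report how many batches/s the DataLoader alone can deliver."""
    loader = DataLoader(dataset, batch_size=32, num_workers=num_workers,
                        pin_memory=True)
    start, seen = None, 0
    for i, _ in enumerate(loader):
        if i == 0:
            start = time.time()  # skip worker start-up cost
            continue
        seen += 1
        if seen >= n_batches:
            break
    print(f"workers={num_workers}: {seen / (time.time() - start):.1f} batches/s")
```

If the throughput stays low even with more workers, faster storage or more RAM for caching is more likely to help than further increasing the worker count.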
Best, Zhe
Hi Zhe,
Thanks for your excellent work. I recently wanted to reproduce some results from VILLA and run pre-training on in-domain datasets. I am curious whether it is possible to simply adapt the adversarial training code in train_vqa_adv.py to the pre-training stage. Is there any specific configuration for adversarial training in the pre-training stage?