Hi, thanks for your interest.
- I double-checked the training script and found that the batch size is not 1152. We use 192 V100 GPUs for training with a batch size of 2 per GPU, so the total batch size is 384 (see the sketch after this list).
- We do full fine-tuning.
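For illustration only, here is a minimal sketch (not the authors' training script) of how the effective batch size works out in a typical data-parallel setup; the GPU count and per-GPU batch size are the numbers from the reply above, everything else is assumed.

```python
# Minimal sketch of the effective batch size in data-parallel training.
# Numbers taken from the reply above; the rest is illustrative.
num_gpus = 192            # 192 V100 GPUs (the data-parallel world size)
per_gpu_batch_size = 2    # samples processed by each GPU per step
global_batch_size = num_gpus * per_gpu_batch_size
print(global_batch_size)  # -> 384, the total batch size reported
```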
Thanks for your reply. I would like to ask two more questions:
- What are $N_{SA}$ and $N_{FA}$ in the paper?
Question
Great work!