LZH101 opened this issue 7 months ago
Feel free to come and share ideas! @LZH101 I've updated the config file for single-card training, along with the weights and accuracies obtained from the corresponding run. We do observe a different accuracy from the 4-card setup: the reduced batch size on a single card makes the pre-training more prone to overfitting. Our accuracy at 350 epochs is 85.81, and if we instead use the 300-epoch weights for linear evaluation, the accuracy is 86.45; both show some decrease compared with multi-card training. These updated weights and results are available for reference. Adding some regularization or using gradient accumulation might mitigate the accuracy drop caused by overfitting, but this part has not been validated.
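For reference, here is a minimal gradient-accumulation sketch of the kind of change mentioned above. It assumes a standard PyTorch pre-training loop; `model`, `optimizer`, `loader`, and `accum_steps` are placeholder names, not the repo's actual code, and the idea is simply to accumulate gradients over several single-card mini-batches so the effective batch size approaches the original 4-card one before each optimizer step.

```python
# Hypothetical gradient-accumulation loop (sketch only, not the repo's training script).
import torch

accum_steps = 4  # assumption: 4 x single-card batch ~ original multi-card effective batch

model.train()
optimizer.zero_grad()
for step, (images, _) in enumerate(loader):
    images = images.cuda(non_blocking=True)
    loss = model(images)              # assumes the model's forward returns the pre-training loss
    (loss / accum_steps).backward()   # scale so accumulated gradients match one large batch
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```

Whether this recovers the multi-card accuracy is untested here; batch-norm statistics and the learning-rate schedule would still see the smaller per-step batch, so it is only a partial substitute for true multi-card training.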
I don't have any further questions, and I'm sorry for entering the wrong information earlier.