Closed xiapengchng closed 1 year ago
Thank you for your awesome work! Could you share the training details for the diffuse heads, such as the learning rate, batch size, and optimizer? I would like to reimplement the training pipeline.
Thank you, and sorry for the late reply!
Learning rate: 5e-5, decayed by a factor of 0.8 until it drops below 1e-6. Batch size: 10 per GPU. Optimizer: Adam.
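
For anyone reimplementing this, here is a minimal sketch of what those learning-rate numbers imply, in plain Python. It only enumerates the values obtained by repeatedly multiplying 5e-5 by 0.8 until the result falls below 1e-6. The function name `decay_schedule` is hypothetical, and the reply does not say what triggers each decay (every epoch, every N steps, or a validation plateau), so treat the trigger as an open assumption.

```python
def decay_schedule(initial_lr=5e-5, factor=0.8, min_lr=1e-6):
    """List the learning rates visited by multiplicative decay.

    Assumption: each decay multiplies the current rate by `factor`,
    and decay stops once the rate has dropped below `min_lr`.
    When a decay is triggered is NOT specified in the thread.
    """
    lrs = [initial_lr]
    while lrs[-1] >= min_lr:
        lrs.append(lrs[-1] * factor)
    return lrs


schedule = decay_schedule()
num_decays = len(schedule) - 1  # number of times the factor was applied

print(f"decays needed: {num_decays}")
print(f"last rate above the floor: {schedule[-2]:.3e}")
print(f"first rate below the floor: {schedule[-1]:.3e}")
```

With these values the rate crosses 1e-6 on the 18th decay. In PyTorch this would likely correspond to `torch.optim.Adam` with `lr=5e-5` plus a scheduler such as `ReduceLROnPlateau(factor=0.8, min_lr=1e-6)` or a step-based one, but which scheduler the authors used is not stated here.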