Hello, I recently studied your AdaFace loss function and your code framework. Is there anything special about the learning-rate schedule in your framework? It seems to use the `lr_scheduler.MultiStepLR()` method and also lacks warmup. Your explanation will help me better understand!
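For context, here is a minimal sketch of what I mean, assuming PyTorch's `SequentialLR`: chaining a short `LinearLR` warmup in front of `MultiStepLR`. The milestones, warmup length, and optimizer settings below are placeholders for illustration, not the values from your config.

```python
import torch
from torch import nn
from torch.optim.lr_scheduler import LinearLR, MultiStepLR, SequentialLR

# Hypothetical model/optimizer just for illustration; the real training
# setup in the AdaFace repo may differ.
model = nn.Linear(512, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# Plain MultiStepLR, as I understand the framework currently uses:
# the LR is multiplied by `gamma` at each milestone epoch.
# scheduler = MultiStepLR(optimizer, milestones=[12, 20, 24], gamma=0.1)

# One way to add warmup: ramp the LR linearly for the first few epochs,
# then hand over to MultiStepLR (whose milestones are then counted
# relative to its own start, i.e. the end of warmup).
warmup_epochs = 3
warmup = LinearLR(optimizer, start_factor=0.1, total_iters=warmup_epochs)
decay = MultiStepLR(optimizer, milestones=[12, 20, 24], gamma=0.1)
scheduler = SequentialLR(optimizer, schedulers=[warmup, decay],
                         milestones=[warmup_epochs])

for epoch in range(28):
    # ... one epoch of training ...
    scheduler.step()
    print(epoch, scheduler.get_last_lr())
```

Is there a reason the framework omits a warmup stage like this, or does MultiStepLR alone work well enough for AdaFace in practice?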