[Closed] roimehrez closed this issue 7 years ago
You write in the paper that you rescale {lambda_l} after 100 epochs.
Thanks
I train for 200 epochs at 256p, then fine-tune for 20 epochs at 512p and for 5 epochs at 1024p.
Yes, the main purpose is to let P0-P5 have similar contributions to the loss function.
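A minimal sketch of that rescaling idea, assuming the weights {lambda_l} are set inversely proportional to each layer's observed loss magnitude so the weighted terms P0-P5 end up roughly equal (the measured magnitudes below are made-up placeholders, not values from the paper):

```python
def rescale_lambdas(layer_losses):
    """Return per-layer weights so that weight * loss is equal across layers.

    Uses the first layer (P0) as the reference magnitude.
    """
    reference = layer_losses[0]
    return [reference / loss for loss in layer_losses]

# Hypothetical average per-layer loss magnitudes observed after the
# warm-up epochs, for P0 .. P5.
observed = [0.5, 1.2, 2.4, 4.8, 9.6, 19.2]

lambdas = rescale_lambdas(observed)
weighted = [w * l for w, l in zip(lambdas, observed)]
# Every weighted term now matches the P0 reference magnitude,
# so each of P0-P5 contributes similarly to the total loss.
```

This is only an illustration of balancing the contributions; the actual rescaling schedule and reference choice in the paper may differ.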