Hi, I am experimenting with the NIMA implementation for a scientific project! In your Readme, you say: "The learning rate setting differs from the original paper. I can't seem to get the model to converge with momentum SGD using an lr of 3e-7 for the conv base and 3e-6 for the dense block." Which settings did you use? The defaults in the argparser are set to 3e-7 and 3e-6, which is why I was wondering!
Thank you :)