Hello, author. I downloaded your code and ran it on the ATLAS V2 data, but my final test results are quite different from those reported in the paper. I compared the parameter settings in the code against the paper: the optimizer is SGD, with the learning rate given as 1e-6 in one and 1e-4 in the other. I changed the learning rate in the code to match the paper and retrained, but the results still differ. Do you have any suggestions? Is there anything besides the learning rate that I should pay attention to?