Open Guopeng1019 opened 4 years ago
I don't know why either. I checked my code and didn't find any bugs, so maybe you can help me. One thing I noticed is that gradient propagation in AdderNet is very difficult: the gradient of |W - X| is only 1 or -1, which is too small and hard to propagate to the deeper layers.
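To illustrate the gradient point, here is a minimal sketch, assuming the adder unit computes the negative L1 distance y = -sum(|x_i - w_i|) as described in the AdderNet paper. The function names (`adder_output`, `adder_grad_w`, `numeric_grad_w`) are made up for this example and are not taken from the repo's training code.

```python
def adder_output(x, w):
    """Assumed adder unit: negative L1 distance between input x and filter w."""
    return -sum(abs(xi - wi) for xi, wi in zip(x, w))


def sign(v):
    """Return -1, 0 or +1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def adder_grad_w(x, w):
    """Exact derivative of adder_output w.r.t. each w_i, i.e. sign(x_i - w_i)."""
    return [sign(xi - wi) for xi, wi in zip(x, w)]


def numeric_grad_w(x, w, eps=1e-6):
    """Central finite-difference check of the same derivative."""
    grads = []
    for i in range(len(w)):
        hi = list(w)
        lo = list(w)
        hi[i] += eps
        lo[i] -= eps
        grads.append((adder_output(x, hi) - adder_output(x, lo)) / (2 * eps))
    return grads


# Hypothetical toy values, chosen only for illustration.
x = [0.5, -1.2, 3.0]
w = [0.1, 0.4, 2.5]

analytic = adder_grad_w(x, w)
numeric = numeric_grad_w(x, w)
print(analytic)
print(numeric)
```

Each entry of the gradient has magnitude 1 no matter how far `w` is from `x`, so it carries only direction and no distance information. That is the behaviour being described here, not a diagnosis of the 50% accuracy.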
I used your training code, but the accuracy is only about 50%, while the paper reports 90%. Could you please tell me why?
I have run into a similar problem. On my dataset, the accuracy is 100% at the very beginning.