Closed AbnerCSZ closed 5 years ago
Could you tell me what the 100 means?
It was legacy code and removing the multiplier should not affect the training much. However, the learning rate might require some adjustment.
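To illustrate why the learning rate might need adjusting, here is a minimal sketch with a hypothetical one-parameter model, not the code from model.py. The names `scale`, `grad_wrt_w`, and `pred_after_step` are made up for this example, and plain SGD with a squared-error loss is assumed. Adaptive optimizers rescale gradients, so the effect there may be different.

```python
import math

# Hypothetical toy model: prediction = scale * w * x.
# `scale` stands in for a constant output multiplier such as the 100 asked about.

def grad_wrt_w(w, x, target, scale):
    """Gradient of the squared error (scale*w*x - target)**2 with respect to w."""
    pred = scale * w * x
    return 2.0 * (pred - target) * scale * x

def pred_after_step(w, x, target, scale, lr):
    """Prediction after one plain SGD step on w (assumed optimizer)."""
    w_new = w - lr * grad_wrt_w(w, x, target, scale)
    return scale * w_new * x

x, target = 2.0, 10.0
w_plain, w_scaled = 2.5, 0.025  # both configurations predict about 5.0

# With the multiplier, the gradient on w is `scale` times larger.
g_plain = grad_wrt_w(w_plain, x, target, scale=1.0)
g_scaled = grad_wrt_w(w_scaled, x, target, scale=100.0)

# The prediction moves scale**2 times further per step at the same learning rate,
# so dividing the learning rate by scale**2 gives a matching update in this toy case.
p_plain = pred_after_step(w_plain, x, target, scale=1.0, lr=1e-2)
p_scaled = pred_after_step(w_scaled, x, target, scale=100.0, lr=1e-2 / 100.0**2)
```

In this toy setting, removing the multiplier leaves the achievable predictions unchanged but changes the effective step size, which is consistent with the note above about retuning the learning rate.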
Thank you. By the way, why not add the smooth loss in the dense d mode (args.w2 = 0 when mode is d)? Have you compared the difference?
I believe I did the experiment but did not observe improvement. Please let me know if you observe otherwise.
At the end of model.py, could you tell me what the 100 means?