namisan / mt-dnn

Multi-Task Deep Neural Networks for Natural Language Understanding
MIT License
2.22k stars 412 forks

An issue about how to make a perturbation in perturbation.py #214

Open TAYAyuki opened 3 years ago

TAYAyuki commented 3 years ago

Hi, I am using SMART. In the SMART paper, the gradient is first divided by its infinity norm, but in perturbation.py the gradient is first multiplied by step_size, and then the sum of the initial noise and the scaled gradient is divided by its infinity norm. According to the algorithm in the paper, I think only the gradient should be divided by its infinity norm, and that result should then be added to the initial noise. Could you let me know which is correct?
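
To make the difference concrete, here is a minimal sketch of the two update orders being compared. The function names (`paper_update`, `code_update`) and the exact normalization details (last-dim infinity norm, `eps` for stability) are my own assumptions for illustration, not the actual code in perturbation.py:

```python
import torch

def paper_update(noise, grad, step_size, eps=1e-6):
    # Reading of the SMART paper's algorithm (assumed):
    # normalize the gradient by its infinity norm first,
    # then add the scaled, normalized gradient to the existing noise.
    grad_norm = grad.abs().max(dim=-1, keepdim=True)[0] + eps
    return noise + step_size * (grad / grad_norm)

def code_update(noise, grad, step_size, eps=1e-6):
    # Reading of perturbation.py as described in the question (assumed):
    # scale the raw gradient by step_size, add it to the noise,
    # then normalize the resulting sum by its infinity norm.
    updated = noise + step_size * grad
    updated_norm = updated.abs().max(dim=-1, keepdim=True)[0] + eps
    return updated / updated_norm
```

In the first version only the gradient is normalized and the noise keeps its original scale; in the second the whole perturbation (noise plus scaled gradient) is rescaled to unit infinity norm, which is the discrepancy the question is about.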