Yes. I am working on extending the model for future usage, which is how I noticed the bug here: when I modify the edge feature, embedding index 0 can appear as a real value in the edge feature, before the masked (padding) range (e.g., when there is no covalent bond between two atoms).
If I understand correctly, it might be solved by adding an index-shift.
I don't understand what an index-shift is, haha.
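For illustration, a minimal sketch of what an index-shift could look like. The tensor names, sizes, and the padding-mask convention below are hypothetical, not Graphormer's actual ones: every real feature value is moved up by 1 so that index 0 is reserved exclusively for padding.

```python
import torch
import torch.nn as nn

# Hypothetical edge-feature tensor: a raw value of 0 can mean a real edge
# type ("no covalent bond"), which collides with the padding index 0.
edge_features = torch.tensor([[0, 2, 5], [1, 0, 3]])
pad_mask = torch.tensor([[False, False, True], [False, False, False]])

# Index-shift: real entries become value + 1; padding positions stay at 0,
# so index 0 now only ever means "padding".
shifted = torch.where(pad_mask, torch.zeros_like(edge_features), edge_features + 1)

# The embedding table needs one extra row to cover the shifted range.
num_edge_types = 6  # hypothetical vocabulary size of the raw features
embedding = nn.Embedding(num_edge_types + 1, 8, padding_idx=0)
out = embedding(shifted)  # padding rows embed to the all-zero vector
```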
Good point. The padding token will be non-zero, but it won't affect the self-attention calculation because of the padding attention bias (see here). If you are concerned about the potential influence on future model usage, you could modify the initialization as referenced here.
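A minimal sketch of such a modification, assuming a standard `nn.Embedding` (the sizes and the `std` value here are placeholders): re-zero the padding row after the custom initialization, which mirrors what PyTorch's own `reset_parameters` does for `padding_idx`.

```python
import torch
import torch.nn as nn

embedding = nn.Embedding(num_embeddings=64, embedding_dim=32, padding_idx=0)

# Custom (re-)initialization, a stand-in for the normal init in model.py.
nn.init.normal_(embedding.weight, mean=0.0, std=0.02)

# Restore the zero vector at the padding index so that padded feature
# slots contribute nothing to the summed embeddings.
with torch.no_grad():
    embedding.weight[embedding.padding_idx].fill_(0)
```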
This method solved the problem anyway.
Closing this issue due to inactivity. Feel free to raise a new one or reopen this one for any further questions.
https://github.com/microsoft/Graphormer/blob/740e6ff09a5de29d61def5ea6af7dfd04cee719e/graphormer/model.py#L20
When you re-initialize the embedding weights, the weight at index 0 is also drawn from the normal distribution, so the padding vector in the feature input becomes non-zero. That looks wrong.
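A small sketch illustrating the problem; the init call below is a stand-in for the linked line, not a copy of it.

```python
import torch.nn as nn

embedding = nn.Embedding(8, 4, padding_idx=0)
print(embedding.weight[0])  # all zeros, as padding_idx guarantees

# A blanket re-initialization like the one at the linked line overwrites
# the padding row along with everything else.
nn.init.normal_(embedding.weight, mean=0.0, std=0.02)
print(embedding.weight[0])  # no longer zero: padding now "leaks" into sums
```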