Hi,
first of all, thank you for the great repository and the efficient implementation. Has anyone else encountered a NaN loss caused by the sum operation in the TransitionUP module of decoder 5? After I removed the two linear layers, training works fine. Please comment if you have solved the issue.
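For anyone trying to narrow this down, here is a minimal sketch of how the first module that emits non-finite values could be located. It assumes the model is a PyTorch `nn.Module` (an assumption on my side), and the helper name `register_nan_hooks` is hypothetical, not part of the repository.

```python
import torch
import torch.nn as nn


def register_nan_hooks(model: nn.Module):
    """Attach forward hooks that record the names of submodules whose
    output contains NaN or Inf. Hypothetical helper for debugging only."""
    offenders = []  # names are appended in forward-execution order

    def make_hook(name):
        def hook(module, inputs, output):
            # A module may return a single tensor or a tuple/list of them.
            outs = output if isinstance(output, (tuple, list)) else (output,)
            for o in outs:
                if torch.is_tensor(o) and not torch.isfinite(o).all():
                    offenders.append(name)
                    break
        return hook

    # Skip the root module (empty name) so only submodules are reported.
    handles = [
        m.register_forward_hook(make_hook(n))
        for n, m in model.named_modules()
        if n
    ]
    return offenders, handles
```

After one forward pass, the first entry in `offenders` is the earliest submodule with a non-finite output; calling `h.remove()` on each handle detaches the hooks again. If the NaN only appears in the backward pass, `torch.autograd.set_detect_anomaly(True)` may help instead.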