Open megodoonch opened 1 year ago
My student just reported to me that the version he downloaded a few weeks ago didn't have this problem (but did have this line of code), so I'm thinking this is a bug for the fork instead?
I'm not sure if you can leave it out. However, if that line is the in-place operation that the error complains about, then writing
m[range, g] = m[range, g] - 1.0
should do the trick, see here.
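A minimal sketch of that rewrite (the names m, rng, and g are stand-ins; the real tensors in cle_loss aren't shown in the thread). The subtraction itself now happens out of place, although the indexed assignment still writes into m:

```python
import torch

# Stand-in tensors: a score matrix, a row range, and gold indices.
m = torch.zeros(3, 4)
rng = torch.arange(3)        # plays the role of `range` in the snippet above
g = torch.tensor([0, 2, 1])

# In-place subtraction that autograd may reject on a view:
# m[rng, g] -= 1.0
# Suggested out-of-place form:
m[rng, g] = m[rng, g] - 1.0

print(m[0, 0].item(), m[1, 2].item(), m[2, 1].item())  # → -1.0 -1.0 -1.0
```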
I don't know why this fails in the pull request and not in the earlier experiment of your student. When I get around to reviewing the pull request (hopefully this week), I'll look into it.
Seems all I have to do is make a bug report and I solve the problem myself. All it takes is enough Googling...
The Dockerfile doesn't pin the PyTorch version, so it uses the latest. The instructions say to use 1.1, so I modified the Dockerfile and now it works.
I didn't realise your suggestion was no longer an in-place modification, so I didn't try it.
Is there any reason not to use the highest version of everything that doesn't throw errors? Should I stick to the versions in the instructions?
Note: changing to m[range, g] = m[range, g] - 1.0
isn't enough (same error), but using an older version of PyTorch is; probably anything earlier than 1.6, since that's the version at which the warning says the behaviour becomes forbidden.
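Since the error only appears on newer PyTorch, a runtime version check can make the assumption explicit. A hedged sketch (1.6 is the cutoff named in the deprecation warning below, not an independently verified bound):

```python
import torch

# Parse the major/minor version; torch.__version__ may carry a local
# suffix such as "+cu118", but the first two components are plain ints.
major, minor = (int(p) for p in torch.__version__.split(".")[:2])

if (major, minor) >= (1, 6):
    print("in-place writes to custom-Function view outputs raise a RuntimeError")
else:
    print("older PyTorch: only a UserWarning (gradients may still be wrong)")
```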
So this seems to be a similar issue to this one: https://stackoverflow.com/questions/67768535/getting-this-warning-output-0-of-backwardhookfunctionbackward-is-a-view-and-is
Back then it was just a UserWarning, and now it's a RuntimeError.
UserWarning: Output 0 of BackwardHookFunctionBackward is a view and is being modified inplace. This view was created inside a custom Function (or because an input was returned as-is) and the autograd logic to handle view+inplace would override the custom backward associated with the custom Function, leading to incorrect gradients. This behavior is deprecated and will be forbidden starting version 1.6. You can remove this warning by cloning the output of the custom Function.
I reopened and re-tagged this issue because the warning does say that it leads to incorrect gradients. So this line of code might not be doing what it was intended to do.
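As a hedged illustration of what the warning is about (this is not the actual cle_loss code), a self-contained sketch that reproduces the view+inplace problem with a function returning multiple views, and fixes it by cloning, as the warning suggests:

```python
import torch

x = torch.randn(4, 3, requires_grad=True)
a, b = x.chunk(2)      # chunk returns multiple views of x

# b[0, 0] -= 1.0       # would raise: "...is a view and is being modified
#                      #  inplace... a function that returns multiple views..."

b = b.clone()          # cloning breaks the view relationship,
b[0, 0] -= 1.0         # so the in-place write is now allowed

# Gradients still flow through the clone; subtracting a constant
# doesn't change them, so d(loss)/dx is 1 everywhere.
(a.sum() + b.sum()).backward()
print(torch.equal(x.grad, torch.ones(4, 3)))  # → True
```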
I get a RuntimeError when training on the toy corpus in example/:

File graph_dependency_parser/components/cle.py, line 85, in cle_loss
RuntimeError: Output 0 of SliceBackward0 is a view and is being modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one.
If I comment this line out, training goes through. I'm using the dependency versions from @tsimafeip's Docker fork. The current main branch has the same line of code, though.
It kind of looks like it's not crucial; anyone know if I can just leave it out for now?
Here's my training log: training-error.log