Open zjcs opened 6 months ago
The gradients of x and y in the log are wrong; the correct results should be:
y.grad: tensor([0.]) -> tensor([1.])
x.grad: tensor([ 0., 4., 8., 12.]) -> tensor([0., 2., 4., 6.])
Taichi is accumulating directly into the gradient tensor. For correct interop behavior with PyTorch, you need to declare new zeroed gradient tensors, pass them into Taichi, and then return those.
Relates to #8339 - IMO Taichi ideally should not touch the .grad attribute at all, and should use some other attribute or method to pass gradients around.
If you are careful, you can replace the .grad tensor with zeros before the Taichi grad kernel call, then afterwards restore whatever was in .grad, and it works; see the sketch below.
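A minimal sketch of that save/zero/restore workaround, assuming a Taichi kernel `fwd(x, y)` (a hypothetical name, not from the report) wrapped in a `torch.autograd.Function` like the one in the reproduction below; only the corrected `backward` is shown:

```python
import torch

# Drop-in replacement for the backward() of a torch.autograd.Function whose
# forward ran a Taichi kernel `fwd(x, y)` (hypothetical name) and saved the
# tensors with ctx.save_for_backward(x, y).
@staticmethod
def backward(ctx, grad_y):
    x, y = ctx.saved_tensors
    saved = x.grad                   # stash whatever PyTorch has accumulated so far
    x.grad = torch.zeros_like(x)     # hand Taichi a zeroed buffer instead
    y.grad = grad_y                  # seed the upstream gradient for Taichi
    fwd.grad(x, y)                   # Taichi accumulates dy/dx into the zeroed buffer
    taichi_grad = x.grad
    x.grad = saved                   # restore the original .grad
    return taichi_grad               # PyTorch's engine performs the accumulation itself
```

This keeps Taichi's in-place accumulation out of the value PyTorch's engine later adds to `x.grad`, so each gradient is counted only once.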
Describe the bug
Wrong gradients when using Taichi autodiff (kernel.grad) together with a PyTorch torch.autograd.Function.
To Reproduce
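The original snippet was not preserved in this extraction; the following is a hypothetical reconstruction that matches the numbers reported above (x = [0, 1, 2, 3], y = sum(x**2), analytic gradient 2*x), with illustrative names `square_sum` / `SquareSum`:

```python
import taichi as ti
import torch

ti.init(arch=ti.cpu)

@ti.kernel
def square_sum(x: ti.types.ndarray(), y: ti.types.ndarray()):
    # y[0] = sum_i x[i]**2, so the analytic gradient is dy/dx[i] = 2 * x[i]
    for i in range(x.shape[0]):
        y[0] += x[i] ** 2

class SquareSum(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        y = torch.zeros(1)
        square_sum(x, y)
        ctx.save_for_backward(x, y)
        return y

    @staticmethod
    def backward(ctx, grad_y):
        x, y = ctx.saved_tensors
        x.grad = torch.zeros_like(x)  # buffer Taichi accumulates into
        y.grad = grad_y
        square_sum.grad(x, y)         # writes dy/dx straight into x.grad
        # Returning x.grad makes PyTorch's engine add it to x.grad a second
        # time, which is how the values end up doubled.
        return x.grad

x = torch.arange(4, dtype=torch.float32, requires_grad=True)
y = SquareSum.apply(x)
y.backward()
print(x.grad)  # observed: tensor([ 0., 4., 8., 12.]); expected: tensor([0., 2., 4., 6.])
```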
Log/Screenshots
Additional comments