pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
https://pytorch.org

Measure impact of JIT decompositions, reconsider the design #85513

Open zou3519 opened 2 years ago

zou3519 commented 2 years ago

We landed https://github.com/pytorch/pytorch/pull/84976. What it does is:

Unfortunately this has been causing some problems:

Alternatively, it would not be too difficult to call the Python decompositions directly from C++ instead of relying on TorchScript to shepherd the code through. This may be a better design in the long term; the question is whether we should do anything about this issue in the short term.

cc @soulitzer @malfet

zou3519 commented 2 years ago

As it pertains to 1.13:

Not sure if that's significant (10%?), but an easy fix is to lazily load the decomps for jvp the first time forward-mode AD is called.

soulitzer commented 2 years ago

That sounds pretty reasonable, and it does seem easy to do.

zou3519 commented 2 years ago

https://github.com/pytorch/pytorch/pull/85989 resolved the immediate item for 1.13, though we should still reconsider the design because it relies on TorchScript.