Lightning-AI / lightning-thunder

Make PyTorch models up to 40% faster! Thunder is a source-to-source compiler for PyTorch. It enables using different hardware executors at once, across one or thousands of GPUs.
Apache License 2.0

Move `copy` for arg's in-place update right after its last consumer #693

Closed (crcrpar closed 3 months ago)

crcrpar commented 4 months ago
Would it make sense to try to move the copy as early as possible by analyzing usage, instead of defaulting to the end, since this will have memory implications?

Maybe we can open an issue for this. Also let's add an issue with the proposal for batchnorm.

Originally posted by @lantiga in https://github.com/Lightning-AI/lightning-thunder/pull/675#pullrequestreview-2147432669

IvanYashchuk commented 4 months ago

I think this reordering happens because fusion passes generally sort operations topologically and group them together. In practice, I don't think this buffer tensor lives long in the program, but we need to check what actually happens.

crcrpar commented 4 months ago

`functionalize_inplace_ops` puts all the required `prims.copy_` calls right before the return statement. Luca and I were thinking about moving each copy to right after the last consumption of its destination.
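To make the idea concrete, here is a rough sketch on a toy trace, not Thunder's actual IR or pass. The `Op` class, the `sink_copies_early` name, and the `(source, destination)` argument order of the toy `copy_` are all assumptions for illustration. Each copy is placed after the later of two ops: the last one that reads its destination, and the one that produces its source.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Op:
    """Toy stand-in for a bound symbol in a trace (hypothetical, not Thunder's class)."""
    name: str
    inputs: Tuple[str, ...]
    output: Optional[str] = None


def sink_copies_early(trace):
    """Move each trailing copy_ to right after the last consumer of its destination.

    Assumes a toy copy_ takes (source, destination) and that every copy_
    sits just before the single return op, as described in this thread.
    """
    copies = [op for op in trace if op.name == "copy_"]
    body = [op for op in trace if op.name not in ("copy_", "return")]
    tail = [op for op in trace if op.name == "return"]

    # Maps an index into `body` to the copies that go right after that op.
    # Index -1 means "before the first op" (destination never read, source is an input).
    placed = {}
    for cp in copies:
        src, dst = cp.inputs
        anchor = -1
        for i, op in enumerate(body):
            # The copy must follow every read of the old destination value
            # and the op that produces the value being copied.
            if dst in op.inputs or op.output == src:
                anchor = i
        placed.setdefault(anchor, []).append(cp)

    out = list(placed.get(-1, []))
    for i, op in enumerate(body):
        out.append(op)
        out.extend(placed.get(i, []))
    return out + tail


if __name__ == "__main__":
    # Functionalized form of: a.add_(b); t1 = a * b; t2 = exp(t1); return t2
    trace = [
        Op("add", ("a", "b"), "t0"),
        Op("mul", ("t0", "b"), "t1"),
        Op("exp", ("t1",), "t2"),
        Op("copy_", ("t0", "a")),
        Op("return", ("t2",)),
    ]
    print([op.name for op in sink_copies_early(trace)])
    # ['add', 'copy_', 'mul', 'exp', 'return']
```

In this toy example the copy lands right after `add`, because nothing later reads the old value of `a`. Whether this actually shortens the lifetime of the intermediate buffer depends on how later fusion passes reorder the trace, which is the open question raised above.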