Open IvanYashchuk opened 2 months ago
Sounds reasonable! More context on this here
Another potential fix for this might be the use of Liger kernels. I think that's worth trying, but it requires creating a quick-and-dirty executor. Let's proceed with the remat for now, and I'll come back to try this out later.
🚀 Feature
Motivation
In the above snippet, `t5` is the output of the gelu function. The request is to implement a pass that forces recomputation of the gelu function in the backward pass instead of saving this intermediate tensor. Ongoing PR: https://github.com/Lightning-AI/lightning-thunder/pull/1003.
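To make the trade-off concrete, here is a minimal scalar sketch in plain Python of what recomputing gelu in the backward pass means. It is not Thunder's rematerialization pass and not the code from the PR. The helpers `forward` and `backward`, the weight `w`, and the choice of a multiplication as the consumer of `t5` are all hypothetical, picked only for illustration. The point is that the forward pass saves the gelu input rather than its output, and the backward pass rebuilds `t5` from that input.

```python
import math


def gelu(x: float) -> float:
    """Exact (erf-based) GELU: x * Phi(x)."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_grad(x: float) -> float:
    """d/dx gelu(x) = Phi(x) + x * phi(x)."""
    cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return cdf + x * pdf


def forward(x: float, w: float):
    """Hypothetical op y = gelu(x) * w, standing in for a consumer of t5."""
    t5 = gelu(x)  # intermediate that would normally be saved
    y = t5 * w
    saved = (x, w)  # t5 is deliberately NOT saved for backward
    return y, saved


def backward(saved, grad_y: float):
    """Recompute t5 from the saved input instead of reading it from memory."""
    x, w = saved
    t5 = gelu(x)  # recomputation: extra compute, less saved memory
    grad_w = grad_y * t5
    grad_x = grad_y * w * gelu_grad(x)
    return grad_x, grad_w
```

In a real trace the same idea applies per tensor: the gelu input is usually saved anyway for gelu's own backward, so dropping `t5` removes one activation-sized buffer at the cost of one extra elementwise kernel.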
Implementing gelu recomputation would resolve the OOM error seen in https://github.com/Lightning-AI/lightning-thunder/issues/246.