zhuhaozhe / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
https://pytorch.org

Partial support for in-place ops in TE #2

Closed zhuhaozhe closed 2 years ago

zhuhaozhe commented 2 years ago

Summary

This PR aims to partially support in-place ops in TE. This enables TE fusion for patterns like "at::conv, at::relu" and "at::add, at::relu, at::add", which gives better performance.
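As a rough illustration (not code from this PR), a pattern in the spirit of "at::add, at::relu, at::add" with an in-place relu might look like the scripted function below; without this change, the in-place op blocks TE fusion of the chain.

import torch

@torch.jit.script
def add_relu_add(a, b, c):
    x = a + b        # out-of-place add creates a fresh intermediate
    x.relu_()        # in-place relu on that intermediate
    return x + c     # second add; the whole chain is a TE fusion candidate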

Options

Option 1: Replace the in-place op with its out-of-place equivalent

Option 2: Lower the body in terms of the in-place op directly

We choose Option 1 in this PR; we can consider Option 2 if we observe that Option 1 fails in many real-world scenarios.
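Sketching what Option 1 means at the source level (illustrative only, under the assumption that the mutated tensor is a fresh intermediate rather than a graph input):

# Before: the in-place relu keeps a mutation in the graph.
def before(a, b):
    x = a + b
    x.relu_()        # aten::relu_
    return x

# After the Option 1 rewrite: the mutation is replaced by an out-of-place op
# whose result takes over all later uses of x.
def after(a, b):
    x = a + b
    x = x.relu()     # aten::relu, no mutation left
    return x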

Implementation Details

TE in-place: we extend the behavior of the operator-supported check and TryMerge.

In the operator-supported check, we create an out-of-place node to pass the check and destroy it once the check is done. In TryMerge, after all checks pass, we replace an in-place op with its out-of-place version.
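One hedged way to observe the end result from Python (using existing debug hooks such as torch._C._jit_override_can_fuse_on_cpu and graph_for, which are not part of this PR and whose behavior may differ across versions) is to check whether the optimized graph contains a prim::TensorExprGroup covering the chain:

import torch

torch._C._jit_set_texpr_fuser_enabled(True)   # make sure the TE fuser is on
torch._C._jit_override_can_fuse_on_cpu(True)  # allow TE fusion on CPU

@torch.jit.script
def fn(a, b):
    x = a + b
    x.relu_()        # in-place op the new logic may rewrite to aten::relu
    return x + b

a, b = torch.randn(8, 8), torch.randn(8, 8)
for _ in range(3):   # warm-up runs so the profiling executor optimizes the graph
    fn(a, b)

# With this PR, the optimized graph is expected to show a single
# prim::TensorExprGroup for the add/relu/add chain; without it, the
# in-place relu_ splits the fusion group.
print(fn.graph_for(a, b))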

Whether an in-place op can be replaced safely depends on the behavior of RemoveTensorMutation. The two cases below will not be replaced.

# Case 1: relu_ mutates the graph input a, so the mutation is visible to the caller.
def fn(a, b):
    return a.relu_() + b.relu()

# Case 2: c is read by sigmoid() between its creation and the relu_ mutation,
# so the pass conservatively keeps the in-place op.
def fn(a, b):
    c = a + b
    return c.sigmoid().add(c.relu_())
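To check this directly, one can run the mutation-removal pass on scripted graphs. A minimal sketch, assuming the torch._C._jit_pass_remove_mutation binding (which wraps the C++ RemoveTensorMutation pass) is available in the build:

import torch

@torch.jit.script
def safe(a, b):
    c = a + b
    c.relu_()                     # mutates a fresh intermediate: removable
    return c

@torch.jit.script
def unsafe(a, b):
    return a.relu_() + b.relu()   # mutates graph input a: not removable

for f in (safe, unsafe):
    g = f.graph
    torch._C._jit_pass_remove_mutation(g)
    # After the pass, `safe` is expected to contain aten::relu, while `unsafe`
    # still contains aten::relu_.
    print(g)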

For the unit test