pytorch / torchdynamo

A Python-level JIT compiler designed to make unmodified PyTorch programs faster.
BSD 3-Clause "New" or "Revised" License

wrap_fx_proxy_cls doesn't cover torch.cuda.Stream #2021

Closed kehuanfeng closed 1 year ago

kehuanfeng commented 1 year ago

🐛 Describe the bug

I tried to compile my model and got the following error. The error message is fairly straightforward: the builder doesn't have a proper variable kind for torch.cuda.Stream. The forward of my model contains an explicit CUDA memory copy, which is why it needs to retrieve a CUDA stream.

  File "/opt/conda/lib/python3.8/site-packages/torch/_dynamo/symbolic_convert.py", line 537, in run
    and self.step()
  File "/opt/conda/lib/python3.8/site-packages/torch/_dynamo/symbolic_convert.py", line 500, in step
    getattr(self, inst.opname)(inst)
  File "/opt/conda/lib/python3.8/site-packages/torch/_dynamo/symbolic_convert.py", line 306, in wrapper
    return inner_fn(self, inst)
  File "/opt/conda/lib/python3.8/site-packages/torch/_dynamo/symbolic_convert.py", line 965, in CALL_FUNCTION
    self.call_function(fn, args, {})
  File "/opt/conda/lib/python3.8/site-packages/torch/_dynamo/symbolic_convert.py", line 434, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/opt/conda/lib/python3.8/site-packages/torch/_dynamo/variables/torch.py", line 444, in call_function
    tensor_variable = wrap_fx_proxy(
  File "/opt/conda/lib/python3.8/site-packages/torch/_dynamo/variables/builder.py", line 731, in wrap_fx_proxy
    return wrap_fx_proxy_cls(
  File "/opt/conda/lib/python3.8/site-packages/torch/_dynamo/variables/builder.py", line 919, in wrap_fx_proxy_cls
    raise AssertionError(
AssertionError: torch.* op returned non-Tensor Stream call_function <function _get_stream at 0x7f4543a47550>
> /opt/conda/lib/python3.8/site-packages/torch/_dynamo/variables/builder.py(920)wrap_fx_proxy_cls()
-> raise AssertionError(
(Pdb) p example_value
<torch.cuda.Stream device=cuda:0 cuda_stream=0x564ab863ceb0>
(Pdb) proxy.node.target
<function _get_stream at 0x7f8be761c550>
(Pdb) proxy.node.op
'call_function'

Error logs

No response

Minified repro

No response

yanboliang commented 1 year ago

The forward of my model contains an explicit CUDA memory copy, which is why it needs to retrieve a CUDA stream.

Your code indirectly calls ops which don't return a Tensor. I don't think it's hard to support these functions; I can take a look if you have a repro.

kehuanfeng commented 1 year ago

@yanboliang Thanks for the reply. You can use the following minimal repro. Additionally, there should be a similar request for torch.cuda.device.

import torch
import torch.nn as nn
import torch._dynamo as dynamo
# _get_stream is the function dynamo traces into (see the traceback above);
# it is not called directly in this script
from torch.nn.parallel._functions import _get_stream

class CopyModel(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, input, device):
        # Create a stream on the target device and run the copy on it
        self.copy_stream = torch.cuda.Stream(device)
        with torch.cuda.stream(self.copy_stream):
            return input.cuda()

model = CopyModel()
output = model(torch.rand(1000), 1)  # device index 1 assumes at least two GPUs
print(output.get_device())

opt_model = dynamo.optimize('eager')(model)
output = opt_model(torch.rand(1000), 1)
print(output.get_device())

kehuanfeng commented 1 year ago

Closing this as it works with the PyTorch nightly build.
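
For anyone verifying the fix on a recent build, a minimal sketch of the same pattern under torch.compile (the PyTorch >= 2.0 entry point that wraps dynamo) might look like the following; it assumes a CUDA-capable build and skips gracefully when no GPU is present:

```python
import torch

def copy_with_stream(x):
    # The pattern from the repro above: create a CUDA stream and run the
    # host-to-device copy inside it
    s = torch.cuda.Stream()
    with torch.cuda.stream(s):
        return x.cuda()

if torch.cuda.is_available():
    # On builds with the fix, compiling this no longer raises
    # "torch.* op returned non-Tensor Stream"
    compiled = torch.compile(copy_with_stream)
    out = compiled(torch.rand(1000))
    print(out.get_device())
else:
    print("no CUDA device available; skipping")
```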