pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
https://pytorch.org

No documentation to show how to implement aten::view for custom backend #99143

Open ghostplant opened 1 year ago

ghostplant commented 1 year ago

📚 The doc issue

The original code is:

  import torch

  x = torch.empty([1024], device='privateuseone:0')
  y = x.view([2, -1])  # raises an error because aten::view is missing

Then I get the following error:

NotImplementedError: Could not run 'aten::view' with arguments from the 'PrivateUse1' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::view' is only available for ..

Based on some interface declarations in the PyTorch source code, I wrote the extension like this:

static at::Tensor __view(c10::DispatchKeySet ks, const at::Tensor & self, c10::SymIntArrayRef size) {
  return at::_ops::view::redispatch(ks, self, size);
}
TORCH_LIBRARY_IMPL(aten, Antares, m) {
  m.impl("view", __view);
}

However, this results in infinite recursion of the function and ends with a stack overflow. I don't think x.view([2, -1]) should really require the user to define its implementation. If this definition is required, what documentation can I refer to in order to implement it correctly?

Suggest a potential alternative/fix

A documented example of how to implement a custom aten::view, or any simpler solution to the reshape problem above.

cc @malfet @zou3519 @svekars @carljparker

ttrouwen-dmatrix commented 5 months ago

Did you manage to implement this?

Would be great to have an example of this in open_registration_extension.cpp

ttrouwen-dmatrix commented 5 months ago

The DLPrimitives for PyTorch repo has an example in its tensor_ops.cpp file. It seems the implementation could be identical for the example in open_registration_extension.cpp (it just creates an alias and contains some logic to infer the size if -1 is given).
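To illustrate the size-inference step mentioned above, here is a minimal pure-Python sketch of how a single -1 in the requested shape might be resolved. The function name infer_view_size is hypothetical and is not taken from PyTorch or DLPrimitives; a real backend kernel would do the equivalent in C++ and then create an alias of the input tensor with the resolved shape.

```python
from math import prod
from typing import List, Sequence


def infer_view_size(requested: Sequence[int], numel: int) -> List[int]:
    """Resolve at most one -1 in `requested` so the shape holds `numel` elements.

    Hypothetical helper for illustration only; not a PyTorch API.
    """
    if any(d < -1 for d in requested):
        raise ValueError(f"invalid dimension in shape {list(requested)}")

    unknown = [i for i, d in enumerate(requested) if d == -1]
    if len(unknown) > 1:
        raise ValueError("only one dimension can be inferred")

    # Product of the explicitly given dimensions (1 if there are none).
    known = prod(d for d in requested if d != -1)
    shape = list(requested)

    if unknown:
        # The inferred dimension must divide the element count exactly.
        if known == 0 or numel % known != 0:
            raise ValueError(
                f"shape {list(requested)} is invalid for input of size {numel}"
            )
        shape[unknown[0]] = numel // known
    elif known != numel:
        raise ValueError(
            f"shape {list(requested)} is invalid for input of size {numel}"
        )
    return shape
```

For the case in the original report, infer_view_size([2, -1], 1024) resolves to [2, 512]. Computing strides and aliasing the storage are separate steps that this sketch does not cover.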

If you want to implement the view operation so that you can print tensors (like me) and it cannot fall back to CPU, you will run into another issue: a _copy_from operation will fail because the src and dst vectors have different sizes. It seems it will require a bit of work to get that working.

ghostplant commented 5 months ago

Thanks, I starred it!