artyom-beilis / pytorch_dlprim

DLPrimitives/OpenCL out of tree backend for pytorch
http://blog.dlprimitives.org/
MIT License

Operators not supported #38

Closed: VirxEC closed this issue 8 months ago

VirxEC commented 1 year ago

When I run my test model (PyTorch 1.13.0 + SB3) I get this error:

~/Documents/ai_test/venv/lib/python3.10/site-packages/torch/distributions/categorical.py:62: UserWarning: The operator 'aten::amax.out' is not currently supported on the ocl backend. Please open an issue at for requesting support https://github.com/artyom-beilis/pytorch_dlprim/issues (Triggered internally at ~/Documents/pytorch_dlprim/src/tensor_ops.cpp:302.)
  self.logits = logits - logits.logsumexp(dim=-1, keepdim=True)
~/Documents/ai_test/venv/lib/python3.10/site-packages/torch/distributions/categorical.py:62: UserWarning: The operator 'aten::_copy_from_and_resize' is not currently supported on the ocl backend. Please open an issue at for requesting support https://github.com/artyom-beilis/pytorch_dlprim/issues (Triggered internally at ~/Documents/pytorch_dlprim/src/tensor_ops.cpp:302.)
  self.logits = logits - logits.logsumexp(dim=-1, keepdim=True)

Most notably, the operators aten::amax.out and aten::_copy_from_and_resize aren't supported.

artyom-beilis commented 1 year ago

aten::amax.out

This, I think, can be implemented easily. I suggest looking at other examples in the code, like max; a rough sketch of what the registration might look like is below.

aten::_copy_from_and_resize is trickier
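
For illustration only, here is a minimal sketch of how aten::amax.out could be wired up for the PrivateUse1 dispatch key. It expresses amax as repeated single-dim max reductions, so it relies only on the already-supported max; a real implementation would call a dedicated dlprimitives reduction kernel instead, and the ptdlprim namespace and registration placement are assumptions, not the project's actual layout.

```cpp
#include <ATen/ATen.h>
#include <torch/library.h>
#include <algorithm>
#include <functional>
#include <vector>

namespace ptdlprim { // assumed namespace; match the project's actual one

// amax.out(Tensor self, int[1] dim=[], bool keepdim=False, *, Tensor(a!) out)
at::Tensor & amax_out(const at::Tensor & self,
                      at::IntArrayRef dim,
                      bool keepdim,
                      at::Tensor & out)
{
    // Normalize negative dims; an empty dim list means "reduce over all dims".
    std::vector<int64_t> dims;
    for (int64_t d : dim)
        dims.push_back(c10::maybe_wrap_dim(d, self.dim()));
    if (dims.empty())
        for (int64_t d = 0; d < self.dim(); ++d)
            dims.push_back(d);

    // Reduce the highest dims first so the remaining indices stay valid
    // when keepdim == false. Each step reuses the existing max kernel.
    std::sort(dims.begin(), dims.end(), std::greater<int64_t>());
    at::Tensor result = self;
    for (int64_t d : dims)
        result = std::get<0>(result.max(d, keepdim));

    out.resize_(result.sizes());
    out.copy_(result);
    return out;
}

// Hook the kernel into the dispatcher under this backend's key.
TORCH_LIBRARY_IMPL(aten, PrivateUse1, m) {
    m.impl("amax.out", &amax_out);
}

} // namespace ptdlprim
```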

VirxEC commented 1 year ago

Cool! I might get around to it, but I'm very unfamiliar with this project and its structure, so it will take a while before I have anything working/presentable.

VirxEC commented 1 year ago

I should also mention this error message:

NotImplementedError: Could not run 'aten::_copy_from_and_resize' with arguments from the 'CPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::_copy_from_and_resize' is only available for these backends: [PrivateUse1, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMeta, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradNestedTensor, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PythonDispatcher].

smilealvin92 commented 8 months ago

Following the implementation in the MPS backend in the PyTorch source code, we can adapt that for pytorch_dlprim, adding support for the aten::_copy_from_and_resize operator; see the sketch below.
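
As a rough illustration of that idea, here is a minimal sketch assuming the MPS-style approach: resize the destination to the source's shape, then reuse the backend's existing copy path. The ptdlprim namespace and registration site are assumptions about the project layout, not its actual code.

```cpp
#include <ATen/ATen.h>
#include <torch/library.h>

namespace ptdlprim { // assumed namespace; match the project's actual one

// _copy_from_and_resize(Tensor self, Tensor dst) -> Tensor
// Mirrors the MPS approach: make dst's shape match self, then copy.
at::Tensor _copy_from_and_resize(const at::Tensor & self, const at::Tensor & dst)
{
    auto & dst_mut = const_cast<at::Tensor &>(dst);
    if (dst.sizes() != self.sizes())
        dst_mut.resize_(self.sizes());
    // copy_ re-enters the dispatcher and lands in the backend's existing
    // copy implementation (the same path a plain _copy_from uses).
    dst_mut.copy_(self);
    return dst;
}

TORCH_LIBRARY_IMPL(aten, PrivateUse1, m) {
    m.impl("_copy_from_and_resize", &_copy_from_and_resize);
}

} // namespace ptdlprim
```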

artyom-beilis commented 8 months ago

Fixed in be908fd875410df72909b14727ca23a8371b42cf

leviathanch commented 8 months ago

It seems you didn't push that, because I can't find the commit when pulling

artyom-beilis commented 8 months ago

Ohhh, you are right. Pushed: https://github.com/artyom-beilis/pytorch_dlprim/commit/be908fd875410df72909b14727ca23a8371b42cf

smilealvin92 commented 7 months ago

In torch 1.13 there was no problem like "some ops fall back to CPU and CPU does not support such ops"; in torch 2.1 I encountered such problems. It seems we support torch 2.1 better now. I still don't know why so many new ops show up in torch 2.1.
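
For context on the fallback mechanism: PyTorch lets an out-of-tree backend register a single boxed CPU fallback that catches every aten op without a native kernel, which is why ops that torch 2.1 dispatches differently surface as CPU round-trips (and then fail if the op has no CPU kernel, as with aten::_copy_from_and_resize above). A minimal sketch of that registration follows; ocl_cpu_fallback is a stand-in name, not necessarily how pytorch_dlprim actually wires it up.

```cpp
#include <ATen/native/CPUFallback.h>
#include <torch/library.h>

// Boxed catch-all: any aten op with no PrivateUse1 kernel gets its inputs
// copied to CPU, runs there, and has results copied back to the device.
static void ocl_cpu_fallback(const c10::OperatorHandle & op,
                             torch::jit::Stack * stack)
{
    at::native::cpu_fallback(op, stack);
}

// "_" registers the fallback for every operator under this dispatch key.
TORCH_LIBRARY_IMPL(_, PrivateUse1, m) {
    m.fallback(torch::CppFunction::makeFromBoxedFunction<&ocl_cpu_fallback>());
}
```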