artyom-beilis / pytorch_dlprim

DLPrimitives/OpenCL out of tree backend for pytorch
http://blog.dlprimitives.org/
MIT License

torch.embedding NotImplementedError: Could not run 'aten::index_select' with arguments from the 'ocl' backend. #17

Open zougloub opened 1 year ago

zougloub commented 1 year ago

Every now and then, I'm checking https://github.com/pytorch/pytorch/issues/488, usually in despair.

But then I saw @artyom-beilis's pytorch_dlprim, which hooks into the privateuseone backend to FINALLY provide support for generic OpenCL hardware.

As I was trying it out on a model, I got this error:

torch/nn/functional.py:2210 in embedding
2207 │   │   #   torch.embedding_renorm_
2208 │   │   # remove once script supports set_grad_enabled
2209 │   │   _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
❱ 2210 │   return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)

NotImplementedError: Could not run 'aten::index_select' with arguments from the 'ocl' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build).

artyom-beilis commented 1 year ago

It is likely one of the operators that is not implemented yet.

I haven't gotten into the NLP area yet. From what I remember, embedding is an initial lightweight stage and is usually not backpropagated? Am I right?

Can you try to run this part on CPU?
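A minimal sketch of what running only the embedding on CPU could look like, assuming a small hypothetical model (`CpuEmbeddingModel`, `embed`, `head` are illustrative names, not part of pytorch_dlprim). The device defaults to "cpu" so the snippet runs anywhere; with the backend loaded one would presumably pass the OpenCL device string instead (for example "ocl:0", which is an assumption here).

```python
import torch
import torch.nn as nn


class CpuEmbeddingModel(nn.Module):
    """Hypothetical model: embedding lookup on CPU, remaining layers on `device`."""

    def __init__(self, vocab_size, dim, num_classes, device="cpu"):
        super().__init__()
        self.device = device
        # The embedding stays on CPU, so aten::index_select is never
        # dispatched to the other backend.
        self.embed = nn.Embedding(vocab_size, dim)
        # Only the rest of the network is moved to the target device.
        self.head = nn.Linear(dim, num_classes).to(device)

    def forward(self, token_ids):
        vectors = self.embed(token_ids.cpu())  # lookup on CPU
        vectors = vectors.to(self.device)      # copy activations to the device
        return self.head(vectors)


model = CpuEmbeddingModel(vocab_size=10, dim=8, num_classes=4)
logits = model(torch.tensor([[1, 2, 3]]))
print(logits.shape)
```

Note that calling `model.to(device)` on the whole module would move the embedding as well, so in this sketch only `head` is moved explicitly.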

zougloub commented 1 year ago

I can run it on CPU, yeah.

index_select is a conceptually "easy" operation but implementing it efficiently is not (cf. aten/src/ATen/native/cuda/Indexing.cu / index_select_out_cuda_impl()).
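To illustrate the "conceptually easy" part, here is a plain-Python reference sketch of the forward semantics for a 2-D weight table selected along dimension 0. It is not PyTorch's implementation, and the function names are hypothetical. What it leaves out is exactly the hard part: efficient parallel memory access on the device, and a backward pass that has to accumulate gradients over repeated indices.

```python
def index_select_rows(weight, indices):
    """Reference semantics of index_select along dim 0 for a list of rows.

    Returns a new list whose k-th row is a copy of weight[indices[k]].
    """
    num_rows = len(weight)
    out = []
    for i in indices:
        if not 0 <= i < num_rows:
            raise IndexError(f"index {i} out of range for {num_rows} rows")
        out.append(list(weight[i]))  # copy the selected row
    return out


def embedding_lookup(weight, token_ids):
    """Embedding forward for a batch: one row selection per sequence."""
    return [index_select_rows(weight, seq) for seq in token_ids]


weight = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]]
print(index_select_rows(weight, [2, 0, 2]))
print(embedding_lookup(weight, [[0, 1], [2, 2]]))
```

Repeated indices simply produce repeated rows in the output, which is why the forward pass is a pure gather.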

artyom-beilis commented 1 year ago

> index_select is a conceptually "easy" operation but implementing it efficiently is not (cf. aten/src/ATen/native/cuda/Indexing.cu / index_select_out_cuda_impl()).

I'll take a look at it.

Can you give a simple example of how you use it? I'm just not that familiar with it yet :-)

Can