facebookresearch / dlrm

An implementation of a deep learning recommendation model (DLRM)
MIT License
3.71k stars 825 forks

Embedding_bag operator on GPU #357

Open rishucoding opened 1 year ago

rishucoding commented 1 year ago

Hello,

Nvidia's MLPerf submissions suggest using the TensorRT framework for performant inference deployment. For DLRM (deep-learning-based recommendation systems) inference on GPU, I have the following questions:

Please let me know your comments. Thanks

samiwilf commented 12 months ago

Hi @rishucoding.

TensorRT uses its own CUDA kernels and mainly uses ONNX to import models. It doesn't use PyTorch.

It appears that TensorRT currently lacks an embedding bag operator: it shows up in neither TensorRT's operator table nor ONNX's. The lack of embedding bag support in ONNX has been raised previously, both as an issue in this repo and as an issue in ONNX's repo.

When TensorRT encounters an unsupported operator, it doesn't automatically fall back to an implementation from another source such as PyTorch. Instead, one must resort to workarounds like manually reimplementing the unsupported operation in terms of operations that TensorRT does support.
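As a rough illustration of that kind of workaround: an embedding bag with sum pooling can be decomposed into a gather followed by a per-bag reduction, both of which are expressible with operators TensorRT does support (Gather, ReduceSum). Here is a minimal NumPy sketch of the decomposition — the table shape and offsets format mirror PyTorch's `nn.EmbeddingBag`, but this is illustrative, not TensorRT code:

```python
import numpy as np

def embedding_bag_sum(table, indices, offsets):
    """Sum-pooled embedding bag expressed as gather + per-bag reduce.

    table:   (num_embeddings, dim) weight matrix
    indices: flat 1-D array of row indices for all bags concatenated
    offsets: start position of each bag within `indices`
    """
    gathered = table[indices]                       # Gather
    bags = np.split(gathered, offsets[1:])          # slice out each bag
    return np.stack([b.sum(axis=0) for b in bags])  # ReduceSum per bag

# Tiny example: 2 bags over a 4 x 3 embedding table
table = np.arange(12, dtype=np.float32).reshape(4, 3)
indices = np.array([0, 2, 1, 3])
offsets = np.array([0, 2])  # bag 0 -> rows [0, 2], bag 1 -> rows [1, 3]
out = embedding_bag_sum(table, indices, offsets)
# out[0] sums rows 0 and 2; out[1] sums rows 1 and 3
```

In practice one would express the same gather/reduce pattern directly with TensorRT layers (or, for variable-length bags, pad to a fixed bag size so the reduction stays a static-shape op), but the arithmetic is exactly this.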

It may be easier to use TensorRT for just the two MLP components of DLRM, as shown here, than for the entire model.
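For context on why the MLP-only route is easier: DLRM's bottom and top MLPs are plain stacks of matrix multiply, bias add, and ReLU — all operators that TensorRT's table covers — so those blocks export cleanly through ONNX. A NumPy sketch of such an MLP forward pass (the layer sizes here are made up for illustration, not DLRM's actual configuration):

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """MLP forward pass built only from MatMul, Add, and Relu —
    operators TensorRT supports natively, so a block like this
    needs no custom plugins when imported via ONNX."""
    for i, (w, b) in enumerate(zip(weights, biases)):
        x = x @ w + b                 # MatMul + bias Add
        if i < len(weights) - 1:
            x = np.maximum(x, 0.0)    # Relu on hidden layers
    return x

# Hypothetical layer sizes (13 -> 64 -> 32 -> 1), batch of 8
rng = np.random.default_rng(0)
sizes = [13, 64, 32, 1]
weights = [rng.standard_normal((m, n)).astype(np.float32)
           for m, n in zip(sizes, sizes[1:])]
biases = [np.zeros(n, dtype=np.float32) for n in sizes[1:]]
out = mlp_forward(rng.standard_normal((8, 13)).astype(np.float32),
                  weights, biases)
```

The embedding lookups and feature interaction would then stay outside TensorRT (e.g. in PyTorch), with only the dense MLP portions handed to the TensorRT engine.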