Potential solution to RuntimeError: No such operator fbgemm::jagged_2d_to_dense

My environment BEFORE (the error occurs):

in conda virtual env
NVIDIA Tesla V100
python: 3.10
torch 2.1.0 + cu118
cudnn 8.7.0
fbgemm_gpu 0.7.0

My environment AFTER (works for me):

in conda virtual env
NVIDIA Tesla V100
python: 3.10 -> 3.12
torch 2.1.0 + cu118 -> torch 2.3.0 + cu118 (the key change)
cudnn 8.7.0
fbgemm_gpu 0.7.0

PS: With fbgemm_gpu 0.8.0 I get a different error: AttributeError: '_OpNamespace' 'fbgemm' object has no attribute 'merge_pooled_embeddings'. I don't know why the later version fails this way.
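Both errors above mean that an operator was not registered under torch.ops.fbgemm. The following is a minimal sketch for checking this directly in your own environment. The helper names op_available and report are made up for illustration. It assumes that importing fbgemm_gpu is what registers the operators, and it treats both AttributeError and RuntimeError as "missing", since both show up in the errors above.

```python
def op_available(namespace, name):
    """Return True if `name` can be looked up on `namespace`.

    A missing operator shows up as RuntimeError ("No such operator ...")
    or AttributeError, as in the two errors above, so both are caught.
    """
    try:
        getattr(namespace, name)
        return True
    except (AttributeError, RuntimeError):
        return False


def report(op_names=("jagged_2d_to_dense", "merge_pooled_embeddings")):
    """Print whether each fbgemm operator is visible in this environment."""
    try:
        import torch
        import fbgemm_gpu  # noqa: F401  (assumption: this import registers the ops)
    except (ImportError, OSError) as exc:
        print("could not import torch / fbgemm_gpu:", exc)
        return
    print("torch:", torch.__version__, "| CUDA build:", torch.version.cuda)
    if torch.cuda.is_available():
        print("GPU:", torch.cuda.get_device_name(0))
    for name in op_names:
        status = "found" if op_available(torch.ops.fbgemm, name) else "MISSING"
        print(f"fbgemm::{name}: {status}")


if __name__ == "__main__":
    report()
```

If an operator is reported as MISSING, the installed fbgemm_gpu build probably does not match your torch / CUDA / GPU combination.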

Hint: If you hit similar errors, check your configuration in the following order:

(1) GPU device: My NVIDIA RTX 4090 does not work with the same configuration as in AFTER. It seems that only V-series and A-series devices work.

(2) PyTorch and CUDA: If possible, try running fbgemm in a conda virtual environment instead of docker or bare Linux. CUDA 11.8 and 12.1 are recommended. Use torch 2.3.0 or later, NOT 2.1.0. As for libnvidia_ml.so and libtorch.so, they are installed with torch whether you use pip or conda.

(3) fbgemm_gpu version: Try 0.7.0, not 0.8.0.
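The version checks in the hint can be sketched as a small script. This is only an illustration of my own findings (torch 2.3.0 or later, fbgemm_gpu 0.7.0 rather than 0.8.0), not an official compatibility table. The helper names are made up, and the distribution names "torch" and "fbgemm-gpu" are assumptions about how the packages were installed.

```python
import importlib.metadata as md
import re


def parse_version(version):
    """Leading numeric part of a version string, e.g. '2.3.0+cu118' -> (2, 3, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) if part else 0 for part in match.groups())


def check_config(torch_version, fbgemm_version):
    """Return warnings based on the versions that worked or failed for me."""
    warnings = []
    if parse_version(torch_version) < (2, 3, 0):
        warnings.append("torch is older than 2.3.0; 2.1.0 gave the jagged_2d_to_dense error")
    if parse_version(fbgemm_version)[:2] >= (0, 8):
        warnings.append("fbgemm_gpu 0.8.0 gave the merge_pooled_embeddings error; try 0.7.0")
    return warnings


def installed_version(dist_name):
    """Installed version of a distribution, or None if it is not installed."""
    try:
        return md.version(dist_name)
    except md.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumed distribution names; adjust if you installed a differently named build.
    torch_v = installed_version("torch")
    fbgemm_v = installed_version("fbgemm-gpu")
    print("torch:", torch_v, "| fbgemm_gpu:", fbgemm_v)
    if torch_v and fbgemm_v:
        for warning in check_config(torch_v, fbgemm_v):
            print("warning:", warning)
```

For example, check_config("2.1.0+cu118", "0.7.0") flags only the torch version, which matches my BEFORE environment.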

pytorch/FBGEMM issue #3168