pytorch / FBGEMM

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

Potential solution to RuntimeError: No such operator fbgemm::jagged_2d_to_dense #3168

Open Jary-lrj opened 1 week ago

Jary-lrj commented 1 week ago

My envs BEFORE:

My envs AFTER:

PS: With fbgemm_gpu 0.8.0 I get a different error: AttributeError: '_OpNamespace' 'fbgemm' object has no attribute 'merge_pooled_embeddings'. I don't know why the later version raises it.

Hint: If you hit similar errors, check your setup in the following order:

(1) GPU device: My NVIDIA RTX 4090 does not work, even with the same configuration as in "envs AFTER". It seems that only V and A devices work.

(2) pytorch and cuda: If possible, try running fbgemm in a conda virtual environment instead of docker or bare Linux. CUDA 11.8 or 12.1 is recommended. Use torch 2.3.0 or later, not 2.1.0. As for libnvidia_ml.so and libtorch.so, they are installed with torch whether you use pip or conda.

(3) version: Try fbgemm_gpu 0.7.0 rather than 0.8.0.
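The checks above can be gathered in one place with a small script. The following is only a sketch, not an official FBGEMM diagnostic: the helper names (`pkg_version`, `has_fbgemm_op`, `collect_env`) are made up for illustration, and it assumes the package is installed under the distribution name `fbgemm-gpu` (a nightly or variant build may use a different name). It reports the torch and fbgemm_gpu versions, the CUDA version torch was built against, the GPU name, and whether the two operators mentioned in this issue resolve under `torch.ops.fbgemm`.

```python
import importlib.metadata as md


def pkg_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return md.version(name)
    except md.PackageNotFoundError:
        return None


def has_fbgemm_op(op_name):
    """Return True if torch.ops.fbgemm.<op_name> resolves, else False."""
    try:
        import torch
        import fbgemm_gpu  # noqa: F401  (importing registers the operators)

        # A missing operator raises AttributeError or RuntimeError,
        # depending on the torch version.
        getattr(torch.ops.fbgemm, op_name)
        return True
    except Exception:
        return False


def collect_env(ops=("jagged_2d_to_dense", "merge_pooled_embeddings")):
    """Collect the version and device facts listed in the hint."""
    report = {
        "torch": pkg_version("torch"),
        # Assumption: the distribution is named "fbgemm-gpu".
        "fbgemm_gpu": pkg_version("fbgemm-gpu"),
        "cuda": None,
        "gpu": None,
    }
    try:
        import torch

        report["cuda"] = torch.version.cuda  # CUDA version torch was built with
        if torch.cuda.is_available():
            report["gpu"] = torch.cuda.get_device_name(0)
    except Exception:
        pass  # torch is missing or failed to load; leave the fields as None
    report["ops"] = {name: has_fbgemm_op(name) for name in ops}
    return report


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```

Running it before and after changing the environment makes it easier to see which of the three items (device, torch/CUDA, fbgemm_gpu version) actually changed. An operator reported as `False` means either that fbgemm_gpu failed to import or that the operator was not registered in that build.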

q10 commented 1 week ago

Hi @Jary-lrj, as of the time of writing, fbgemm_gpu 0.7.0 is old and no longer supported. Please consider switching to 0.8.0.