facebookresearch / FBTT-Embedding

This is a Tensor Train based compression library for the sparse embedding tables used in large-scale machine learning models such as recommendation and natural language processing. We showed that this library can reduce the total model size by up to 100x in Facebook's open-sourced DLRM model while achieving the same model quality. Our implementation is faster than state-of-the-art implementations. Existing state-of-the-art libraries also decompress the whole embedding table on the fly, so they provide no memory reduction during training. Our library decompresses only the requested rows and can therefore reduce the memory footprint per embedding table by up to 10,000x. The library also includes a software cache that stores a portion of the table entries in decompressed format for faster lookup and processing.
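To make the row-wise decompression idea concrete, here is a small illustrative sketch (not the library's actual API or kernels; all shapes, names, and ranks are made up) of how a single embedding row can be reconstructed on demand from three Tensor Train cores:

#include <cstdio>
#include <vector>

// Illustrative 3-core Tensor Train embedding: a table with
// N = N1*N2*N3 rows and D = D1*D2*D3 columns is stored as three
// small cores instead of one large N x D matrix. Only the row
// that is actually looked up is ever materialized.
constexpr int N1 = 8, N2 = 8, N3 = 8;  // 512 "rows"
constexpr int D1 = 4, D2 = 4, D3 = 4;  // 64-dim embeddings
constexpr int R1 = 2, R2 = 2;          // TT ranks (hypothetical)

// Core shapes: G1[N1][D1][R1], G2[R1][N2][D2][R2], G3[R2][N3][D3]
std::vector<float> G1(N1 * D1 * R1, 0.01f);
std::vector<float> G2(R1 * N2 * D2 * R2, 0.01f);
std::vector<float> G3(R2 * N3 * D3, 0.01f);

// Reconstruct one row of the virtual N x D table on demand.
void lookup_row(int row, float* out /* size D1*D2*D3 */) {
  // Mixed-radix decomposition of the row index.
  const int i3 = row % N3;
  const int i2 = (row / N3) % N2;
  const int i1 = row / (N2 * N3);
  for (int j1 = 0; j1 < D1; ++j1)
    for (int j2 = 0; j2 < D2; ++j2)
      for (int j3 = 0; j3 < D3; ++j3) {
        float acc = 0.f;
        for (int a = 0; a < R1; ++a)
          for (int b = 0; b < R2; ++b)
            acc += G1[(i1 * D1 + j1) * R1 + a] *
                   G2[((a * N2 + i2) * D2 + j2) * R2 + b] *
                   G3[(b * N3 + i3) * D3 + j3];
        out[(j1 * D2 + j2) * D3 + j3] = acc;
      }
}

int main() {
  std::vector<float> row(D1 * D2 * D3);
  lookup_row(123, row.data());
  std::printf("row[0] = %f\n", row[0]);
}

In this toy setup the three cores hold 64 + 128 + 64 = 256 values in place of a 512 x 64 = 32,768-value dense table, and a lookup only ever materializes the requested 64-dim row.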
MIT License

Installation failure with Torch 1.7 #22

Open chongxiaoc opened 2 years ago

chongxiaoc commented 2 years ago

It looks like torch 1.7 puts the header file ATen/cuda/CUDAGeneratorImpl.h in a different location. Error:

[root@~/FBTT-Embedding #]/usr/local/cuda/bin/nvcc -I/usr/lib/python3.6/site-packages/torch/include -I/usr/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/usr/lib/python3.6/site-packages/torch/include/TH -I/usr/lib/python3.6/site-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.6m -c tt_embeddings_cuda.cu -o build/temp.linux-x86_64-3.6/tt_embeddings_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -O3 -g --expt-relaxed-constexpr -D__CUDA_NO_HALF_OPERATORS__ -I/usr/lib/cub-1.8.0 -gencode=arch=compute_70,code="sm_70" -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=tt_embeddings -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
tt_embeddings_cuda.cu:13:41: fatal error: ATen/cuda/CUDAGeneratorImpl.h: No such file or directory
 #include <ATen/cuda/CUDAGeneratorImpl.h>
                                         ^
compilation terminated.
error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1

If I change the include path as below:

-#include <ATen/cuda/CUDAGeneratorImpl.h>
+#include <ATen/CUDAGeneratorImpl.h>

Installation completes.
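If both header locations ever need to be supported at once, one possible approach (an untested sketch, assuming nvcc's preprocessor honors __has_include) would be to pick whichever header exists at build time instead of hard-coding one path:

#if __has_include(<ATen/cuda/CUDAGeneratorImpl.h>)
#include <ATen/cuda/CUDAGeneratorImpl.h>  // newer layout
#else
#include <ATen/CUDAGeneratorImpl.h>       // layout shipped with torch 1.7
#endif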

bilgeacun commented 2 years ago

Thanks for the fix. Do you mind sending a pull request with the change?

chongxiaoc commented 2 years ago

@bilgeacun I would like to. But will this break the build on Torch 1.6? I assume you have tested torch 1.6 with the existing code.

bilgeacun commented 2 years ago

We can bump the supported version to 1.7. Do you see any other problems with 1.7?

chongxiaoc commented 2 years ago

I ran the benchmark on torch 1.7 and CUDA 11.2, and it works so far. Let me create a PR then.

chongxiaoc commented 2 years ago

PR drafted: https://github.com/facebookresearch/FBTT-Embedding/pull/23

bilgeacun commented 2 years ago

Let me test DLRM to make sure that works as well.

bilgeacun commented 2 years ago

Actually, I realized the move of CUDAGeneratorImpl.h to ATen/cuda happened recently to support PT 1.11, so your change would revert that. See: https://github.com/pytorch/pytorch/pull/70650

Can you try PT 1.11?

chongxiaoc commented 2 years ago

I see. But our production system doesn't support PT 1.11 yet, so I'll close that PR. Thanks for the clarification.