AnswerDotAI / RAGatouille

Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.
Apache License 2.0

Stuck on "Loading packbits_cpp extension" #245

Closed TimKoornstra closed 3 months ago

TimKoornstra commented 3 months ago

My RAG.search and RAG.index calls are seemingly stuck on loading the packbits_cpp extension. I had it working before, both with the same index and when creating new ones, but then I think my NVIDIA driver updated, and now it's stuck here. If I interrupt my kernel, I see that Torch's file baton is blocking:

File ~/miniconda3/envs/llama/lib/python3.11/site-packages/colbert/indexing/codecs/residual.py:118, in ResidualCodec.try_load_torch_extensions(cls, use_gpu)
    115 cls.decompress_residuals = decompress_residuals_cpp.decompress_residuals_cpp
    117 print_message(f"Loading packbits_cpp extension (set COLBERT_LOAD_TORCH_EXTENSION_VERBOSE=True for more info)...")
--> 118 packbits_cpp = load(
    119     name="packbits_cpp",
    120     sources=[
    121         os.path.join(
    122             pathlib.Path(__file__).parent.resolve(), "packbits.cpp"
    123         ),
    124         os.path.join(
    125             pathlib.Path(__file__).parent.resolve(), "packbits.cu"
    126         ),
    127     ],
    128     verbose=os.getenv("COLBERT_LOAD_TORCH_EXTENSION_VERBOSE", "False") == "True",
    129 )
    130 cls.packbits = packbits_cpp.packbits_cpp
    132 cls.loaded_extensions = True

File ~/miniconda3/envs/llama/lib/python3.11/site-packages/torch/utils/cpp_extension.py:1312, in load(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module, is_standalone, keep_intermediates)
   1220 def load(name,
   1221          sources: Union[str, List[str]],
   1222          extra_cflags=None,
   (...)
   1230          is_standalone=False,
   1231          keep_intermediates=True):
   1232     """
   1233     Load a PyTorch C++ extension just-in-time (JIT).
   1234 
   (...)
   1310         ...     verbose=True)
   1311     """
-> 1312     return _jit_compile(
   1313         name,
   1314         [sources] if isinstance(sources, str) else sources,
   1315         extra_cflags,
   1316         extra_cuda_cflags,
   1317         extra_ldflags,
   1318         extra_include_paths,
   1319         build_directory or _get_build_directory(name, verbose),
   1320         verbose,
   1321         with_cuda,
   1322         is_python_module,
   1323         is_standalone,
   1324         keep_intermediates=keep_intermediates)

File ~/miniconda3/envs/llama/lib/python3.11/site-packages/torch/utils/cpp_extension.py:1739, in _jit_compile(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module, is_standalone, keep_intermediates)
   1737         baton.release()
   1738 else:
-> 1739     baton.wait()
   1741 if verbose:
   1742     print(f'Loading extension module {name}...', file=sys.stderr)

File ~/miniconda3/envs/llama/lib/python3.11/site-packages/torch/utils/file_baton.py:43, in FileBaton.wait(self)
     36 """
     37 Periodically sleeps for a certain amount until the baton is released.
     38 
     39 The amount of time slept depends on the ``wait_seconds`` parameter
     40 passed to the constructor.
     41 """
     42 while os.path.exists(self.lock_file_path):
---> 43     time.sleep(self.wait_seconds)
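
Looking at file_baton.py, wait() just polls for the lock file's existence and sleeps, so a lock left behind by a crashed or interrupted build would block forever. A minimal sketch of that polling logic (the lock path below is illustrative, not necessarily the exact one on my machine):

```python
import os
import time

# Simplified form of torch.utils.file_baton.FileBaton.wait(): it only
# checks whether the lock file still exists and sleeps. If the process
# that created the lock died before releasing it, nothing ever deletes
# the file and this loop never terminates.
lock_file_path = os.path.expanduser(
    "~/.cache/torch_extensions/py311_cu121/packbits_cpp/lock"  # illustrative
)
while os.path.exists(lock_file_path):
    time.sleep(0.1)
```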

My pip freeze (after a clean install):

aiohappyeyeballs==2.4.0
aiohttp==3.10.5
aiosignal==1.3.1
annotated-types==0.7.0
anyio==4.4.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asttokens==2.4.1
async-lru==2.0.4
attrs==24.2.0
babel==2.16.0
beautifulsoup4==4.12.3
bitarray==2.9.2
bleach==6.1.0
blinker==1.8.2
catalogue==2.0.10
certifi==2024.7.4
cffi==1.17.0
charset-normalizer==3.3.2
click==8.1.7
colbert-ai==0.2.19
comm==0.2.2
dataclasses-json==0.6.7
datasets==2.21.0
debugpy==1.8.5
decorator==5.1.1
defusedxml==0.7.1
Deprecated==1.2.14
dill==0.3.8
dirtyjson==1.0.8
distro==1.9.0
einops==0.8.0
executing==2.0.1
faiss==1.8.0
faiss-cpu==1.8.0.post1
fast-pytorch-kmeans==0.2.0.1
fastjsonschema==2.20.0
filelock==3.15.4
flash-attn==2.6.3
Flask==3.0.3
fqdn==1.5.1
frozenlist==1.4.1
fsspec==2024.6.1
git-python==1.0.3
gitdb==4.0.11
GitPython==3.1.43
greenlet==3.0.3
h11==0.14.0
httpcore==1.0.5
httpx==0.27.0
huggingface-hub==0.24.6
idna==3.7
ipykernel==6.29.5
ipython==8.26.0
ipywidgets==8.1.3
isoduration==20.11.0
itsdangerous==2.2.0
jedi==0.19.1
Jinja2==3.1.4
jiter==0.5.0
joblib==1.4.2
json5==0.9.25
jsonpatch==1.33
jsonpointer==3.0.0
jsonschema==4.23.0
jsonschema-specifications==2023.12.1
jupyter==1.0.0
jupyter-console==6.6.3
jupyter-events==0.10.0
jupyter-lsp==2.2.5
jupyter_client==8.6.2
jupyter_core==5.7.2
jupyter_server==2.14.2
jupyter_server_terminals==0.5.3
jupyterlab==4.2.4
jupyterlab_pygments==0.3.0
jupyterlab_server==2.27.3
jupyterlab_widgets==3.0.11
langchain==0.2.14
langchain-community==0.2.12
langchain-core==0.2.33
langchain-text-splitters==0.2.2
langsmith==0.1.99
llama-cloud==0.0.13
llama-index==0.10.67.post1
llama-index-agent-openai==0.2.9
llama-index-cli==0.1.13
llama-index-core==0.10.67
llama-index-embeddings-openai==0.1.11
llama-index-indices-managed-llama-cloud==0.2.7
llama-index-legacy==0.9.48.post3
llama-index-llms-openai==0.1.29
llama-index-multi-modal-llms-openai==0.1.9
llama-index-program-openai==0.1.7
llama-index-question-gen-openai==0.1.3
llama-index-readers-file==0.1.33
llama-index-readers-llama-parse==0.1.6
llama-parse==0.4.9
MarkupSafe==2.1.5
marshmallow==3.21.3
matplotlib-inline==0.1.7
mistune==3.0.2
mkl-fft @ file:///croot/mkl_fft_1695058164594/work
mkl-random @ file:///croot/mkl_random_1695059800811/work
mkl-service==2.4.0
mpmath==1.3.0
multidict==6.0.5
multiprocess==0.70.16
mypy-extensions==1.0.0
nbclient==0.10.0
nbconvert==7.16.4
nbformat==5.10.4
nest-asyncio==1.6.0
networkx==3.3
ninja==1.11.1.1
nltk==3.9.1
notebook==7.2.1
notebook_shim==0.2.4
numpy @ file:///croot/numpy_and_numpy_base_1708638617955/work/dist/numpy-1.26.4-cp311-cp311-linux_x86_64.whl#sha256=5f96f274d410a1682519282ae769c877d32fdbf171aa8badec7bf5e1d3a1748a
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==9.1.0.70
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.20.5
nvidia-nvjitlink-cu12==12.6.20
nvidia-nvtx-cu12==12.1.105
onnx==1.16.2
openai==1.41.1
orjson==3.10.7
overrides==7.7.0
packaging @ file:///croot/packaging_1720101850331/work
pandas==2.2.2
pandocfilters==1.5.1
parso==0.8.4
pexpect==4.9.0
pillow==10.4.0
platformdirs==4.2.2
prometheus_client==0.20.0
prompt_toolkit==3.0.47
protobuf==5.27.3
psutil==6.0.0
ptyprocess==0.7.0
pure_eval==0.2.3
pyarrow==17.0.0
pycparser==2.22
pydantic==2.8.2
pydantic_core==2.20.1
Pygments==2.18.0
pynvml==11.5.3
pypdf==4.3.1
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-json-logger==2.0.7
pytz==2024.1
PyYAML==6.0.2
pyzmq==26.1.1
qtconsole==5.5.2
QtPy==2.4.1
RAGatouille==0.0.8.post4
referencing==0.35.1
regex==2024.7.24
requests==2.32.3
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rpds-py==0.20.0
safetensors==0.4.4
scikit-learn==1.5.1
scipy==1.14.0
Send2Trash==1.8.3
sentence-transformers==2.7.0
six==1.16.0
smmap==5.0.1
sniffio==1.3.1
soupsieve==2.6
SQLAlchemy==2.0.32
srsly==2.4.8
stack-data==0.6.3
striprtf==0.0.26
sympy==1.13.2
tenacity==8.5.0
terminado==0.18.1
threadpoolctl==3.5.0
tiktoken==0.7.0
tinycss2==1.3.0
tokenizers==0.19.1
torch==2.4.0
tornado==6.4.1
tqdm==4.66.5
traitlets==5.14.3
transformers==4.44.0
triton==3.0.0
types-python-dateutil==2.9.0.20240316
typing-inspect==0.9.0
typing_extensions==4.12.2
tzdata==2024.1
ujson==5.10.0
uri-template==1.3.0
urllib3==2.2.2
voyager==2.0.9
wcwidth==0.2.13
webcolors==24.8.0
webencodings==0.5.1
websocket-client==1.8.0
Werkzeug==3.0.3
widgetsnbextension==4.0.11
wrapt==1.16.0
xxhash==3.5.0
yarl==1.9.4

I'm running NVIDIA driver version 535.183.06 and CUDA version 12.2 on Ubuntu 22.04.

Thank you for your help!

bclavie commented 3 months ago

Thanks for the detailed report.

I think this might be a case of Torch not properly releasing the lock, or thinking it has already built the extension but then only partially loading it.

Would you be able to retry after deleting the torch extensions cache? If you're on Linux, have never set a TORCH_HOME env variable, and have no other torch C++ extensions you care about, rm -rf ~/.cache/torch_extensions/* should do the trick.
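
If you'd rather not wipe other extensions you've built, here's a rough sketch (assuming the default Linux cache layout, overridable via TORCH_EXTENSIONS_DIR) that removes only leftover lock files:

```python
import pathlib

# Default JIT-extension build cache on Linux.
cache_root = pathlib.Path.home() / ".cache" / "torch_extensions"

# Each extension builds in its own subdirectory; the leftover "lock" file
# there is what FileBaton.wait() polls on. Deleting stale locks lets the
# next load() proceed (or rebuild) instead of waiting forever.
for lock in cache_root.rglob("lock"):
    print(f"removing stale lock: {lock}")
    lock.unlink()
```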

TimKoornstra commented 3 months ago

Thanks for the reply, that seems to have worked!