stanford-futuredata / ColBERT

ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
MIT License

crypt.h: No such file or directory #309

Open Jimmy9507 opened 4 months ago

Jimmy9507 commented 4 months ago

When I run the code below on a SageMaker instance:

from colbert.data import Queries
from colbert.infra import Run, RunConfig, ColBERTConfig
from colbert import Searcher

if __name__ == '__main__':
    with Run().context(RunConfig(nranks=1, experiment="msmarco")):

        config = ColBERTConfig(
            root="/root/experiments",
        )
        searcher = Searcher(index="msmarco.nbits=2", config=config)
        queries = Queries("/root/queries.dev.small.tsv")
        ranking = searcher.search_all(queries, k=100)
        ranking.save("msmarco.nbits=2.ranking.tsv")

I got the error below:

RuntimeError: Error building extension 'segmented_maxsim_cpp':
[1/2] c++ -MMD -MF segmented_maxsim.o.d -DTORCH_EXTENSION_NAME=segmented_maxsim_cpp -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include -isystem /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include/THC -isystem /opt/conda/envs/colbert/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -c /opt/conda/envs/colbert/lib/python3.8/site-packages/colbert/modeling/segmented_maxsim.cpp -o segmented_maxsim.o
FAILED: segmented_maxsim.o
c++ -MMD -MF segmented_maxsim.o.d -DTORCH_EXTENSION_NAME=segmented_maxsim_cpp -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include -isystem /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include/THC -isystem /opt/conda/envs/colbert/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -c /opt/conda/envs/colbert/lib/python3.8/site-packages/colbert/modeling/segmented_maxsim.cpp -o segmented_maxsim.o
In file included from /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include/torch/csrc/python_headers.h:12,
                 from /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include/torch/csrc/Device.h:4,
                 from /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/python.h:8,
                 from /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include/torch/extension.h:6,
                 from /opt/conda/envs/colbert/lib/python3.8/site-packages/colbert/modeling/segmented_maxsim.cpp:2:
/opt/conda/envs/colbert/include/python3.8/Python.h:44:10: fatal error: crypt.h: No such file or directory
   44 | #include <crypt.h>
      |          ^~~~~
compilation terminated.
ninja: build stopped: subcommand failed.

I checked that crypt.h is in /usr/include. Why do I still get this error?

filpia commented 4 months ago

I just started seeing this issue today too in a sagemaker-deployed job and haven't figured out a fix for it. Are you using a Red Hat Linux base Docker image?

filpia commented 4 months ago

Digging into this a bit more, I have a hunch that the issue may be in conda-forge as mentioned here.

I copied /usr/include/crypt.h to /opt/conda/include/crypt.h and that fixed my issue. I'm looking more closely into whether the conda build can be modified to ensure that crypt.h is installed in the include/ directory of the conda environment being used.

okhat commented 4 months ago

Ah, thanks @filpia for the catch. I don't have strong insight into this right away; it's entirely outside ColBERT per se. It's just the compiler, as you noted.

filpia commented 4 months ago

@okhat I had success by running cp /usr/include/crypt.h /opt/conda/include/python3.8/ to copy the crypt.h file into the location where it's expected at runtime. In your case I think you'd modify to cp /usr/include/crypt.h /opt/conda/envs/colbert/include/python3.8/.
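For anyone who would rather script this than hard-code the path, here is a minimal sketch of the same copy workaround in Python. It assumes the system header lives at /usr/include/crypt.h and that `sysconfig.get_paths()["include"]` points at the directory holding Python.h for the active environment; `ensure_header` is a hypothetical helper name, not part of ColBERT or PyTorch.

```python
import shutil
import sysconfig
from pathlib import Path


def ensure_header(src, dest_dir):
    """Copy src into dest_dir unless a file with that name is already there."""
    src, dest_dir = Path(src), Path(dest_dir)
    target = dest_dir / src.name
    if target.exists():
        return target  # leave an existing header untouched
    if not src.is_file():
        raise FileNotFoundError(f"{src} does not exist")
    dest_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, target)
    return target


def python_include_dir():
    """Directory containing Python.h for the running interpreter."""
    return sysconfig.get_paths()["include"]
```

Run inside the conda environment, `ensure_header("/usr/include/crypt.h", python_include_dir())` should place the header next to Python.h, provided you have write access to that directory.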

Another choice depending on how much you're able to modify your environment is...

conda install --channel=conda-forge libxcrypt

export CPATH=/opt/conda/include/

or possibly even export CPATH=/opt/conda/envs/colbert/include.
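If you can't export the variable in the shell (for example in a managed job), a rough equivalent is to set CPATH from Python before the extension gets compiled. This is only a sketch: `prepend_cpath` is a hypothetical helper, and it relies on the assumption that the compiler subprocess inherits the Python process's environment.

```python
import os


def prepend_cpath(include_dir, env=None):
    """Put include_dir at the front of CPATH in env (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Split the current value and drop empty entries.
    existing = [p for p in env.get("CPATH", "").split(os.pathsep) if p]
    if include_dir not in existing:
        existing.insert(0, include_dir)
    env["CPATH"] = os.pathsep.join(existing)
    return env["CPATH"]
```

Calling `prepend_cpath("/opt/conda/envs/colbert/include")` before constructing the `Searcher` should have the same effect as the export, assuming libxcrypt has put crypt.h in that directory.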

Either way, good luck!

Jimmy9507 commented 4 months ago

Yeah, I also copied crypt.h into the Python environment and it works. But I'm wondering if there is an easy way to include it in the conda env creation step.

jinghan23 commented 4 months ago

Same here. Copying from /usr/include/ works, and I'm curious why.

filpia commented 4 months ago

@jinghan23 because in this case the PyTorch extension build is looking for crypt.h in a specific directory. Before the conda-forge change, the file was created at the expected location, but since the change it seems to be created elsewhere. Copying it is just a lazy fix to ensure that the file exists where PyTorch expects it. FWIW, I'm seeing similar behavior with Theano.
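To see where the header actually exists on a given machine, something like the following can help. It is a diagnostic sketch only: the candidate list is illustrative rather than the compiler's real search path, and both function names are made up for this example.

```python
import sysconfig
from pathlib import Path


def dirs_containing(header, candidates):
    """Return the candidate directories that contain a file named header."""
    return [str(d) for d in candidates if (Path(d) / header).is_file()]


def default_candidates():
    """Directories worth checking; illustrative, not exhaustive."""
    py_include = Path(sysconfig.get_paths()["include"])  # where Python.h lives
    return [py_include, py_include.parent, Path("/usr/include")]
```

`print(dirs_containing("crypt.h", default_candidates()))` lists the matches. If only /usr/include shows up, the header is missing from the environment's own include directories, which would be consistent with the error in this thread.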