Open Jimmy9507 opened 9 months ago
I just started seeing this issue today too in a SageMaker-deployed job and haven't figured out a fix for it. Are you using a Red Hat Linux base Docker image?
Digging into this a bit more, I have a hunch that the issue may be in conda-forge, as mentioned here.
I copied /usr/include/crypt.h to /opt/conda/include/crypt.h and that fixed my issue. I'm looking more closely into whether there's a modification to the conda build that would ensure crypt.h is installed in the include/ directory of the conda environment being used.
Ah, thanks @filpia for the catch. I don't have any strong insight on this right away; it's entirely outside ColBERT per se, it's just the compiler, as you noted.
@okhat I had success by running
cp /usr/include/crypt.h /opt/conda/include/python3.8/
to copy the crypt.h file into the location where it's expected at runtime. In your case I think you'd modify it to
cp /usr/include/crypt.h /opt/conda/envs/colbert/include/python3.8/
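The copy step can also be scripted so it is safe to run more than once. The following is a minimal sketch, not something from this thread: `ensure_header` and `python_include_dir` are hypothetical helper names, and it assumes the system header really lives at a path like /usr/include/crypt.h and that you have write access to the environment's include directory.

```python
import shutil
import sysconfig
from pathlib import Path


def python_include_dir() -> Path:
    """Directory that holds Python.h for the running interpreter.

    In this thread that would be something like
    /opt/conda/envs/colbert/include/python3.8.
    """
    return Path(sysconfig.get_paths()["include"])


def ensure_header(header: Path, include_dir: Path) -> bool:
    """Copy `header` into `include_dir` unless a file of that name exists there.

    Returns True if a copy was made, False if the header was already present.
    """
    target = include_dir / header.name
    if target.exists():
        return False
    if not header.is_file():
        raise FileNotFoundError(f"source header not found: {header}")
    include_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(header, target)
    return True
```

Run from the environment's own interpreter, `ensure_header(Path("/usr/include/crypt.h"), python_include_dir())` would do the same thing as the cp command above, but it skips the copy when the file is already in place.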
Another choice, depending on how much you're able to modify your environment, is:
conda install --channel=conda-forge libxcrypt
export CPATH=/opt/conda/include/
or possibly even
export CPATH=/opt/conda/envs/colbert/include
Either way, good luck!
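If you can't export CPATH in the shell (for example, because the job is started by a launcher you don't control), the same variable can be set from Python before the extension is built. This is only a sketch under assumptions: `with_cpath` is a hypothetical helper, and it relies on the compiler being a GCC-compatible one that honors CPATH and on the build subprocess inheriting the parent environment.

```python
import os
from typing import Dict, Mapping


def with_cpath(include_dir: str, env: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of `env` whose CPATH lists `include_dir` first.

    Existing CPATH entries are kept in order; empty entries are dropped
    and `include_dir` is not added twice.
    """
    new_env = dict(env)
    entries = [p for p in new_env.get("CPATH", "").split(os.pathsep) if p]
    if include_dir not in entries:
        entries.insert(0, include_dir)
    new_env["CPATH"] = os.pathsep.join(entries)
    return new_env
```

A possible use is `os.environ.update(with_cpath("/opt/conda/envs/colbert/include", os.environ))` at the top of the script, before anything triggers the extension build.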
Yeah, I also copied crypt.h to the Python environment and it works. But I'm wondering if there is an easy way to include it in the conda env creation step.
Same here. Copying from /usr/include/ works, and I'm curious why.
@jinghan23 because in this case pytorch is looking for crypt.h in a specific directory. Before the conda-forge change, the file was created at the anticipated location, but after the change it seems to be created elsewhere. Copying it is just a lazy fix to ensure that the file exists where pytorch expects it. FWIW, I'm seeing similar behavior with theano.
I love you! Thanks to your help, I was able to start training the model successfully. This is my first attempt at fine-tuning a model, and with your help it seems like a great start. From a newbie.
When I run the code below on SageMaker instances:
from colbert.data import Queries
from colbert.infra import Run, RunConfig, ColBERTConfig
from colbert import Searcher

if __name__ == '__main__':
    with Run().context(RunConfig(nranks=1, experiment="msmarco")):
        ...  # the rest of the script was not included in the post
I got the error below:
RuntimeError: Error building extension 'segmented_maxsim_cpp':
[1/2] c++ -MMD -MF segmented_maxsim.o.d -DTORCH_EXTENSION_NAME=segmented_maxsim_cpp -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include -isystem /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include/THC -isystem /opt/conda/envs/colbert/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -c /opt/conda/envs/colbert/lib/python3.8/site-packages/colbert/modeling/segmented_maxsim.cpp -o segmented_maxsim.o
FAILED: segmented_maxsim.o
c++ -MMD -MF segmented_maxsim.o.d -DTORCH_EXTENSION_NAME=segmented_maxsim_cpp -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include -isystem /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include/THC -isystem /opt/conda/envs/colbert/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -c /opt/conda/envs/colbert/lib/python3.8/site-packages/colbert/modeling/segmented_maxsim.cpp -o segmented_maxsim.o
In file included from /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include/torch/csrc/python_headers.h:12,
                 from /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include/torch/csrc/Device.h:4,
                 from /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/python.h:8,
                 from /opt/conda/envs/colbert/lib/python3.8/site-packages/torch/include/torch/extension.h:6,
                 from /opt/conda/envs/colbert/lib/python3.8/site-packages/colbert/modeling/segmented_maxsim.cpp:2:
/opt/conda/envs/colbert/include/python3.8/Python.h:44:10: fatal error: crypt.h: No such file or directory
   44 | #include <crypt.h>
      |          ^~~~~~~~~
compilation terminated.
ninja: build stopped: subcommand failed.

I checked that crypt.h is in /usr/include. Why do I still get this error?
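One way to narrow this down is to list the include directories the failing command actually passes and check which of them contain the header. The sketch below is illustrative only: `include_dirs` and `dirs_with_header` are hypothetical helpers, and they look only at explicit -I and -isystem flags, not at the compiler's built-in default search directories.

```python
import shlex
from pathlib import Path
from typing import List


def include_dirs(command: str) -> List[str]:
    """Collect directories passed via -I or -isystem in a compiler command."""
    tokens = shlex.split(command)
    dirs: List[str] = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("-I", "-isystem") and i + 1 < len(tokens):
            # Flag and directory are separate tokens: "-isystem /some/dir".
            dirs.append(tokens[i + 1])
            i += 2
            continue
        if tok.startswith("-I") and len(tok) > 2:
            # Flag and directory are joined: "-I/some/dir".
            dirs.append(tok[2:])
        i += 1
    return dirs


def dirs_with_header(header: str, dirs: List[str]) -> List[str]:
    """Return the subset of `dirs` that contain a file named `header`."""
    return [d for d in dirs if (Path(d) / header).is_file()]
```

Pasting the c++ line from the log into `include_dirs` and then calling `dirs_with_header("crypt.h", ...)` on the result shows whether any of the explicitly passed directories has the file. In the log above every -isystem path is under /opt/conda/envs/colbert, and none of them is /usr/include.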