Closed beebopkim closed 4 months ago
Interesting - torch.compile does not seem to work then. You might need gcc installed as a C++ compiler, ideally via e.g. build-essential.
Otherwise: could you provide some longer logs? They seem to be cut off at the end.
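The compiler check suggested above can be sketched in a few lines of Python. This is only a rough diagnostic, assuming the compilers torch.compile needs are looked up on PATH; the helper names `find_compilers` and `first_version_line` are made up for this example.

```python
import shutil
import subprocess


def find_compilers(names=("gcc", "g++")):
    """Map each compiler name to its location on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


def first_version_line(path):
    """Return the first line of `<compiler> --version`, or None on failure."""
    try:
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    for name, path in find_compilers().items():
        if path is None:
            print(f"{name}: not found on PATH")
        else:
            print(f"{name}: {path} ({first_version_line(path)})")
```

A missing entry here would point at the build-essential suggestion; if both compilers are found, the cause is more likely elsewhere.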
Of course, gcc and g++ are installed. Here is some version info from my Linux machine:
% uname -a
Linux eleonora 6.5.0-21-generic #21~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Feb 9 13:32:52 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
% gcc --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
% g++ --version
g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
% nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Wed_Nov_22_10:17:15_PST_2023
Cuda compilation tools, release 12.3, V12.3.107
Build cuda_12.3.r12.3/compiler.33567101_0
%
As for the logs, I tried running infinity_emb --device cuda --log-level debug
to get more output, but exactly the same log was produced. The % is the shell prompt, since I am using zsh.
Oh, I missed the segfault at the end of the script. What GPU is this on? Is the same happening via dockerfile (cuda12.1)? Have you used other models with torch.compile?
My GPU is a PNY 4060 Ti 16GB. With the docker image, it ran without any problem. Then I found that the docker image uses Python 3.10, so I tried a Python 3.10 venv, and it ran well too. Strange, but the problem is solved. As for pip install infinity-emb[all]
it did not work on either Python 3.10 or 3.11 in my case. That's why I installed infinity with poetry from the source code on Python 3.11.
Okay, if the docker image runs, I can provide no further assistance - it's too hard to debug. I would guess some C++ extension might be incompatible.
Please install all the (pip and system/apt) dependencies from the image on your system, or disable torch.compile
I see. Here is the output of pip freeze
from the docker image michaelf34/infinity,
tag 0.0.26 linux/amd64. Anyone who runs into the same problem can use this list without having to pull the docker image.
aiohttp==3.9.3
aiosignal==1.3.1
annotated-types==0.6.0
anyio==3.7.1
async-timeout==4.0.3
attrs==23.2.0
certifi==2024.2.2
charset-normalizer==3.3.2
click==8.1.7
codespell==2.2.6
colorama==0.4.6
coloredlogs==15.0.1
ctranslate2==4.0.0
datasets==2.14.4
dill==0.3.7
diskcache==5.6.3
evaluate==0.4.1
exceptiongroup==1.2.0
fastapi==0.103.2
fastembed==0.2.1
filelock==3.13.1
flatbuffers==23.5.26
frozenlist==1.4.1
fsspec==2024.2.0
h11==0.14.0
httptools==0.6.1
huggingface-hub==0.20.3
humanfriendly==10.0
idna==3.6
# Editable install with no version control (infinity_emb==0.0.26)
-e /app
Jinja2==3.1.3
joblib==1.3.2
loguru==0.7.2
markdown-it-py==3.0.0
MarkupSafe==2.1.5
mdurl==0.1.2
mpmath==1.3.0
multidict==6.0.5
multiprocess==0.70.15
networkx==3.2.1
numpy==1.26.4
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.19.3
nvidia-nvjitlink-cu12==12.3.101
nvidia-nvtx-cu12==12.1.105
onnx==1.15.0
onnxruntime==1.17.0
optimum==1.17.1
orjson==3.9.14
packaging==23.2
pandas==2.2.0
pillow==10.2.0
prometheus-fastapi-instrumentator==6.1.0
prometheus_client==0.20.0
protobuf==4.25.3
pyarrow==15.0.0
pydantic==2.6.1
pydantic_core==2.16.2
Pygments==2.17.2
python-dateutil==2.8.2
python-dotenv==1.0.1
pytz==2024.1
PyYAML==6.0.1
regex==2023.12.25
requests==2.31.0
responses==0.18.0
rich==13.7.0
safetensors==0.4.2
scikit-learn==1.4.1.post1
scipy==1.12.0
sentence-transformers==2.4.0
sentencepiece==0.1.99
shellingham==1.5.4
six==1.16.0
sniffio==1.3.0
starlette==0.27.0
sympy==1.12
threadpoolctl==3.3.0
tokenizers==0.15.2
torch==2.2.0
tqdm==4.66.2
transformers==4.37.2
triton==2.2.0
typer==0.9.0
typing_extensions==4.9.0
tzdata==2024.1
urllib3==2.2.0
uvicorn==0.23.2
uvloop==0.19.0
watchfiles==0.21.0
websockets==12.0
xxhash==3.4.1
yarl==1.9.4
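To see where a local environment differs from the list above, the pins can be compared against what is installed. A minimal sketch using only the standard library; the function names `parse_pins` and `diff_against_installed` are invented for this example, and the sample contains only a few of the pins.

```python
from importlib import metadata


def parse_pins(freeze_text):
    """Parse `name==version` lines; skip blanks, comments, and editable installs."""
    pins = {}
    for raw in freeze_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("-e"):
            continue
        name, sep, version = line.partition("==")
        if sep:
            pins[name.strip()] = version.strip()
    return pins


def diff_against_installed(pins):
    """Return {name: (pinned, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


if __name__ == "__main__":
    # A few pins copied from the list above; paste the full list to check everything.
    sample = """
    torch==2.2.0
    triton==2.2.0
    transformers==4.37.2
    # Editable install with no version control (infinity_emb==0.0.26)
    -e /app
    """
    for name, (pinned, installed) in diff_against_installed(parse_pins(sample)).items():
        print(f"{name}: image has {pinned}, local has {installed or 'not installed'}")
```

This only covers pip packages; system/apt dependencies and the Python version itself (3.10 in the image) would still need to be checked separately.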
What's the advantage of your pip freeze? Is it more helpful than the poetry lock file? https://github.com/michaelfeil/infinity/blob/main/libs/infinity_emb/poetry.lock
Because I'm very new to poetry.
Closing as stale.
commit hash: 296472eefaa93c361f086ea26bd7cd7e3c6e9a3e
I tried it on my Linux machine - Ubuntu 22.04 with CUDA 12.3 - and it failed.
I found issue #115 and
export INFINITY_DISABLE_COMPILE=TRUE
works. But it is very strange that the default setting fails.
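The same workaround can be applied when starting the server from a Python script instead of the shell. A small sketch, assuming the variable is read from the environment of the server process; `env_without_compile` is a hypothetical helper, not part of infinity.

```python
import os


def env_without_compile(base_env=None):
    """Return a copy of the environment with INFINITY_DISABLE_COMPILE=TRUE set."""
    env = dict(os.environ if base_env is None else base_env)
    env["INFINITY_DISABLE_COMPILE"] = "TRUE"
    return env


# Usage (commented out, since it needs infinity_emb installed and a GPU):
# import subprocess
# subprocess.run(
#     ["infinity_emb", "--device", "cuda", "--log-level", "debug"],
#     env=env_without_compile(),
# )
```

Passing a copied environment keeps the setting scoped to the server process, so other tools in the same shell are unaffected.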