gyezheng opened this issue 4 months ago (Open)
Hello!
Yes, these are requirements of the `torch` Python package that are needed for you to use CUDA, i.e. a GPU. You can freely use them.
Note that if you don't have a GPU, then you may want to install `torch` without CUDA support and then install `sentence-transformers`. You can use this widget and select "CPU" if that's the case. It'll save you some disk space.
But if you have a GPU, be sure to install with CUDA support like you've been doing.
Thank you for your reply! We are in the CPU-only case. I understand that from a technical perspective we can freely use those NVIDIA packages, but what about the commercial perspective: can we ship them within our own commercial product? And is there any difference between the GPU and CPU cases from a commercial perspective? Thanks!
If you're using the CPU only, then you won't need those CUDA packages. You can install it with:
pip install torch --index-url https://download.pytorch.org/whl/cpu
pip install sentence-transformers
(assuming that you're on Linux).
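If you want to verify which build you ended up with, one quick sanity check is the version string itself. This is a heuristic sketch, assuming the PyTorch convention that wheels from the CPU index carry a `+cpu` local version suffix while CUDA wheels carry `+cuXXX` (or no suffix at all for the default PyPI build, which bundles CUDA on Linux):

```python
def looks_like_cpu_build(torch_version):
    """Heuristic: PyTorch wheels from the CPU index carry a '+cpu' local
    version label (e.g. '1.13.1+cpu'); CUDA wheels carry '+cu118' etc.,
    and the default PyPI Linux wheel (no suffix) also bundles CUDA."""
    return torch_version.endswith("+cpu")

# With torch installed, you could also inspect it directly:
# import torch
# print(torch.__version__)   # e.g. '1.13.1+cpu'
# print(torch.version.cuda)  # should be None on a CPU-only build
```

The string check is useful in CI or a Dockerfile where you want to fail fast if the wrong index URL sneaked in.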
And yes, `torch` and `sentence-transformers` have commercially permissive licenses, i.e. you can use them within (paid) commercial products.
So at the moment I have been running two pip commands: the first installs a load of dependencies from a requirements.txt, and the second installs torch with the CPU index URL as you mentioned above.
pip install --no-deps -r requirements.txt
pip install --no-deps -r torch_requirements.txt
Maybe installing sentence-transformers via the first requirements.txt and only then installing torch was pulling in the 2.3.0 (with NVIDIA) version of torch as well?
If I do `pip show torch` I see:
Name: torch
Version: 1.13.1+cpu
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: /usr/local/lib64/python3.9/site-packages
Requires: typing-extensions
Required-by: sentence-transformers, accelerate
So I'm not sure why/how we are getting the NVIDIA packages in our scans?
> If I do `pip show torch` I see: ...
That is rather odd. Perhaps you can run `pip show cuda...` on the CUDA packages to see what they are required by? Because `torch` with CPU should not require CUDA.
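The "Required-by" field that `pip show` prints can also be computed in-process with the standard library's `importlib.metadata`, which is handy inside a container without shelling out. A rough sketch (the requirement-string parsing here is simplified and would miss exotic specifiers):

```python
from importlib import metadata

def _canon(name):
    # PEP 503-style normalization so 'nvidia-cublas-cu12' matches 'nvidia_cublas_cu12'
    return name.lower().replace("-", "_").replace(".", "_")

def reverse_requirements(prefix):
    """For every installed distribution whose name starts with `prefix`,
    list the installed distributions that declare a dependency on it
    (roughly what `pip show <pkg>` reports as 'Required-by')."""
    dists = list(metadata.distributions())
    targets = {_canon(d.metadata["Name"]): d.metadata["Name"]
               for d in dists
               if _canon(d.metadata["Name"]).startswith(_canon(prefix))}
    result = {name: [] for name in targets.values()}
    for d in dists:
        for req in d.requires or []:
            # Strip markers, extras, and version specifiers (simplified parsing)
            req_name = _canon(req.split(";")[0].split(" ")[0]
                              .split("==")[0].split(">=")[0]
                              .split("<")[0].split("[")[0].strip())
            if req_name in targets:
                result[targets[req_name]].append(d.metadata["Name"])
    return result

# Example: see which installed packages (if any) depend on nvidia-* distributions
# print(reverse_requirements("nvidia"))
```

On a genuinely CPU-only install, `reverse_requirements("nvidia")` should come back empty, because nothing nvidia-prefixed is installed at all.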
If I run `pip show nvidia_cublas...` or `pip show cuda` I get "no packages found".
... I'm not convinced we are downloading the files our scanner thinks we're getting, as I cannot locate them on disk at all, and in my site-packages folder I don't see anything about nvidia or any .whl files matching what our scanner is finding.
I also think that if I were pulling the CUDA files, the Docker image would be a lot larger (it's only ~2.5 GB total; with the CUDA files I think it would be 8 GB+).
`pip list` gives me:
certifi 2024.2.2
charset-normalizer 3.3.2
click 8.1.7
contourpy 1.2.1
cycler 0.12.1
eland 8.12.1
elastic-transport 8.13.0
elasticsearch 8.13.0
filelock 3.14.0
fonttools 4.51.0
fsspec 2024.3.1
huggingface-hub 0.23.0
idna 3.7
importlib_resources 6.4.0
joblib 1.4.2
kiwisolver 1.4.5
matplotlib 3.8.4
nltk 3.8.1
numpy 1.26.4
packaging 24.0
pandas 1.5.3
pillow 10.3.0
pip 21.2.3
psutil 5.9.8
pyparsing 3.1.2
python-dateutil 2.9.0.post0
pytz 2024.1
PyYAML 6.0.1
regex 2024.4.28
requests 2.31.0
safetensors 0.4.3
scikit-learn 1.4.2
scipy 1.13.0
sentence-transformers 2.2.2
setuptools 53.0.0
six 1.16.0
tdqm 0.0.1
threadpoolctl 3.5.0
tokenizers 0.14.1
torch 1.13.1+cpu
torchvision 0.14.1+cpu
tqdm 4.66.3
transformers 4.38.0
typing_extensions 4.9.0
urllib3 2.2.1
zipp 3.18.1
For extra info I also installed `pipdeptree` and this was the output...
accelerate==0.29.3
├── huggingface-hub [required: Any, installed: 0.23.0]
│ ├── filelock [required: Any, installed: 3.14.0]
│ ├── fsspec [required: >=2023.5.0, installed: 2024.3.1]
│ ├── packaging [required: >=20.9, installed: 24.0]
│ ├── PyYAML [required: >=5.1, installed: 6.0.1]
│ ├── requests [required: Any, installed: 2.31.0]
│ │ ├── certifi [required: >=2017.4.17, installed: 2024.2.2]
│ │ ├── charset-normalizer [required: >=2,<4, installed: 3.3.2]
│ │ ├── idna [required: >=2.5,<4, installed: 3.7]
│ │ └── urllib3 [required: >=1.21.1,<3, installed: 2.2.1]
│ ├── tqdm [required: >=4.42.1, installed: 4.66.3]
│ └── typing_extensions [required: >=3.7.4.3, installed: 4.9.0]
├── numpy [required: >=1.17, installed: 1.26.4]
├── packaging [required: >=20.0, installed: 24.0]
├── psutil [required: Any, installed: 5.9.8]
├── PyYAML [required: Any, installed: 6.0.1]
├── safetensors [required: >=0.3.1, installed: 0.4.3]
└── torch [required: >=1.10.0, installed: 1.13.1+cpu]
└── typing_extensions [required: Any, installed: 4.9.0]
eland==8.12.1
├── elasticsearch [required: >=8.3,<9, installed: 8.13.0]
│ └── elastic-transport [required: >=8.13,<9, installed: 8.13.0]
│ ├── certifi [required: Any, installed: 2024.2.2]
│ └── urllib3 [required: >=1.26.2,<3, installed: 2.2.1]
├── matplotlib [required: >=3.6, installed: 3.8.4]
│ ├── contourpy [required: >=1.0.1, installed: 1.2.1]
│ │ └── numpy [required: >=1.20, installed: 1.26.4]
│ ├── cycler [required: >=0.10, installed: 0.12.1]
│ ├── fonttools [required: >=4.22.0, installed: 4.51.0]
│ ├── importlib_resources [required: >=3.2.0, installed: 6.4.0]
│ │ └── zipp [required: >=3.1.0, installed: 3.18.1]
│ ├── kiwisolver [required: >=1.3.1, installed: 1.4.5]
│ ├── numpy [required: >=1.21, installed: 1.26.4]
│ ├── packaging [required: >=20.0, installed: 24.0]
│ ├── pillow [required: >=8, installed: 10.3.0]
│ ├── pyparsing [required: >=2.3.1, installed: 3.1.2]
│ └── python-dateutil [required: >=2.7, installed: 2.9.0.post0]
│ └── six [required: >=1.5, installed: 1.16.0]
├── numpy [required: >=1.2.0,<2, installed: 1.26.4]
├── packaging [required: Any, installed: 24.0]
└── pandas [required: >=1.5,<2, installed: 1.5.3]
├── numpy [required: >=1.20.3, installed: 1.26.4]
├── python-dateutil [required: >=2.8.1, installed: 2.9.0.post0]
│ └── six [required: >=1.5, installed: 1.16.0]
└── pytz [required: >=2020.1, installed: 2024.1]
pipdeptree==2.20.0
├── packaging [required: >=23.1, installed: 24.0]
└── pip [required: >=23.1.2, installed: 24.0]
sentence-transformers==2.2.2
├── huggingface-hub [required: >=0.4.0, installed: 0.23.0]
│ ├── filelock [required: Any, installed: 3.14.0]
│ ├── fsspec [required: >=2023.5.0, installed: 2024.3.1]
│ ├── packaging [required: >=20.9, installed: 24.0]
│ ├── PyYAML [required: >=5.1, installed: 6.0.1]
│ ├── requests [required: Any, installed: 2.31.0]
│ │ ├── certifi [required: >=2017.4.17, installed: 2024.2.2]
│ │ ├── charset-normalizer [required: >=2,<4, installed: 3.3.2]
│ │ ├── idna [required: >=2.5,<4, installed: 3.7]
│ │ └── urllib3 [required: >=1.21.1,<3, installed: 2.2.1]
│ ├── tqdm [required: >=4.42.1, installed: 4.66.3]
│ └── typing_extensions [required: >=3.7.4.3, installed: 4.9.0]
├── nltk [required: Any, installed: 3.8.1]
│ ├── click [required: Any, installed: 8.1.7]
│ ├── joblib [required: Any, installed: 1.4.2]
│ ├── regex [required: >=2021.8.3, installed: 2024.4.28]
│ └── tqdm [required: Any, installed: 4.66.3]
├── numpy [required: Any, installed: 1.26.4]
├── scikit-learn [required: Any, installed: 1.4.2]
│ ├── joblib [required: >=1.2.0, installed: 1.4.2]
│ ├── numpy [required: >=1.19.5, installed: 1.26.4]
│ ├── scipy [required: >=1.6.0, installed: 1.13.0]
│ │ └── numpy [required: >=1.22.4,<2.3, installed: 1.26.4]
│ └── threadpoolctl [required: >=2.0.0, installed: 3.5.0]
├── scipy [required: Any, installed: 1.13.0]
│ └── numpy [required: >=1.22.4,<2.3, installed: 1.26.4]
├── sentencepiece [required: Any, installed: ?]
├── torch [required: >=1.6.0, installed: 1.13.1+cpu]
│ └── typing_extensions [required: Any, installed: 4.9.0]
├── torchvision [required: Any, installed: 0.14.1+cpu]
│ ├── numpy [required: Any, installed: 1.26.4]
│ ├── pillow [required: >=5.3.0,!=8.3.*, installed: 10.3.0]
│ ├── requests [required: Any, installed: 2.31.0]
│ │ ├── certifi [required: >=2017.4.17, installed: 2024.2.2]
│ │ ├── charset-normalizer [required: >=2,<4, installed: 3.3.2]
│ │ ├── idna [required: >=2.5,<4, installed: 3.7]
│ │ └── urllib3 [required: >=1.21.1,<3, installed: 2.2.1]
│ ├── torch [required: ==1.13.1, installed: 1.13.1+cpu]
│ │ └── typing_extensions [required: Any, installed: 4.9.0]
│ └── typing_extensions [required: Any, installed: 4.9.0]
├── tqdm [required: Any, installed: 4.66.3]
└── transformers [required: >=4.6.0,<5.0.0, installed: 4.38.0]
├── filelock [required: Any, installed: 3.14.0]
├── huggingface-hub [required: >=0.19.3,<1.0, installed: 0.23.0]
│ ├── filelock [required: Any, installed: 3.14.0]
│ ├── fsspec [required: >=2023.5.0, installed: 2024.3.1]
│ ├── packaging [required: >=20.9, installed: 24.0]
│ ├── PyYAML [required: >=5.1, installed: 6.0.1]
│ ├── requests [required: Any, installed: 2.31.0]
│ │ ├── certifi [required: >=2017.4.17, installed: 2024.2.2]
│ │ ├── charset-normalizer [required: >=2,<4, installed: 3.3.2]
│ │ ├── idna [required: >=2.5,<4, installed: 3.7]
│ │ └── urllib3 [required: >=1.21.1,<3, installed: 2.2.1]
│ ├── tqdm [required: >=4.42.1, installed: 4.66.3]
│ └── typing_extensions [required: >=3.7.4.3, installed: 4.9.0]
├── numpy [required: >=1.17, installed: 1.26.4]
├── packaging [required: >=20.0, installed: 24.0]
├── PyYAML [required: >=5.1, installed: 6.0.1]
├── regex [required: !=2019.12.17, installed: 2024.4.28]
├── requests [required: Any, installed: 2.31.0]
│ ├── certifi [required: >=2017.4.17, installed: 2024.2.2]
│ ├── charset-normalizer [required: >=2,<4, installed: 3.3.2]
│ ├── idna [required: >=2.5,<4, installed: 3.7]
│ └── urllib3 [required: >=1.21.1,<3, installed: 2.2.1]
├── safetensors [required: >=0.4.1, installed: 0.4.3]
├── tokenizers [required: >=0.14,<0.19, installed: 0.14.1]
│ └── huggingface-hub [required: >=0.16.4,<0.18, installed: 0.23.0]
│ ├── filelock [required: Any, installed: 3.14.0]
│ ├── fsspec [required: >=2023.5.0, installed: 2024.3.1]
│ ├── packaging [required: >=20.9, installed: 24.0]
│ ├── PyYAML [required: >=5.1, installed: 6.0.1]
│ ├── requests [required: Any, installed: 2.31.0]
│ │ ├── certifi [required: >=2017.4.17, installed: 2024.2.2]
│ │ ├── charset-normalizer [required: >=2,<4, installed: 3.3.2]
│ │ ├── idna [required: >=2.5,<4, installed: 3.7]
│ │ └── urllib3 [required: >=1.21.1,<3, installed: 2.2.1]
│ ├── tqdm [required: >=4.42.1, installed: 4.66.3]
│ └── typing_extensions [required: >=3.7.4.3, installed: 4.9.0]
└── tqdm [required: >=4.27, installed: 4.66.3]
setuptools==53.0.0
tdqm==0.0.1
└── tqdm [required: Any, installed: 4.66.3]
I think that looks fine, then! In fact, if you upgrade from sentence-transformers==2.2.2 to a more recent version, you'll actually lose the NLTK and sentencepiece dependencies. They're not particularly big, though, so I wouldn't worry about them too much.
Do you happen to know if there's a check I can make to know for certain whether those nvidia*.whl files got installed? I had a look in /usr/bin and /usr/lib/python3.9/site-packages and didn't find anything; also running `find / -iname "*.whl"` and `find / -iname "*nvidia*"` returns nothing.
Searching for `cud` might also help, but other than that I'm not sure.
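That filesystem search can be made systematic with a short script. This is a sketch, not an exhaustive detector: it just walks a directory and flags filenames that look like CUDA artifacts (the site-packages path in the example is illustrative; adjust it to your environment):

```python
from pathlib import Path

def find_cuda_artifacts(root):
    """Walk `root` and collect files/directories whose names suggest NVIDIA
    CUDA artifacts: nvidia-prefixed distributions, leftover .whl files,
    or cud* libraries (cuda, cudnn, cudart, ...)."""
    hits = []
    for path in Path(root).rglob("*"):
        name = path.name.lower()
        if name.endswith(".whl") or "nvidia" in name or name.startswith("cud"):
            hits.append(path)
    return hits

# Example (path is illustrative):
# for hit in find_cuda_artifacts("/usr/local/lib64/python3.9/site-packages"):
#     print(hit)
```

An empty result over your site-packages directories would back up what `find` is already telling you.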
I had a look and it found a load of related files from torch, torchgen, and transformers... Most of the files are like /usr/local/lib64/python3.9/site-packages/torch/include/ATen/cuda/CUDATensorMethods.cuh and associated header files, or like /usr/local/lib/python3.9/site-packages/transformers/kernels/mra/cuda_kernel.cu.
I think these are just source code files from those packages, though, not the NVIDIA Proprietary Software.
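That distinction can be captured in a small heuristic. The assumptions here are that CUDA *source* files (`.cu`/`.cuh`) ship inside torch/transformers even in CPU-only builds, whereas NVIDIA's proprietary runtime libraries are installed as shared objects under a top-level `nvidia/` package directory; the layout is an assumption about typical Linux wheels, not a guarantee:

```python
from pathlib import PurePosixPath

# CUDA source/header extensions bundled with torch/transformers source trees
CUDA_SOURCE_EXTS = {".cu", ".cuh"}

def classify_cuda_path(path_str):
    """Heuristic sketch: distinguish harmless bundled CUDA source files
    from NVIDIA's proprietary runtime shared libraries."""
    p = PurePosixPath(path_str)
    if p.suffix in CUDA_SOURCE_EXTS:
        return "bundled-cuda-source"      # headers/kernels in the package source
    if "nvidia" in p.parts and ".so" in p.name:
        return "nvidia-runtime-library"   # proprietary runtime, e.g. libcublas.so.12
    return "other"
```

Running this over the hits from a filesystem scan would separate the `.cuh`/`.cu` files the scanner is tripping over from any actual NVIDIA runtime libraries.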
I'm also facing the same issue, where nvidia* packages are not getting downloaded, not being used also in our product (our application runs on Windows, where the inventory report shows the wheel packages). Please let us know if there is any update.
Are you using an OSS scanning tool such as Mend? Our issue was that Mend, under the covers, does a pip download and ignores the fact that we were using --no-deps when installing, so the full pip download pulled in dependencies we were not actually getting.
Yes, we are using Mend, integrated with the GitHub repo, and the Mend inventory shows these nvidia packages, and our open source approval team says not to use nvidia even though we are using it in our product. Kindly let me know how to proceed further.
I'm a bit confused:

> where nvidia* packages are not getting downloaded, not being used also in our product.

> Mend inventory shows these nvidia* packages. [...] even though we are using it in our product.

So the packages are not being downloaded, but would you like to download them or not?
In short, to use Sentence Transformers you will have to use `torch`. You can install `torch` with or without GPU/CUDA support. To get GPU support, you will have to install `torch` with CUDA, which means that you'll require NVIDIA's CUDA-specific packages, e.g.:
pip install torch --index-url https://download.pytorch.org/whl/cu121
If you only want to run Sentence Transformers on CPU, then you don't need to install torch with CUDA, e.g.:
pip install torch --index-url https://download.pytorch.org/whl/cpu
The latter should not install NVIDIA's CUDA packages, I believe.
We do not use nvidia packages; our application doesn't need them. The problem is with the Mend inventory, as it shows nvidia packages, and the open source team says don't use nvidia*.
If your use case is like ours: we were installing the CPU version, which doesn't pull in the GPU-related packages, but Mend looks at the packages installed, seems to just ignore the options (CPU-specific index, no dependencies, etc.), downloads everything, and then concludes "ah, Sentence Transformers requires NVIDIA packages", which would be right if we weren't using the CPU-specific variant.
It's an issue with Mend.io rather than with this library, though. It's how they do their checking that causes the NVIDIA packages to be detected when they aren't actually present. We are using these libraries in a Docker image, and you can tell we don't get the NVIDIA packages: we looked through the system and can't find them, and the image is small; if we were pulling them, the image would be hundreds of MBs larger than it is.
I am using sentence-transformers-2.2.2.tar.gz and it pulls the following nvidia packages:

nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl
nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl
nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl
nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl
nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl
nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl
nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl
nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl
nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl
nvidia_nvjitlink_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl
nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl

When I search for them online, they show up under the license "NVIDIA Proprietary Software". Can I freely use sentence-transformers-2.2.2.tar.gz?
Thanks!
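Incidentally, the distribution name and version can be read straight off those wheel filenames, which is useful for feeding scanner output into a license check. A rough sketch of the PEP 427 naming convention (this ignores the optional build tag, so it is not a full parser):

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its PEP 427 components:
    {dist}-{version}-{python tag}-{abi tag}-{platform tag}.whl
    Simplified sketch: optional build tags are not handled."""
    stem = filename[:-len(".whl")]
    dist, version, python_tag, abi_tag, platform_tag = stem.split("-", 4)
    return {
        "distribution": dist,
        "version": version,
        "tags": f"{python_tag}-{abi_tag}-{platform_tag}",
    }

# Example with one of the wheels from the scanner report:
# parse_wheel_filename("nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl")
# → {'distribution': 'nvidia_cublas_cu12', 'version': '12.1.3.1',
#    'tags': 'py3-none-manylinux1_x86_64'}
```

With the distribution names extracted, each one can be looked up on PyPI to confirm its license individually.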