elastic / eland

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
https://eland.readthedocs.io
Apache License 2.0

Stop installing GPU dependencies #600

Closed pquentin closed 11 months ago

pquentin commented 12 months ago

Installing all Eland dependencies, whether locally or in the Docker image, pulls in 4.2GB of packages. The two largest are PyTorch (1.9GB) and the CUDA dependencies (https://pypi.org/project/nvidia-cudnn-cu11/, https://pypi.org/project/nvidia-cublas-cu11/, https://pypi.org/project/nvidia-cuda-nvrtc-cu11/ and https://pypi.org/project/nvidia-cuda-runtime-cu11/, which take 1.4GB).

Since we don't support GPUs, installing those is wasteful, and we should install the PyTorch variant that only supports CPUs:

pip install torch==1.13.1+cpu --extra-index-url https://download.pytorch.org/whl/cpu

It's not possible to specify the extra index URL when running python -m pip install eland[pytorch], but we could instead ask users to run the above command. In any case, it's an easy fix in the Docker image, which is probably the main way that PyTorch is used with Eland today (and #407 will make this even more true).
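To check whether an environment still carries the GPU wheels, something like the following could be run after installation. It is only a rough sketch using the standard library's importlib.metadata. The function names and the prefix list are made up for illustration and are not part of Eland. The sizes it reports are files on disk as recorded by each package, so they will not match download sizes exactly.

```python
from importlib import metadata
from pathlib import Path


def installed_size_bytes(dist):
    """Sum the on-disk size of every file recorded for one distribution."""
    total = 0
    for f in dist.files or []:  # dist.files is None when no RECORD is available
        path = Path(dist.locate_file(f))
        if path.is_file():
            total += path.stat().st_size
    return total


def gpu_related_sizes(prefixes=("nvidia-", "torch")):
    """Map installed distributions whose name starts with a prefix to their size.

    The default prefixes are an assumption: "nvidia-" covers the CUDA wheels
    listed in this issue, and "torch" also matches packages such as torchvision.
    """
    sizes = {}
    for dist in metadata.distributions():
        raw_name = dist.metadata.get("Name", "") or ""
        name = raw_name.lower().replace("_", "-")
        if name.startswith(tuple(prefixes)):
            sizes[name] = installed_size_bytes(dist)
    return sizes


if __name__ == "__main__":
    for name, size in sorted(gpu_related_sizes().items()):
        print(f"{name}: {size / 1e9:.2f} GB")
```

If the CPU-only variant was installed as intended, no nvidia-* entries should show up in the output.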