atalman opened 1 year ago
I see a reduction between 1.13.1 and 2.0.0:
Collecting torch==1.13.1
Using cached torch-1.13.1-cp310-cp310-manylinux1_x86_64.whl (887.5 MB)
Collecting nvidia-cuda-nvrtc-cu11==11.7.99
Using cached nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl (21.0 MB)
Collecting typing-extensions
Using cached typing_extensions-4.5.0-py3-none-any.whl (27 kB)
Collecting nvidia-cuda-runtime-cu11==11.7.99
Using cached nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)
Collecting nvidia-cudnn-cu11==8.5.0.96
Using cached nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)
Collecting nvidia-cublas-cu11==11.10.3.66
Using cached nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)
Collecting setuptools
Using cached setuptools-67.6.1-py3-none-any.whl (1.1 MB)
Collecting wheel
Using cached wheel-0.40.0-py3-none-any.whl (64 kB)
vs.
Collecting torch==2.0.0
Using cached torch-2.0.0-cp310-cp310-manylinux1_x86_64.whl (619.9 MB)
Collecting jinja2
Using cached Jinja2-3.1.2-py3-none-any.whl (133 kB)
Collecting nvidia-cusolver-cu11==11.4.0.1
Using cached nvidia_cusolver_cu11-11.4.0.1-2-py3-none-manylinux1_x86_64.whl (102.6 MB)
Collecting nvidia-cuda-runtime-cu11==11.7.99
Using cached nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)
Collecting triton==2.0.0
Using cached triton-2.0.0-1-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (63.3 MB)
Collecting sympy
Using cached sympy-1.11.1-py3-none-any.whl (6.5 MB)
Collecting nvidia-cuda-cupti-cu11==11.7.101
Using cached nvidia_cuda_cupti_cu11-11.7.101-py3-none-manylinux1_x86_64.whl (11.8 MB)
Collecting nvidia-cufft-cu11==10.9.0.58
Using cached nvidia_cufft_cu11-10.9.0.58-py3-none-manylinux1_x86_64.whl (168.4 MB)
Collecting nvidia-curand-cu11==10.2.10.91
Using cached nvidia_curand_cu11-10.2.10.91-py3-none-manylinux1_x86_64.whl (54.6 MB)
Collecting filelock
Using cached filelock-3.10.7-py3-none-any.whl (10 kB)
Collecting networkx
Using cached networkx-3.0-py3-none-any.whl (2.0 MB)
Collecting nvidia-cuda-nvrtc-cu11==11.7.99
Using cached nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl (21.0 MB)
Collecting nvidia-nccl-cu11==2.14.3
Using cached nvidia_nccl_cu11-2.14.3-py3-none-manylinux1_x86_64.whl (177.1 MB)
Collecting nvidia-nvtx-cu11==11.7.91
Using cached nvidia_nvtx_cu11-11.7.91-py3-none-manylinux1_x86_64.whl (98 kB)
Collecting nvidia-cudnn-cu11==8.5.0.96
Using cached nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)
Collecting typing-extensions
Using cached typing_extensions-4.5.0-py3-none-any.whl (27 kB)
Collecting nvidia-cusparse-cu11==11.7.4.91
Using cached nvidia_cusparse_cu11-11.7.4.91-py3-none-manylinux1_x86_64.whl (173.2 MB)
Collecting nvidia-cublas-cu11==11.10.3.66
Using cached nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)
Collecting setuptools
Using cached setuptools-67.6.1-py3-none-any.whl (1.1 MB)
Collecting wheel
Using cached wheel-0.40.0-py3-none-any.whl (64 kB)
Collecting lit
Using cached lit-16.0.0.tar.gz (144 kB)
Preparing metadata (setup.py) ... done
Collecting cmake
Using cached cmake-3.26.1-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (24.0 MB)
Collecting MarkupSafe>=2.0
Using cached MarkupSafe-2.1.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB)
Collecting mpmath>=0.19
Using cached mpmath-1.3.0-py3-none-any.whl (536 kB)
in the actual torch wheel, as more CUDA pip wheel dependencies were added. Is the total size the concern here?
Yeah, the core problem was that when upgrading TorchServe from 0.6.1 to 0.7 we upgraded CUDA to 11 and torch to 1.13, which almost doubled our Docker image size, and one of our customers was concerned about this.
Check out layer 53 for the problem
I also checked Ubuntu package sizes, and CUDA does indeed seem to be a big contributor to this increase (about an 800 MB difference from CUDA 10.2 to 11.7).
I also compared all the pip package size differences (a 1.6 GB difference for torch).
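To make the comparison concrete, here's a quick sketch that totals the cached wheel sizes printed in the two pip logs above (these are download sizes as pip prints them, not installed sizes, which is why the delta here is smaller than the installed-size difference):

```python
# Cached wheel sizes in MB, copied from the pip logs above.
wheels_1_13_1 = {
    "torch": 887.5,
    "nvidia-cuda-nvrtc-cu11": 21.0,
    "typing-extensions": 0.027,
    "nvidia-cuda-runtime-cu11": 0.849,
    "nvidia-cudnn-cu11": 557.1,
    "nvidia-cublas-cu11": 317.1,
    "setuptools": 1.1,
    "wheel": 0.064,
}
wheels_2_0_0 = {
    "torch": 619.9,
    "jinja2": 0.133,
    "nvidia-cusolver-cu11": 102.6,
    "nvidia-cuda-runtime-cu11": 0.849,
    "triton": 63.3,
    "sympy": 6.5,
    "nvidia-cuda-cupti-cu11": 11.8,
    "nvidia-cufft-cu11": 168.4,
    "nvidia-curand-cu11": 54.6,
    "filelock": 0.010,
    "networkx": 2.0,
    "nvidia-cuda-nvrtc-cu11": 21.0,
    "nvidia-nccl-cu11": 177.1,
    "nvidia-nvtx-cu11": 0.098,
    "nvidia-cudnn-cu11": 557.1,
    "typing-extensions": 0.027,
    "nvidia-cusparse-cu11": 173.2,
    "nvidia-cublas-cu11": 317.1,
    "setuptools": 1.1,
    "wheel": 0.064,
    "lit": 0.144,
    "cmake": 24.0,
    "MarkupSafe": 0.025,
    "mpmath": 1.0 - 0.464,  # 536 kB
}

total_old = sum(wheels_1_13_1.values())
total_new = sum(wheels_2_0_0.values())
# The torch wheel itself shrank, but the added CUDA wheels more than
# make up the difference in total download size.
print(f"1.13.1 total download: {total_old:.1f} MB")
print(f"2.0.0  total download: {total_new:.1f} MB")
print(f"delta: {total_new - total_old:+.1f} MB")
```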
I'm not sure exactly how pip computes download sizes, but my script does this: https://gist.github.com/msaroufim/098c0478bd2c629312acaa59b535fa9c#file-download_size_torch-py-L16
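For reference, one way to get a release's download sizes without downloading anything is PyPI's JSON API, which reports a size per file; this is just a sketch of that approach and not necessarily what the gist above does:

```python
import json
import urllib.request


def wheel_sizes(package: str, version: str) -> dict[str, int]:
    """Return {filename: size_in_bytes} for a release via PyPI's JSON API."""
    url = f"https://pypi.org/pypi/{package}/{version}/json"
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    # "urls" lists every distribution file for this release, with its size.
    return {f["filename"]: f["size"] for f in data["urls"]}


def mb(nbytes: int) -> float:
    """Bytes -> MB (decimal, matching how pip prints wheel sizes)."""
    return round(nbytes / 1_000_000, 1)


# Example usage (hits the network):
# sizes = wheel_sizes("torch", "2.0.0")
# print({name: mb(size) for name, size in sizes.items()})
```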
cc @agunapal
As found by this script: https://gist.github.com/msaroufim/098c0478bd2c629312acaa59b535fa9c. Here are the logs: https://gist.github.com/msaroufim/045f4686fdb9fc46571b451e160563a3
cc @msaroufim