microsoft / DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
https://www.deepspeed.ai/
Apache License 2.0
35.12k stars 4.06k forks

[BUG] pip3 install deepspeed==0.14.0 fails in installed_cuda_version() when CUDA is not installed #6636

Open hijeffwu opened 2 hours ago

hijeffwu commented 2 hours ago

Describe the bug

CUDA is not installed on the system, and the machine has no GPU device.

Running pip3 install deepspeed==0.14.0 fails in installed_cuda_version().

To Reproduce Steps to reproduce the behavior:

On a system without CUDA installed, run pip3 install deepspeed==0.14.0.

Expected behavior A clear and concise description of what you expected to happen.

ds_report output

Preparing metadata (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: /usr/bin/python3 -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-1kmwvije/deepspeed_6c20884bfa084655b99473b9abf58814/setup.py'"'"'; file='"'"'/tmp/pip-install-1kmwvije/deepspeed_6c20884bfa084655b99473b9abf58814/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' egginfo --egg-base /tmp/pip-pip-egg-info-2j2fd59
cwd: /tmp/pip-install-1kmwvije/deepspeed_6c20884bfa084655b99473b9abf58814/
Complete output (25 lines):
[2024-10-17 08:56:34,510] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
.......................................
File "", line 1, in
File "/tmp/pip-install-1kmwvije/deepspeed_6c20884bfa084655b99473b9abf58814/setup.py", line 100, in
cuda_major_ver, cuda_minor_ver = installed_cuda_version()
File "/tmp/pip-install-1kmwvije/deepspeed_6c20884bfa084655b99473b9abf58814/op_builder/builder.py", line 50, in installed_cuda_version
raise MissingCUDAException("CUDA_HOME does not exist, unable to compile CUDA op(s)")
op_builder.builder.MissingCUDAException: CUDA_HOME does not exist, unable to compile CUDA op(s)

..............................................................................
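The traceback above stops in installed_cuda_version() because no CUDA_HOME was found. To confirm that state on a machine before running pip, something like the following sketch may help. It assumes the commonly used lookup order (the CUDA_HOME or CUDA_PATH environment variable, then nvcc on PATH, then /usr/local/cuda). The helper name find_cuda_home and that lookup order are assumptions for illustration, not DeepSpeed's actual implementation.

```python
import os
import shutil


def find_cuda_home(environ=None, default_dir="/usr/local/cuda"):
    """Return a likely CUDA toolkit directory, or None if none is found.

    Hypothetical helper: the lookup order below is an assumption and may
    differ from what DeepSpeed or PyTorch actually do.
    """
    env = os.environ if environ is None else environ

    # 1. An explicitly configured toolkit location.
    for var in ("CUDA_HOME", "CUDA_PATH"):
        path = env.get(var)
        if path and os.path.isdir(path):
            return path

    # 2. nvcc on PATH: the toolkit root is two levels above bin/nvcc.
    nvcc = shutil.which("nvcc", path=env.get("PATH", ""))
    if nvcc:
        return os.path.dirname(os.path.dirname(os.path.realpath(nvcc)))

    # 3. The conventional default install directory on Linux.
    if os.path.isdir(default_dir):
        return default_dir

    return None


if __name__ == "__main__":
    home = find_cuda_home()
    if home is None:
        print("No CUDA toolkit found; CUDA_HOME would not resolve on this machine.")
    else:
        print(f"CUDA toolkit candidate: {home}")
```

If this reports that no toolkit was found, the machine matches the environment described in this report.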

Screenshots If applicable, add screenshots to help explain your problem.

System info (please complete the following information):

Launcher context Are you launching your experiment with the deepspeed launcher, MPI, or something else?

Docker context Are you using a specific docker image that you can share?

Additional context Add any other context about the problem here.