DeepWok / mase

Machine-Learning Accelerator System Exploration Tools

Conda env not installing #16

Closed WillPowellUk closed 6 months ago

WillPowellUk commented 6 months ago

Using Anaconda3-2022.10 (or later) and Python Version 3.10.6 as required, the script scripts/init-conda.sh fails to install deepspeed:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Installing pip dependencies: \ Ran pip subprocess with arguments:
['/home/wfp23/anaconda3/envs/mase/bin/python', '-m', 'pip', 'install', '-U', '-r', '/home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt']
Pip subprocess output:
Collecting lightning (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 1))
  Using cached lightning-2.1.3-py3-none-any.whl.metadata (56 kB)
Collecting transformers (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 2))
  Using cached transformers-4.37.0-py3-none-any.whl.metadata (129 kB)
Collecting diffusers (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 3))
  Using cached diffusers-0.25.1-py3-none-any.whl.metadata (19 kB)
Collecting accelerate (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 4))
  Using cached accelerate-0.26.1-py3-none-any.whl.metadata (18 kB)
Collecting toml (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 5))
  Using cached toml-0.10.2-py2.py3-none-any.whl (16 kB)
Collecting timm (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 6))
  Using cached timm-0.9.12-py3-none-any.whl.metadata (60 kB)
Collecting pytorch-nlp (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 7))
  Using cached pytorch_nlp-0.5.0-py3-none-any.whl (90 kB)
Collecting datasets (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 8))
  Using cached datasets-2.16.1-py3-none-any.whl.metadata (20 kB)
Collecting onnx (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 9))
  Using cached onnx-1.15.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (15 kB)
Collecting onnxruntime (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 10))
  Using cached onnxruntime-1.16.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.3 kB)
Collecting optimum (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 11))
  Using cached optimum-1.16.2-py3-none-any.whl.metadata (17 kB)
Collecting black (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 12))
  Using cached black-23.12.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (68 kB)
Collecting GitPython (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 13))
  Using cached GitPython-3.1.41-py3-none-any.whl.metadata (14 kB)
Collecting colorlog (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 14))
  Using cached colorlog-6.8.0-py3-none-any.whl.metadata (10 kB)
Collecting cocotb==1.8.0 (from cocotb[bus]==1.8.0->-r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 15))
  Using cached cocotb-1.8.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.8 kB)
Collecting pytest (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 16))
  Using cached pytest-7.4.4-py3-none-any.whl.metadata (7.9 kB)
Collecting pytest-cov (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 17))
  Using cached pytest_cov-4.1.0-py3-none-any.whl.metadata (26 kB)
Collecting pytest-xdist (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 18))
  Using cached pytest_xdist-3.5.0-py3-none-any.whl.metadata (3.1 kB)
Collecting pytest-sugar (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 19))
  Using cached pytest_sugar-0.9.7-py2.py3-none-any.whl (10 kB)
Collecting pytest-html (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 20))
  Using cached pytest_html-4.1.1-py3-none-any.whl.metadata (3.9 kB)
Collecting pytest-profiling (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 21))
  Using cached pytest_profiling-1.7.0-py2.py3-none-any.whl (8.3 kB)
Collecting ipython (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 22))
  Using cached ipython-8.20.0-py3-none-any.whl.metadata (5.9 kB)
Collecting ipdb (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 23))
  Using cached ipdb-0.13.13-py3-none-any.whl (12 kB)
Collecting sentencepiece (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 24))
  Using cached sentencepiece-0.1.99-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
Collecting einops (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 25))
  Using cached einops-0.7.0-py3-none-any.whl.metadata (13 kB)
Collecting deepspeed==0.3.5 (from -r /home/wfp23/ADL/mase/machop/condaenv.pza5spem.requirements.txt (line 26))
  Downloading deepspeed-0.3.5.tar.gz (207 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 207.5/207.5 kB 26.0 MB/s eta 0:00:00
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'error'

Pip subprocess error:
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [13 lines of output]
      Traceback (most recent call last):
        File "/tmp/pip-install-4v1g1vze/deepspeed_f57f4dccc17344fead82d595813905fb/setup.py", line 18, in <module>
          import torch
      ModuleNotFoundError: No module named 'torch'

      During handling of the above exception, another exception occurred:

      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-4v1g1vze/deepspeed_f57f4dccc17344fead82d595813905fb/setup.py", line 21, in <module>
          raise ImportError('Unable to import torch, please visit https://pytorch.org/ '
      ImportError: Unable to import torch, please visit https://pytorch.org/ to see how to properly install torch on your system.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

failed

CondaEnvException: Pip failed

I have tried manually installing torch with both pip and conda, but neither resolves the problem.

Aaron-Zhao123 commented 6 months ago

Have you tried activating the correct conda env?

conda activate mase

Then install PyTorch locally, following this guide: https://pytorch.org/get-started/locally/

Then verify the installation first:

which python
# make sure the printout is associated with the correct python
# /Users/aaron/.conda/envs/mase/bin/python
python -c "import torch; print(torch.__version__)"

Then rerun the pip-related installs?
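
The checks above could be bundled into a small helper, sketched below. The function names `env_name_of` and `check_env` are hypothetical (they are not part of mase), and the sketch assumes the usual conda layout in which an env's interpreter lives under `.../envs/<name>/bin/`.

```shell
#!/usr/bin/env bash
# Sketch: confirm the active python belongs to the expected conda env and
# that torch imports from it. Helper names are illustrative, not from mase.

# Print the conda env name embedded in an interpreter path such as
# /home/user/anaconda3/envs/mase/bin/python; print nothing if there is none.
env_name_of() {
  case "$1" in
    */envs/*/bin/*)
      local rest="${1#*/envs/}"   # drop everything up to and including "envs/"
      echo "${rest%%/*}"          # keep only the first remaining path component
      ;;
  esac
}

# Fail with a hint if python is not from the expected env or torch is missing.
check_env() {
  local expected="$1" py
  py="$(command -v python || true)"
  if [ "$(env_name_of "$py")" != "$expected" ]; then
    echo "python is '$py', not from env '$expected'; try: conda activate $expected"
    return 1
  fi
  python -c "import torch; print(torch.__version__)" \
    || { echo "torch is not importable from $py"; return 1; }
}
```

After sourcing the file, `check_env mase` should print the torch version if the env is set up, or a hint about which step is missing.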

jianyicheng commented 6 months ago

Hi @WillPowellUk. Could you provide your commands for reproducing the error? Also, could you make sure you have pulled the latest commit?

ChengZhang-98 commented 6 months ago

Could you try pulling the latest changes, removing the old mase env, and reinstalling? The error should be fixed in PR #20.
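
A minimal sketch of that sequence, assuming it is run from the repository root with the mase env deactivated, and that `scripts/init-conda.sh` (the script named in the original report) can be invoked with bash. The `run` and `rebuild_mase_env` helpers and the `DRY_RUN` variable are hypothetical conveniences; by default the commands are only printed.

```shell
#!/usr/bin/env bash
# Sketch: refresh the checkout and rebuild the "mase" conda env from scratch.
# Assumptions: current directory is the repo root; the env is named "mase";
# the env is not active (conda cannot remove the env you are currently in).

# Print each command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

rebuild_mase_env() {
  run git pull                      # pick up the fix mentioned above
  run conda env remove --name mase  # drop the half-installed env
  run bash scripts/init-conda.sh    # recreate it with the updated setup
}

# Dry run by default: this only lists the commands that would be executed.
rebuild_mase_env
```

To actually execute the steps, call `DRY_RUN=0 rebuild_mase_env` after reviewing the printed commands.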

WillPowellUk commented 6 months ago

Thank you @ChengZhang-98 and @jianyicheng! Your changes have solved the issue!