Closed: brucethemoose closed this issue 1 year ago
What accelerate version are you using here? I just tried to reproduce this with accelerate==0.15.0 + aef8a85 and it doesn't seem to trigger an error.
ds_report output:
DeepSpeed general environment info:
torch install path ............... ['/home/jerasley/base/lib/python3.8/site-packages/torch']
torch version .................... 1.13.1+cu116
deepspeed install path ........... ['/home/jerasley/base/lib/python3.8/site-packages/deepspeed']
deepspeed info ................... 0.8.0, aef8a856, master
torch cuda version ............... 11.6
torch hip version ................ None
nvcc version ..................... 11.7
deepspeed wheel compiled w. ...... torch 1.13, cuda 11.6
pip show accelerate transformers
Name: accelerate
Version: 0.15.0
Summary: Accelerate
Home-page: https://github.com/huggingface/accelerate
Author: The HuggingFace team
Author-email: sylvain@huggingface.co
License: Apache
Location: /home/jerasley/base/lib/python3.8/site-packages
Requires: numpy, packaging, psutil, pyyaml, torch
Required-by:
---
Name: transformers
Version: 4.25.1
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: transformers@huggingface.co
License: Apache
Location: /home/jerasley/base/lib/python3.8/site-packages
Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, tokenizers, tqdm
Required-by: deepspeed-mii, mii
Here's my output of accelerate config:
@jeffra
tmp/DeepSpeed master
❯ ds_report
Traceback (most recent call last):
File "/home/alpha/.local/bin/ds_report", line 3, in <module>
from deepspeed.env_report import cli_main
File "/home/alpha/.local/lib/python3.10/site-packages/deepspeed/__init__.py", line 14, in <module>
from . import ops
File "/home/alpha/.local/lib/python3.10/site-packages/deepspeed/ops/__init__.py", line 1, in <module>
from . import adam
File "/home/alpha/.local/lib/python3.10/site-packages/deepspeed/ops/adam/__init__.py", line 2, in <module>
from .fused_adam import FusedAdam
File "/home/alpha/.local/lib/python3.10/site-packages/deepspeed/ops/adam/fused_adam.py", line 13, in <module>
from deepspeed.ops.op_builder.builder_names import FusedAdamBuilder
ImportError: cannot import name 'FusedAdamBuilder' from 'deepspeed.ops.op_builder.builder_names' (/home/alpha/.local/lib/python3.10/site-packages/deepspeed/ops/op_builder/builder_names.py)
/tmp/DeepSpeed master
❯ pip show accelerate transformers
WARNING: Ignoring invalid distribution -orch (/home/alpha/.local/lib/python3.10/site-packages)
Name: accelerate
Version: 0.15.0
Summary: Accelerate
Home-page: https://github.com/huggingface/accelerate
Author: The HuggingFace team
Author-email: sylvain@huggingface.co
License: Apache
Location: /home/alpha/.local/lib/python3.10/site-packages
Requires: numpy, packaging, psutil, pyyaml, torch
Required-by: k-diffusion
---
Name: transformers
Version: 4.25.1
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: transformers@huggingface.co
License: Apache
Location: /home/alpha/.local/lib/python3.10/site-packages
Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, tokenizers, tqdm
Required-by:
/tmp/DeepSpeed master
❯ accelerate config
2023-01-12 19:16:24.568398: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
File "/home/alpha/.local/bin/accelerate", line 5, in <module>
from accelerate.commands.accelerate_cli import main
File "/home/alpha/.local/lib/python3.10/site-packages/accelerate/__init__.py", line 7, in <module>
from .accelerator import Accelerator
File "/home/alpha/.local/lib/python3.10/site-packages/accelerate/accelerator.py", line 27, in <module>
from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
File "/home/alpha/.local/lib/python3.10/site-packages/accelerate/checkpointing.py", line 24, in <module>
from .utils import (
File "/home/alpha/.local/lib/python3.10/site-packages/accelerate/utils/__init__.py", line 122, in <module>
from .other import (
File "/home/alpha/.local/lib/python3.10/site-packages/accelerate/utils/other.py", line 27, in <module>
from deepspeed import DeepSpeedEngine
File "/home/alpha/.local/lib/python3.10/site-packages/deepspeed/__init__.py", line 14, in <module>
from . import ops
File "/home/alpha/.local/lib/python3.10/site-packages/deepspeed/ops/__init__.py", line 1, in <module>
from . import adam
File "/home/alpha/.local/lib/python3.10/site-packages/deepspeed/ops/adam/__init__.py", line 2, in <module>
from .fused_adam import FusedAdam
File "/home/alpha/.local/lib/python3.10/site-packages/deepspeed/ops/adam/fused_adam.py", line 13, in <module>
from deepspeed.ops.op_builder.builder_names import FusedAdamBuilder
ImportError: cannot import name 'FusedAdamBuilder' from 'deepspeed.ops.op_builder.builder_names' (/home/alpha/.local/lib/python3.10/site-packages/deepspeed/ops/op_builder/builder_names.py)
Running Python 3.10.9, here are my packages:
What I am testing on locally:
alpha@Asus-GA401IV
------------------
OS: CachyOS Linux x86_64
Host: ROG Zephyrus G14 GA401IV_GA401IV (1.0)
Kernel: 6.1.4-1-cachyos-lto
Uptime: 5 hours, 21 mins
Packages: 1250 (pacman)
Shell: fish 3.5.1
Resolution: 3840x2160 @ 60Hz
DE: KDE Plasma 5.26.5
WM: KWin (Wayland)
WM Theme: Breeze
Theme: Lightly (CachyOSNord) [QT], cachyos-nor]
Icons: breeze-dark [QT], breeze-dark [GTK2/3/4]
Font: Noto Sans (10pt) [QT], Noto Sans (10pt) ]
Cursor: capitaine (24px)
Terminal: alacritty
Terminal Font: monospace (12pt)
CPU: AMD Ryzen 9 4900HS (16) @ 3 GHz
GPU: AMD Renoir
GPU: NVIDIA GeForce RTX 2060 Max-Q
Memory: 8.22 GiB / 15.05 GiB (54%)
Disk (/): 110 GiB / 139 GiB (78%)
Disk (/home/alpha/Storage): 310 GiB / 344 GiB )
Disk (/run/media/alpha/External): 140 GiB / 93]
Disk (/windows): 296 GiB / 434 GiB (68%) [Remo]
Battery: 100% [Not charging]
Locale: en_US.UTF-8
(And that tensorflow warning is just a quirk of the native Arch Linux package)
And the output of a fresh installation attempt:
tmp
❯ git clone https://github.com/microsoft/DeepSpeed
Cloning into 'DeepSpeed'...
remote: Enumerating objects: 28589, done.
remote: Counting objects: 100% (421/421), done.
remote: Compressing objects: 100% (256/256), done.
remote: Total 28589 (delta 248), reused 295 (delta 165), pack-reused 28168
Receiving objects: 100% (28589/28589), 33.57 MiB | 11.31 MiB/s, done.
Resolving deltas: 100% (20499/20499), done.
/tmp
❯ cd DeepSpeed/
/tmp/DeepSpeed master
❯ pip install .
Defaulting to user installation because normal site-packages is not writeable
WARNING: Ignoring invalid distribution -orch (/home/alpha/.local/lib/python3.10/site-packages)
WARNING: Ignoring invalid distribution -orch (/home/alpha/.local/lib/python3.10/site-packages)
Processing /tmp/DeepSpeed
Preparing metadata (setup.py) ... done
Requirement already satisfied: hjson in /home/alpha/.local/lib/python3.10/site-packages (from deepspeed==0.8.0+aef8a856) (3.1.0)
Requirement already satisfied: ninja in /home/alpha/.local/lib/python3.10/site-packages (from deepspeed==0.8.0+aef8a856) (1.11.1)
Requirement already satisfied: numpy in /home/alpha/.local/lib/python3.10/site-packages (from deepspeed==0.8.0+aef8a856) (1.23.5)
Requirement already satisfied: packaging in /home/alpha/.local/lib/python3.10/site-packages (from deepspeed==0.8.0+aef8a856) (22.0)
Requirement already satisfied: psutil in /usr/lib/python3.10/site-packages (from deepspeed==0.8.0+aef8a856) (5.9.4)
Requirement already satisfied: py-cpuinfo in /home/alpha/.local/lib/python3.10/site-packages (from deepspeed==0.8.0+aef8a856) (9.0.0)
Requirement already satisfied: pydantic in /usr/lib/python3.10/site-packages (from deepspeed==0.8.0+aef8a856) (1.10.4)
Requirement already satisfied: torch in /home/alpha/.local/lib/python3.10/site-packages (from deepspeed==0.8.0+aef8a856) (2.0.0.dev20230112+cu118)
Requirement already satisfied: tqdm in /home/alpha/.local/lib/python3.10/site-packages (from deepspeed==0.8.0+aef8a856) (4.64.1)
Requirement already satisfied: typing-extensions>=4.2.0 in /home/alpha/.local/lib/python3.10/site-packages (from pydantic->deepspeed==0.8.0+aef8a856) (4.4.0)
Requirement already satisfied: networkx in /home/alpha/.local/lib/python3.10/site-packages (from torch->deepspeed==0.8.0+aef8a856) (3.0rc1)
Requirement already satisfied: pytorch-triton==2.0.0+0d7e753227 in /home/alpha/.local/lib/python3.10/site-packages (from torch->deepspeed==0.8.0+aef8a856) (2.0.0+0d7e753227)
Requirement already satisfied: sympy in /home/alpha/.local/lib/python3.10/site-packages (from torch->deepspeed==0.8.0+aef8a856) (1.11.1)
Requirement already satisfied: filelock in /home/alpha/.local/lib/python3.10/site-packages (from pytorch-triton==2.0.0+0d7e753227->torch->deepspeed==0.8.0+aef8a856) (3.9.0)
Requirement already satisfied: cmake in /home/alpha/.local/lib/python3.10/site-packages (from pytorch-triton==2.0.0+0d7e753227->torch->deepspeed==0.8.0+aef8a856) (3.25.0)
Requirement already satisfied: mpmath>=0.19 in /home/alpha/.local/lib/python3.10/site-packages (from sympy->torch->deepspeed==0.8.0+aef8a856) (1.2.1)
Building wheels for collected packages: deepspeed
Building wheel for deepspeed (setup.py) ... done
Created wheel for deepspeed: filename=deepspeed-0.8.0+aef8a856-py3-none-any.whl size=760411 sha256=716e9a79dd19196bd60eb5395673584ce1e8d363ad330e9d69a06f6ef65e6f89
Stored in directory: /tmp/pip-ephem-wheel-cache-oakoedrq/wheels/a2/ea/d8/a0a5ae4cd2516d6554e52b680c03214c5c4359a78b8309f8f1
Successfully built deepspeed
WARNING: Ignoring invalid distribution -orch (/home/alpha/.local/lib/python3.10/site-packages)
Installing collected packages: deepspeed
WARNING: Ignoring invalid distribution -orch (/home/alpha/.local/lib/python3.10/site-packages)
Successfully installed deepspeed-0.8.0+aef8a856
WARNING: Ignoring invalid distribution -orch (/home/alpha/.local/lib/python3.10/site-packages)
WARNING: Ignoring invalid distribution -orch (/home/alpha/.local/lib/python3.10/site-packages)
WARNING: Ignoring invalid distribution -orch (/home/alpha/.local/lib/python3.10/site-packages)
/tmp/DeepSpeed master 7s
❯ ds_report
Traceback (most recent call last):
File "/home/alpha/.local/bin/ds_report", line 3, in <module>
from deepspeed.env_report import cli_main
File "/home/alpha/.local/lib/python3.10/site-packages/deepspeed/__init__.py", line 14, in <module>
from . import ops
File "/home/alpha/.local/lib/python3.10/site-packages/deepspeed/ops/__init__.py", line 1, in <module>
from . import adam
File "/home/alpha/.local/lib/python3.10/site-packages/deepspeed/ops/adam/__init__.py", line 2, in <module>
from .fused_adam import FusedAdam
File "/home/alpha/.local/lib/python3.10/site-packages/deepspeed/ops/adam/fused_adam.py", line 13, in <module>
from deepspeed.ops.op_builder.builder_names import FusedAdamBuilder
ImportError: cannot import name 'FusedAdamBuilder' from 'deepspeed.ops.op_builder.builder_names' (/home/alpha/.local/lib/python3.10/site-packages/deepspeed/ops/op_builder/builder_names.py)
Also this does appear to be some kind of regression, as the release build of deepspeed initializes without any errors.
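Since the ImportError complains that builder_names.py does not provide FusedAdamBuilder, one way to narrow this down is to look at what the installed file binds at module level without importing deepspeed at all. The following is only a rough sketch: the helper names top_level_names and file_defines are made up for illustration, and the check only sees plain top-level statements, so anything bound inside a try/if block or created dynamically would be missed.

```python
import ast


def top_level_names(source):
    """Collect names bound by plain top-level statements in Python source."""
    names = set()
    for node in ast.parse(source).body:
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            names.add(node.name)
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds "a"; "import a.b as c" binds "c".
                names.add(alias.asname or alias.name.split(".")[0])
    return names


def file_defines(path, name):
    """Return True if the file at `path` binds `name` at module level."""
    with open(path, encoding="utf-8") as handle:
        return name in top_level_names(handle.read())
```

Calling file_defines with the builder_names.py path from the traceback and "FusedAdamBuilder" would show whether the file on disk matches what fused_adam.py expects. If it returns False, a stale or mixed installation is one possible explanation, though not the only one.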
I am also trying to test this on Windows on the same machine, but am having some trouble with the dependencies (ninja, I would assume?):
C:\Users\Alpha\scratch\DeepSpeed>pip install .
Processing c:\users\alpha\scratch\deepspeed
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [13 lines of output]
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "C:\Users\Alpha\scratch\DeepSpeed\setup.py", line 156, in <module>
abort(f"Unable to pre-compile {op_name}")
File "C:\Users\Alpha\scratch\DeepSpeed\setup.py", line 48, in abort
assert False, msg
AssertionError: Unable to pre-compile async_io
DS_BUILD_OPS=1
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
[WARNING] One can disable async_io with DS_BUILD_AIO=0
[ERROR] Unable to pre-compile async_io
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
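The warning in the log above suggests disabling async_io with DS_BUILD_AIO=0. A minimal sketch of passing that variable to pip from Python could look like the following. The helper names build_env, pip_install_command and run_install are hypothetical, and this only illustrates the hint from the log: it is not known whether the build would then succeed on Windows.

```python
import os
import subprocess
import sys


def build_env(overrides, base=None):
    """Copy the base environment (os.environ by default) and apply overrides."""
    env = dict(os.environ if base is None else base)
    env.update(overrides)
    return env


def pip_install_command(source_dir):
    """Build a pip invocation that uses the current interpreter."""
    return [sys.executable, "-m", "pip", "install", source_dir]


def run_install(source_dir):
    """Run pip install with DS_BUILD_AIO=0, as the warning in the log suggests."""
    env = build_env({"DS_BUILD_AIO": "0"})
    return subprocess.run(pip_install_command(source_dir), env=env, check=False)
```

Calling run_install(".") from the DeepSpeed checkout would be the equivalent of setting the variable in the shell before running pip install.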
Hmmm, I just hit this as well with pip install deepspeed, but pip install deepspeed==0.7.7 worked.
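Because importing deepspeed itself is what fails, it can help to read the installed versions from package metadata instead of from the package. A small sketch, assuming Python 3.8 or newer for importlib.metadata; the helper name installed_version is made up for illustration:

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names taken from the thread; none of them is imported here.
    for name in ("deepspeed", "accelerate", "transformers", "torch"):
        print(name, installed_version(name))
```

This reads only the metadata that pip recorded, so it still works when the package cannot be imported.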
Hello @brucethemoose and @Data-drone. Thank you for reporting the error. I tried both aef8a85 and the current PyPI version of deepspeed, but I could not reproduce the error with either.
Could you try to cd to a random directory other than the DeepSpeed directory and see if the error is still there?
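The reason for asking is that a deepspeed folder in the working directory can shadow the installed package. A quick way to see which copy Python would pick up, without importing it, might look like this. It is a sketch only: the helper names locate_package and shadowed_by_cwd are made up for illustration, and what ends up on sys.path depends on how the interpreter is launched.

```python
import importlib.util
import os


def locate_package(name):
    """Return the file a top-level package would load from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.has_location:
        return None
    return spec.origin


def shadowed_by_cwd(name):
    """True if the package resolves to a path under the current directory."""
    origin = locate_package(name)
    if origin is None:
        return False
    cwd = os.path.realpath(os.getcwd())
    return os.path.realpath(origin).startswith(cwd + os.sep)


if __name__ == "__main__":
    print("deepspeed resolves to:", locate_package("deepspeed"))
    print("shadowed by current directory:", shadowed_by_cwd("deepspeed"))
```

Running this once inside the DeepSpeed checkout and once from another directory shows whether the two locations differ.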
@delock Do you have any hint of what might go wrong here?
@brucethemoose For the error you see when installing on Windows, that is expected as we don't support Windows for now.
To follow up, this should have been fixed by https://github.com/microsoft/DeepSpeed/pull/2677/commits/b587c7e85470329ac25df7c7c2521ff9b2833db7. If you still see this problem with the latest version of deepspeed, please feel free to reopen this issue.
The latest git version of deepspeed (aef8a85) builds and installs just fine, but trying to use it in pretty much anything results in an import error. For instance:
This seems to go back a few commits.
And why build from source? Well, I am trying to figure out why accelerate's deepspeed stage 2 config is not working on the official (0.7.7) release in this repo: https://github.com/kohya-ss/sd-scripts/issues/63, and I was hoping the latest commit might fix something.