flandore closed this issue 10 months ago
Suggestion: use triple ticks rather than single tick for multi line monospaced formatting.
@flandore I'm unable to reproduce this issue on my local system. Based on posts about similar nvcc compilation errors (i.e., error: parameter packs not expanded with ‘...’), it may be resolved by upgrading nvcc/CUDA. The versions I'm using are:
CUDA Version: 11.8
Driver Version: 520.56.06
Cuda compilation tools, release 11.6, V11.6.124
Additionally, I see a mismatch between torch compiled version and installed CUDA version:
Installed CUDA version 11.5 does not match the version torch was compiled with 11.6 but since the APIs are compatible, accepting this combination
This usually isn't a problem, but the output doesn't give any other indication of what the issue could be.
Can you try updating to a more recent version and try again?
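To make the version comparison above concrete, here is a minimal sketch that parses the release number from `nvcc --version` output and compares it with the CUDA version torch was built with. The function names and the result labels ("match", "minor-mismatch", "major-mismatch") are hypothetical and are not DeepSpeed's own check.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Return (major, minor) parsed from `nvcc --version` text, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def parse_cuda_version(version):
    """Turn a string such as '11.6' into (11, 6)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def compare_cuda(nvcc_output, torch_cuda):
    """Classify the relationship between the nvcc release and torch's CUDA version.

    The labels are illustrative only (an assumption of this sketch).
    """
    nvcc = parse_nvcc_release(nvcc_output)
    if nvcc is None:
        return "unknown"
    built = parse_cuda_version(torch_cuda)
    if nvcc == built:
        return "match"
    if nvcc[0] == built[0]:
        return "minor-mismatch"
    return "major-mismatch"
```

On a real machine the inputs would come from `subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout` and `torch.version.cuda`. For the environment reported in this issue (nvcc 11.5, torch built with 11.6), the sketch returns "minor-mismatch".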
@mrwyattii: /usr/bin/cc, /usr/bin/c++. clang or GCC?) I had similar problems when trying to install DeepSpeed.
@augmented-fog using gcc 9.4.0
So, gcc 9.4.0 is your default compiler for C and C++, and you don't have LLVM or clang installed?
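One way to answer the default-compiler question is to look at what `cc --version` prints. The sketch below is a rough heuristic with hypothetical function names: it assumes clang builds mention "clang" in their banner and GCC builds mention "gcc" or the Free Software Foundation copyright line.

```python
import subprocess


def classify_compiler(version_text):
    """Guess the compiler family from `--version` output (heuristic, not exhaustive)."""
    text = version_text.lower()
    # Check clang first: its banner never carries the FSF copyright line.
    if "clang" in text:
        return "clang"
    # On Ubuntu, `cc --version` may print "cc (Ubuntu ...)" without the word "gcc",
    # so the copyright line is used as a second signal.
    if "gcc" in text or "free software foundation" in text:
        return "gcc"
    return "unknown"


def default_compiler(path="/usr/bin/cc"):
    """Run the given compiler binary with --version and classify its output."""
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    return classify_compiler(result.stdout)
```

Calling `default_compiler("/usr/bin/cc")` and `default_compiler("/usr/bin/c++")` covers both binaries mentioned above.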
@flandore and @augmented-fog - are we safe to close this issue now that it is fairly old and issues appear to be resolved?
It may be caused by the file name. Try renaming the file; do not use deepspeed.py.
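Some background on the file-name suggestion: Python puts the script's directory (or the current directory when using `-m`) at the front of `sys.path`, so a local deepspeed.py is imported instead of the installed package. That matches the "'deepspeed' is not a package" message. A minimal sketch for spotting such a file, with a hypothetical helper name:

```python
from pathlib import Path


def find_shadowing_modules(directory, package_name="deepspeed"):
    """Return files in `directory` that would shadow the installed package on import."""
    directory = Path(directory)
    # A single-file module named like the package takes precedence when its
    # directory comes first on sys.path.
    candidates = [directory / f"{package_name}.py"]
    return [path for path in candidates if path.is_file()]
```

Running `find_shadowing_modules(Path.cwd())` in the directory you launch from returns an empty list when nothing shadows the package. Any path it returns is a file worth renaming.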
hai
If I run pip install deepspeed, the install succeeds. When I run my code, the code itself works, but the DeepSpeed efficiencies, etc. don't seem to take effect. I also get this message at the end:
/home/ub_flan/bert/bert/bin/python3: Error while finding module specification for 'deepspeed.launcher.launch' (ModuleNotFoundError: No module named 'deepspeed.launcher'; 'deepspeed' is not a package)
I assume this is happening because the "installed" field is set to [NO]. I added DS_BUILD_OPS=1 to make the installed field [YES], but DS_BUILD_OPS=1 gives a lot of errors.
This is the (DS_BUILD_OPS=1 pip install deepspeed) error.
This is my env:
$ ds_report (run after the plain pip install deepspeed)
DeepSpeed C++/CUDA extension op report
NOTE: Ops not installed will be just-in-time (JIT) compiled at runtime if needed. Op compatibility means that your system meet the required dependencies to JIT install the op.
JIT compiled ops requires ninja
ninja .................. [OKAY]
op name ................ installed .. compatible
cpu_adam ............... [NO] ....... [OKAY]
cpu_adagrad ............ [NO] ....... [OKAY]
fused_adam ............. [NO] ....... [OKAY]
fused_lamb ............. [NO] ....... [OKAY]
sparse_attn ............ [NO] ....... [OKAY]
transformer ............ [NO] ....... [OKAY]
stochastic_transformer . [NO] ....... [OKAY]
async_io ............... [NO] ....... [OKAY]
utils .................. [NO] ....... [OKAY]
quantizer .............. [NO] ....... [OKAY]
transformer_inference .. [NO] ....... [OKAY]
spatial_inference ...... [NO] ....... [OKAY]
DeepSpeed general environment info:
torch install path ............... ['/home/ub_flan/bert/bert/lib/python3.10/site-packages/torch']
torch version .................... 1.13.1+cu116
torch cuda version ............... 11.6
torch hip version ................ None
nvcc version ..................... 11.5
deepspeed install path ........... ['/home/ub_flan/bert/bert/lib/python3.10/site-packages/deepspeed']
deepspeed info ................... 0.7.7, unknown, unknown
deepspeed wheel compiled w. ...... torch 1.13, cuda 11.6
OS: Ubuntu
$ nvidia-smi
Mon Jan 16 12:21:56 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 516.94       CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:08:00.0  On |                  N/A |
| 40%   29C    P8    24W / 250W |    586MiB / 11264MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+