Open trphoenix opened 1 year ago
`(cpmbee) aiuser@aiuser-virtual-machine:~/worker/CPM-Bee/src$ pip install bmtrain --no-cache-dir
Collecting bmtrain
Downloading bmtrain-0.2.2.tar.gz (58 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.7/58.7 kB 131.8 kB/s eta 0:00:00
Preparing metadata (setup.py) ... done
Requirement already satisfied: numpy in /home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages (from bmtrain) (1.24.1)
Building wheels for collected packages: bmtrain
Building wheel for bmtrain (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [100 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-cpython-310
creating build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/init.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/param_init.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/global_var.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/synchronize.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/parameter.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/store.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/checkpointing.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/wrapper.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/layer.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/debug.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/utils.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/block_layer.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/__init__.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/pipe_layer.py -> build/lib.linux-x86_64-cpython-310/bmtrain
creating build/lib.linux-x86_64-cpython-310/bmtrain/optim
copying bmtrain/optim/optim_manager.py -> build/lib.linux-x86_64-cpython-310/bmtrain/optim
copying bmtrain/optim/adam.py -> build/lib.linux-x86_64-cpython-310/bmtrain/optim
copying bmtrain/optim/__init__.py -> build/lib.linux-x86_64-cpython-310/bmtrain/optim
copying bmtrain/optim/adam_offload.py -> build/lib.linux-x86_64-cpython-310/bmtrain/optim
creating build/lib.linux-x86_64-cpython-310/bmtrain/nccl
copying bmtrain/nccl/enums.py -> build/lib.linux-x86_64-cpython-310/bmtrain/nccl
copying bmtrain/nccl/__init__.py -> build/lib.linux-x86_64-cpython-310/bmtrain/nccl
creating build/lib.linux-x86_64-cpython-310/bmtrain/distributed
copying bmtrain/distributed/ops.py -> build/lib.linux-x86_64-cpython-310/bmtrain/distributed
copying bmtrain/distributed/__init__.py -> build/lib.linux-x86_64-cpython-310/bmtrain/distributed
creating build/lib.linux-x86_64-cpython-310/bmtrain/inspect
copying bmtrain/inspect/format.py -> build/lib.linux-x86_64-cpython-310/bmtrain/inspect
copying bmtrain/inspect/model.py -> build/lib.linux-x86_64-cpython-310/bmtrain/inspect
copying bmtrain/inspect/tensor.py -> build/lib.linux-x86_64-cpython-310/bmtrain/inspect
copying bmtrain/inspect/__init__.py -> build/lib.linux-x86_64-cpython-310/bmtrain/inspect
creating build/lib.linux-x86_64-cpython-310/bmtrain/loss
copying bmtrain/loss/cross_entropy.py -> build/lib.linux-x86_64-cpython-310/bmtrain/loss
copying bmtrain/loss/__init__.py -> build/lib.linux-x86_64-cpython-310/bmtrain/loss
creating build/lib.linux-x86_64-cpython-310/bmtrain/benchmark
copying bmtrain/benchmark/all_gather.py -> build/lib.linux-x86_64-cpython-310/bmtrain/benchmark
copying bmtrain/benchmark/send_recv.py -> build/lib.linux-x86_64-cpython-310/bmtrain/benchmark
copying bmtrain/benchmark/reduce_scatter.py -> build/lib.linux-x86_64-cpython-310/bmtrain/benchmark
copying bmtrain/benchmark/shape.py -> build/lib.linux-x86_64-cpython-310/bmtrain/benchmark
copying bmtrain/benchmark/utils.py -> build/lib.linux-x86_64-cpython-310/bmtrain/benchmark
copying bmtrain/benchmark/__init__.py -> build/lib.linux-x86_64-cpython-310/bmtrain/benchmark
creating build/lib.linux-x86_64-cpython-310/bmtrain/lr_scheduler
copying bmtrain/lr_scheduler/exponential.py -> build/lib.linux-x86_64-cpython-310/bmtrain/lr_scheduler
copying bmtrain/lr_scheduler/linear.py -> build/lib.linux-x86_64-cpython-310/bmtrain/lr_scheduler
copying bmtrain/lr_scheduler/no_decay.py -> build/lib.linux-x86_64-cpython-310/bmtrain/lr_scheduler
copying bmtrain/lr_scheduler/noam.py -> build/lib.linux-x86_64-cpython-310/bmtrain/lr_scheduler
copying bmtrain/lr_scheduler/warmup.py -> build/lib.linux-x86_64-cpython-310/bmtrain/lr_scheduler
copying bmtrain/lr_scheduler/cosine.py -> build/lib.linux-x86_64-cpython-310/bmtrain/lr_scheduler
copying bmtrain/lr_scheduler/__init__.py -> build/lib.linux-x86_64-cpython-310/bmtrain/lr_scheduler
running build_ext
Traceback (most recent call last):
File "
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for bmtrain
Running setup.py clean for bmtrain
Failed to build bmtrain
Installing collected packages: bmtrain
Running setup.py install for bmtrain ... error
error: subprocess-exited-with-error
× Running setup.py install for bmtrain did not run successfully.
│ exit code: 1
╰─> [115 lines of output]
running install
/home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!
********************************************************************************
Please avoid running ``setup.py`` directly.
Instead, use pypa/build, pypa/installer, pypa/build or
other standards-based tools.
See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
********************************************************************************
!!
self.initialize_options()
running build
running build_py
creating build
creating build/lib.linux-x86_64-cpython-310
creating build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/init.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/param_init.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/global_var.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/synchronize.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/parameter.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/store.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/checkpointing.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/wrapper.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/layer.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/debug.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/utils.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/block_layer.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/__init__.py -> build/lib.linux-x86_64-cpython-310/bmtrain
copying bmtrain/pipe_layer.py -> build/lib.linux-x86_64-cpython-310/bmtrain
creating build/lib.linux-x86_64-cpython-310/bmtrain/optim
copying bmtrain/optim/optim_manager.py -> build/lib.linux-x86_64-cpython-310/bmtrain/optim
copying bmtrain/optim/adam.py -> build/lib.linux-x86_64-cpython-310/bmtrain/optim
copying bmtrain/optim/__init__.py -> build/lib.linux-x86_64-cpython-310/bmtrain/optim
copying bmtrain/optim/adam_offload.py -> build/lib.linux-x86_64-cpython-310/bmtrain/optim
creating build/lib.linux-x86_64-cpython-310/bmtrain/nccl
copying bmtrain/nccl/enums.py -> build/lib.linux-x86_64-cpython-310/bmtrain/nccl
copying bmtrain/nccl/__init__.py -> build/lib.linux-x86_64-cpython-310/bmtrain/nccl
creating build/lib.linux-x86_64-cpython-310/bmtrain/distributed
copying bmtrain/distributed/ops.py -> build/lib.linux-x86_64-cpython-310/bmtrain/distributed
copying bmtrain/distributed/__init__.py -> build/lib.linux-x86_64-cpython-310/bmtrain/distributed
creating build/lib.linux-x86_64-cpython-310/bmtrain/inspect
copying bmtrain/inspect/format.py -> build/lib.linux-x86_64-cpython-310/bmtrain/inspect
copying bmtrain/inspect/model.py -> build/lib.linux-x86_64-cpython-310/bmtrain/inspect
copying bmtrain/inspect/tensor.py -> build/lib.linux-x86_64-cpython-310/bmtrain/inspect
copying bmtrain/inspect/__init__.py -> build/lib.linux-x86_64-cpython-310/bmtrain/inspect
creating build/lib.linux-x86_64-cpython-310/bmtrain/loss
copying bmtrain/loss/cross_entropy.py -> build/lib.linux-x86_64-cpython-310/bmtrain/loss
copying bmtrain/loss/__init__.py -> build/lib.linux-x86_64-cpython-310/bmtrain/loss
creating build/lib.linux-x86_64-cpython-310/bmtrain/benchmark
copying bmtrain/benchmark/all_gather.py -> build/lib.linux-x86_64-cpython-310/bmtrain/benchmark
copying bmtrain/benchmark/send_recv.py -> build/lib.linux-x86_64-cpython-310/bmtrain/benchmark
copying bmtrain/benchmark/reduce_scatter.py -> build/lib.linux-x86_64-cpython-310/bmtrain/benchmark
copying bmtrain/benchmark/shape.py -> build/lib.linux-x86_64-cpython-310/bmtrain/benchmark
copying bmtrain/benchmark/utils.py -> build/lib.linux-x86_64-cpython-310/bmtrain/benchmark
copying bmtrain/benchmark/__init__.py -> build/lib.linux-x86_64-cpython-310/bmtrain/benchmark
creating build/lib.linux-x86_64-cpython-310/bmtrain/lr_scheduler
copying bmtrain/lr_scheduler/exponential.py -> build/lib.linux-x86_64-cpython-310/bmtrain/lr_scheduler
copying bmtrain/lr_scheduler/linear.py -> build/lib.linux-x86_64-cpython-310/bmtrain/lr_scheduler
copying bmtrain/lr_scheduler/no_decay.py -> build/lib.linux-x86_64-cpython-310/bmtrain/lr_scheduler
copying bmtrain/lr_scheduler/noam.py -> build/lib.linux-x86_64-cpython-310/bmtrain/lr_scheduler
copying bmtrain/lr_scheduler/warmup.py -> build/lib.linux-x86_64-cpython-310/bmtrain/lr_scheduler
copying bmtrain/lr_scheduler/cosine.py -> build/lib.linux-x86_64-cpython-310/bmtrain/lr_scheduler
copying bmtrain/lr_scheduler/__init__.py -> build/lib.linux-x86_64-cpython-310/bmtrain/lr_scheduler
running build_ext
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-install-8x63jkgl/bmtrain_0e1b06e1b3f94924861318fd74027a6e/setup.py", line 74, in <module>
setup(
File "/home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages/setuptools/__init__.py", line 107, in setup
return distutils.core.setup(**attrs)
File "/home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 185, in setup
return run_commands(dist)
File "/home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
dist.run_commands()
File "/home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
self.run_command(cmd)
File "/home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages/setuptools/dist.py", line 1244, in run_command
super().run_command(command)
File "/home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages/setuptools/command/install.py", line 74, in run
return orig.install.run(self)
File "/home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages/setuptools/_distutils/command/install.py", line 697, in run
self.run_command('build')
File "/home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages/setuptools/dist.py", line 1244, in run_command
super().run_command(command)
File "/home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 131, in run
self.run_command(cmd_name)
File "/home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages/setuptools/dist.py", line 1244, in run_command
super().run_command(command)
File "/home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 84, in run
_build_ext.run(self)
File "/home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
self.build_extensions()
File "/home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 499, in build_extensions
_check_cuda_version(compiler_name, compiler_version)
File "/home/aiuser/anaconda3/envs/cpmbee/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 387, in _check_cuda_version
raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
RuntimeError:
The detected CUDA version (12.1) mismatches the version that was used to compile
PyTorch (11.8). Please make sure to use the same CUDA versions.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure
× Encountered error while trying to install package.
╰─> bmtrain
note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
(cpmbee) aiuser@aiuser-virtual-machine:~/worker/CPM-Bee/src$ `
My GPUs:
(cpmbee) aiuser@aiuser-virtual-machine:~/worker/CPM-Bee/src$ nvidia-smi
Thu Jun 1 15:06:45 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX A4000                Off| 00000000:03:00.0 Off |                  Off |
| 38%   58C    P0               40W / 140W|      0MiB / 16376MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA RTX A4000                Off| 00000000:13:00.0 Off |                  Off |
| 39%   58C    P0               36W / 140W|      0MiB / 16376MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
(cpmbee) aiuser@aiuser-virtual-machine:~/worker/CPM-Bee/src$
Work on adapting BMTrain to CUDA 12 is in progress.
Try this: TORCH_CUDA_ARCH_LIST="7.5" pip install bmtrain==0.2.1
> Try this: TORCH_CUDA_ARCH_LIST="7.5" pip install bmtrain==0.2.1
raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
RuntimeError:
The detected CUDA version (12.1) mismatches the version that was used to compile
PyTorch (11.7). Please make sure to use the same CUDA versions.
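The traceback shows the error being raised by `_check_cuda_version` in `torch/utils/cpp_extension.py`, which compares the CUDA toolkit found at build time with `torch.version.cuda`. `TORCH_CUDA_ARCH_LIST` only selects the GPU architectures to compile for, so it would not be expected to change the outcome of that comparison. The following is a minimal sketch of the same kind of check, not PyTorch's actual code: the helper names `parse_nvcc_release` and `same_cuda_major` are hypothetical, the sample `nvcc` text is hard-coded for illustration, and comparing only the major version is an assumption about how strict the check is.

```python
import re


def parse_nvcc_release(nvcc_output: str) -> str:
    """Extract the 'X.Y' toolkit version from text shaped like `nvcc -V` output."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    if match is None:
        raise ValueError("no 'release X.Y' found in nvcc output")
    return match.group(1)


def same_cuda_major(toolkit_version: str, torch_cuda_version: str) -> bool:
    """Return True when both versions share a major number (e.g. 11.7 and 11.8)."""
    return toolkit_version.split(".")[0] == torch_cuda_version.split(".")[0]


# Illustrative sample only; on a real machine this text would come from `nvcc -V`.
NVCC_OUTPUT = """nvcc: NVIDIA (R) Cuda compiler driver
Cuda compilation tools, release 12.1, V12.1.105
"""

toolkit = parse_nvcc_release(NVCC_OUTPUT)
# "11.8" stands in for the value of torch.version.cuda (hard-coded here, an assumption).
print(toolkit, same_cuda_major(toolkit, "11.8"))  # prints: 12.1 False
```

Under this reading, the build can only pass when the `nvcc` that the build picks up and the CUDA version PyTorch was compiled with share a major version, whichever bmtrain release is requested.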
> Try this: TORCH_CUDA_ARCH_LIST="7.5" pip install bmtrain==0.2.1
That didn't help; it fails with the same error. bmtrain is not compatible with CUDA 12.
I gave up.
`RuntimeError: The detected CUDA version (12.1) mismatches the version that was used to compile PyTorch (11.8). Please make sure to use the same CUDA versions.`
The CUDA version on my machine:
`nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0`
This machine also hosts other projects. Do I really have to downgrade CUDA just for this one project? Could something be changed so that this project supports CUDA 12?
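A system-wide downgrade may not be the only option. CUDA toolkits can be installed side by side, and PyTorch's extension builder honors the `CUDA_HOME` environment variable, so a single build can be pointed at a toolkit whose major version matches PyTorch. The sketch below is one possible way to locate such a toolkit. It assumes the conventional `/usr/local/cuda-X.Y` layout, which nothing in this thread confirms for this machine, and the helper names `find_matching_toolkit` and `build_env` are hypothetical.

```python
import os
import re
from pathlib import Path
from typing import Optional


def find_matching_toolkit(torch_cuda: str, root: str = "/usr/local") -> Optional[Path]:
    """Return the newest `cuda-X.Y` directory under `root` whose major version
    matches `torch_cuda` and which contains bin/nvcc, or None if there is none."""
    root_path = Path(root)
    if not root_path.is_dir():
        return None
    wanted_major = torch_cuda.split(".")[0]
    candidates = []
    for entry in root_path.glob("cuda-*"):
        m = re.fullmatch(r"cuda-(\d+)\.(\d+)", entry.name)
        if m and m.group(1) == wanted_major and (entry / "bin" / "nvcc").exists():
            candidates.append((int(m.group(2)), entry))
    if not candidates:
        return None
    # Prefer the highest minor version among the matching toolkits.
    return max(candidates, key=lambda c: c[0])[1]


def build_env(cuda_home: Path) -> dict:
    """Copy the current environment with CUDA_HOME and PATH pointing at `cuda_home`."""
    env = dict(os.environ)
    env["CUDA_HOME"] = str(cuda_home)
    env["PATH"] = f"{cuda_home / 'bin'}{os.pathsep}{env.get('PATH', '')}"
    return env


# "11.8" mirrors the PyTorch CUDA version quoted in the error message above.
toolkit = find_matching_toolkit("11.8")
if toolkit is None:
    print("no matching CUDA 11.x toolkit found under /usr/local")
else:
    print("a build could use CUDA_HOME =", toolkit)
```

The dictionary returned by `build_env` could be passed as `env=` to `subprocess.run` when invoking `pip install bmtrain`, leaving the CUDA 12.1 toolkit used by other projects untouched. Whether the bmtrain build then succeeds is not something this thread confirms.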