Open Qinger27 opened 6 months ago
I have the same issue. Have you solved it?
Maybe you need to update your gcc version; that could solve the problem.
Have you solved it?
Yes, I solved it by upgrading the gcc version.
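For anyone who wants to check their compiler before upgrading, here is a minimal sketch that reads the installed gcc version and compares it against a threshold. The threshold `MIN_GCC_MAJOR` is only a placeholder assumption, since this thread does not say which gcc version is sufficient; the helper names are hypothetical too.

```python
import shutil
import subprocess

# Placeholder assumption: the thread does not state the minimum gcc version
# that fixes the build, so adjust this to whatever works in your setup.
MIN_GCC_MAJOR = 7


def parse_version(text):
    """Turn the output of `gcc -dumpversion` ('9.4.0' or '9') into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def gcc_is_new_enough(version, min_major=MIN_GCC_MAJOR):
    """Compare only the major version against the (assumed) threshold."""
    return version[0] >= min_major


def detect_gcc_version():
    """Return the version of the gcc found on PATH, or None if there is none."""
    gcc = shutil.which("gcc")
    if gcc is None:
        return None
    result = subprocess.run(
        [gcc, "-dumpversion"], capture_output=True, text=True, check=True
    )
    return parse_version(result.stdout)
```

Calling `detect_gcc_version()` inside the same environment that launches training shows which compiler the JIT build will pick up; if `gcc_is_new_enough(...)` returns False for it, upgrading gcc as suggested above is worth trying first.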
Hi,
I solved this problem by re-specifying the torch versions in ~/.bashrc using export LIBRARY_PATH=${your cuda lib64 path}:$LIBRARY_PATH
instead of $LD_LIBRARY_PATH. A bit weird.
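A possible explanation for why this helps: gcc consults LIBRARY_PATH when linking, while LD_LIBRARY_PATH is read by the dynamic loader at run time, so a JIT build that cannot find the CUDA libraries at link time would only be affected by the former. Below is a minimal sketch of the same fix done from Python instead of ~/.bashrc. The helper name `prepend_path` and the directory `/usr/local/cuda/lib64` are assumptions for illustration; use your own CUDA lib64 path.

```python
import os


def prepend_path(env, name, directory):
    """Return a copy of env with directory placed first in the path variable name."""
    new_env = dict(env)
    current = new_env.get(name, "")
    # Avoid a trailing separator when the variable was previously unset or empty.
    new_env[name] = directory + os.pathsep + current if current else directory
    return new_env


# Usage (hypothetical CUDA location, replace with your own lib64 directory).
# This has to run before the extension build is triggered, so that the
# compiler subprocess inherits the updated variable:
#
#     os.environ.update(
#         prepend_path(os.environ, "LIBRARY_PATH", "/usr/local/cuda/lib64")
#     )
```

Putting the export in ~/.bashrc, as described above, has the same effect for every new shell; the Python version only affects the current process and its children.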
Below is the error message. Could you help me take a look?
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/dockerdata/graceqwang/videollava/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1893, in _run_ninja_build
subprocess.run(
File "/dockerdata/graceqwang/videollava/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/dockerdata/graceqwang/my_code/Video-LLaVA/videollava/train/train_mem.py", line 12, in <module>
train()
File "/dockerdata/graceqwang/my_code/Video-LLaVA/videollava/train/train.py", line 1074, in train
trainer.train()
File "/dockerdata/graceqwang/videollava/lib/python3.10/site-packages/transformers/trainer.py", line 1539, in train
return inner_training_loop(
File "/dockerdata/graceqwang/videollava/lib/python3.10/site-packages/transformers/trainer.py", line 1656, in _inner_training_loop
model, self.optimizer = self.accelerator.prepare(self.model, self.optimizer)
File "/dockerdata/graceqwang/videollava/lib/python3.10/site-packages/accelerate/accelerator.py", line 1198, in prepare
result = self._prepare_deepspeed(*args)
File "/dockerdata/graceqwang/videollava/lib/python3.10/site-packages/accelerate/accelerator.py", line 1531, in _prepare_deepspeed
optimizer = DeepSpeedCPUAdam(optimizer.param_groups, **defaults)
File "/dockerdata/graceqwang/videollava/lib/python3.10/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__
self.ds_opt_adam = CPUAdamBuilder().load()
File "/dockerdata/graceqwang/videollava/lib/python3.10/site-packages/deepspeed/ops/op_builder/builder.py", line 454, in load
return self.jit_load(verbose)
File "/dockerdata/graceqwang/videollava/lib/python3.10/site-packages/deepspeed/ops/op_builder/builder.py", line 497, in jit_load
op_module = load(name=self.name,
File "/dockerdata/graceqwang/videollava/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1284, in load
return _jit_compile(
File "/dockerdata/graceqwang/videollava/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1509, in _jit_compile
_write_ninja_file_and_build_library(
File "/dockerdata/graceqwang/videollava/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1624, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/dockerdata/graceqwang/videollava/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1909, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'cpu_adam'
Exception ignored in: <function DeepSpeedCPUAdam.__del__ at 0x7f0e48eba4d0>
Traceback (most recent call last):
File "/dockerdata/graceqwang/videollava/lib/python3.10/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
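A note on reading this log: the final AttributeError looks like a side effect rather than a second bug. The traceback shows `__init__` failing while loading the extension, so `ds_opt_adam` was never assigned, and `__del__` then trips over the missing attribute. The sketch below reproduces that pattern with a stand-in class; `FakeOptimizer` and `handle` are hypothetical names, not DeepSpeed code.

```python
class FakeOptimizer:
    """Stand-in for an object whose __init__ can fail before setting an attribute."""

    def __init__(self, build_ok):
        if not build_ok:
            # Mirrors the extension build failing inside __init__.
            raise RuntimeError("Error building extension")
        self.handle = object()

    def __del__(self):
        # If __init__ raised early, this attribute was never set, so Python
        # reports an "Exception ignored in ... __del__" AttributeError.
        self.handle


try:
    FakeOptimizer(build_ok=False)
    build_failed = False
except RuntimeError:
    build_failed = True
```

The error to chase is therefore the first one, the failed `ninja -v` build of `cpu_adam`; the compiler output printed just above `ninja: build stopped` usually names the actual cause.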