aedocw / epub2tts

Turn an epub or text file into an audiobook
Apache License 2.0

In Docker running without CUDA, gives CUDA-related error #247

Open · jfhc opened 2 weeks ago

jfhc commented 2 weeks ago

On Windows, running

docker run -e COQUI_TOS_AGREED=1 -v D:/.local/share/tts:/root/.local/share/tts -v ${PWD}:/root -w /root ghcr.io/aedocw/epub2tts:release 'The Emergence of Social Space_ - Kristin Ross.epub' --engine xtts --speaker "Royston Min"

prints "Using CPU" but then fails because the CUDA_HOME environment variable is not set. How can I address this?

Not enough VRAM on GPU or CUDA not found. Using CPU
Loading model: /root/.local/share/tts/tts_models--multilingual--multi-dataset--xtts_v2
 > Downloading model to /root/.local/share/tts/tts_models--multilingual--multi-dataset--xtts_v2
100%|██████████| 1.87G/1.87G [02:38<00:00, 11.8MiB/s]
100%|██████████| 4.37k/4.37k [00:00<00:00, 5.86kiB/s]
100%|██████████| 361k/361k [00:00<00:00, 413kiB/s]
100%|██████████| 32.0/32.0 [00:00<00:00, 32.7iB/s]
100%|██████████| 7.75M/7.75M [00:18<00:00, 13.2MiB/s]
 > Model's license - CPML
 > Check https://coqui.ai/cpml.txt for more info.
 > Using model: xtts
[2024-06-15 19:35:18,347] [INFO] [real_accelerator.py:161:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-06-15 19:35:19,205] [INFO] [logging.py:96:log_dist] [Rank -1] DeepSpeed info: version=0.12.6, git-hash=unknown, git-branch=unknown
[2024-06-15 19:35:19,208] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter replace_method is deprecated. This parameter is no longer needed, please remove from your call to DeepSpeed-inference
[2024-06-15 19:35:19,209] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter mp_size is deprecated use tensor_parallel.tp_size instead
[2024-06-15 19:35:19,210] [INFO] [logging.py:96:log_dist] [Rank -1] quantize_bits = 8 mlp_extra_grouping = False, quantize_groups = 1
Using /root/.cache/torch_extensions/py310_cu121 as PyTorch extensions root...
Creating extension directory /root/.cache/torch_extensions/py310_cu121/transformer_inference...
Detected CUDA files, patching ldflags
Traceback (most recent call last):
  File "/opt/epub2tts/epub2tts.py", line 746, in <module>
    main()
  File "/opt/epub2tts/epub2tts.py", line 735, in main
    mybook.read_book(
  File "/opt/epub2tts/epub2tts.py", line 384, in read_book
    self.model.load_checkpoint(
  File "/usr/local/lib/python3.10/dist-packages/TTS/tts/models/xtts.py", line 783, in load_checkpoint
    self.gpt.init_gpt_for_inference(kv_cache=self.args.kv_cache, use_deepspeed=use_deepspeed)
  File "/usr/local/lib/python3.10/dist-packages/TTS/tts/layers/xtts/gpt.py", line 224, in init_gpt_for_inference
    self.ds_engine = deepspeed.init_inference(
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/__init__.py", line 342, in init_inference
    engine = InferenceEngine(model, config=ds_inference_config)
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/inference/engine.py", line 158, in __init__
    self._apply_injection_policy(config)
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/inference/engine.py", line 418, in _apply_injection_policy
    replace_transformer_layer(client_module, self.module, checkpoint, config, self.config)
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/module_inject/replace_module.py", line 342, in replace_transformer_layer
    replaced_module = replace_module(model=model,
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/module_inject/replace_module.py", line 586, in replace_module
    replaced_module, _ = _replace_module(model, policy, state_dict=sd)
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/module_inject/replace_module.py", line 646, in _replace_module
    _, layer_id = _replace_module(child,
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/module_inject/replace_module.py", line 646, in _replace_module
    _, layer_id = _replace_module(child,
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/module_inject/replace_module.py", line 622, in _replace_module
    replaced_module = policies[child.__class__][0](child,
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/module_inject/replace_module.py", line 298, in replace_fn
    new_module = replace_with_policy(child,
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/module_inject/replace_module.py", line 247, in replace_with_policy
    _container.create_module()
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/module_inject/containers/gpt2.py", line 20, in create_module
    self.module = DeepSpeedGPTInference(_config, mp_group=self.mp_group)
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/model_implementations/transformers/ds_gpt.py", line 20, in __init__
    super().__init__(config, mp_group, quantize_scales, quantize_groups, merge_count, mlp_extra_grouping)
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/model_implementations/transformers/ds_transformer.py", line 58, in __init__
    inference_module = builder.load()
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/ops/op_builder/builder.py", line 458, in load
    return self.jit_load(verbose)
  File "/usr/local/lib/python3.10/dist-packages/deepspeed/ops/op_builder/builder.py", line 502, in jit_load
    op_module = load(name=self.name,
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1308, in load
    return _jit_compile(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1710, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1800, in _write_ninja_file_and_build_library
    extra_ldflags = _prepare_ldflags(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1899, in _prepare_ldflags
    if (not os.path.exists(_join_cuda_home(extra_lib_dir)) and
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 2416, in _join_cuda_home
    raise OSError('CUDA_HOME environment variable is not set. '
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
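For context on why "Using CPU" still ends in a CUDA error: the traceback shows DeepSpeed's init_inference trying to JIT-compile its transformer_inference CUDA extension, and PyTorch's extension builder requires CUDA_HOME to locate the toolkit regardless of whether a GPU is actually usable. A minimal sketch that mirrors the failing check (this is an illustration of the behavior in torch.utils.cpp_extension, not the actual torch code):

```python
import os

def join_cuda_home(*paths):
    # Mirrors the check behind the traceback above: JIT-compiling a CUDA
    # extension needs CUDA_HOME to point at a CUDA toolkit install, even
    # when inference has already fallen back to CPU.
    cuda_home = os.environ.get("CUDA_HOME")
    if cuda_home is None:
        raise OSError("CUDA_HOME environment variable is not set. "
                      "Please set it to your CUDA install root.")
    return os.path.join(cuda_home, *paths)
```

So the CPU fallback only covers model placement; the DeepSpeed code path still assumes a CUDA toolkit is present in the container.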
aedocw commented 2 weeks ago

I started work on a CUDA-ready Docker container (https://github.com/aedocw/epub2tts/blob/main/Dockerfile.cuda12), but it didn't work as expected and unfortunately I don't have a good test environment to take it further. Maybe someone else here will have answers, but I wanted to let you know that I won't be able to address this myself, so it would have to be someone else.

Your best bet would be to run this in a python virtual environment in WSL.
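The WSL route suggested above might look like the following. This is a sketch, assuming an Ubuntu WSL distribution; the system packages and the install-from-source step are assumptions based on a typical Python project, not verified against the epub2tts README:

```shell
# Inside an Ubuntu WSL shell (assumed setup, adjust to the project's README)
sudo apt update && sudo apt install -y python3-venv git

git clone https://github.com/aedocw/epub2tts
cd epub2tts
python3 -m venv .venv
source .venv/bin/activate
pip install .

# Same invocation as in the original report, minus Docker
epub2tts 'The Emergence of Social Space_ - Kristin Ross.epub' --engine xtts --speaker "Royston Min"
```

Running natively in WSL sidesteps the container's DeepSpeed/CUDA_HOME problem, since the environment can either have a real CUDA toolkit installed or avoid the GPU path entirely.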

jfhc commented 2 weeks ago

Thanks, @aedocw - I've got it working in WSL now with the default model, but xtts is still not working. I may raise another issue for that.