neonbjb / tortoise-tts

A multi-voice TTS system trained with an emphasis on quality
Apache License 2.0
12.85k stars, 1.78k forks

Absolute fastest inference speed #574

Open SinanAkkoyun opened 1 year ago

SinanAkkoyun commented 1 year ago

Hello! Thank you so much for all the great work. I would really love to break free from the 11labs API; the only thing Tortoise lacks is sub-1-second inference speed.

With DeepSpeed, half precision, kv_cache, 1 candidate, and a one-sentence prompt, the best I could get out of it is 2.4 seconds (4 seconds without warmup). DeepSpeed promises a 10x speedup; is that relative to base performance, or does it just not apply to a single sentence? And is that result expected for a 3090?

I would love to know how to further increase inference speed to sub 1 second performance, thank you :)

SinanAkkoyun commented 1 year ago

From tutorials, I've seen that adding a new preset that lowers num_autoregressive_samples for finetuned voices yields much greater speed with acceptable quality. However, it does not seem to generate any clip results:

model loaded
Generating autoregressive samples..
0it [00:00, ?it/s]
Computing best candidates using CLVP
0it [00:00, ?it/s]
Traceback (most recent call last):
  File "/home/ubuntu/ml/speech/tts/tortoise/tortoise-tts/tortoise/do_tts.py", line 46, in <module>
    gen, dbg_state = tts.tts_with_preset(args.text, k=args.candidates, voice_samples=voice_samples, conditioning_latents=conditioning_latents,
  File "/home/ubuntu/ml/speech/tts/tortoise/tortoise-tts/tortoise/api.py", line 347, in tts_with_preset
    return self.tts(text, **settings)
  File "/home/ubuntu/ml/speech/tts/tortoise/tortoise-tts/tortoise/api.py", line 490, in tts
    clip_results = torch.cat(clip_results, dim=0)
RuntimeError: torch.cat(): expected a non-empty list of Tensors

If nothing else can speed it up, I would very much appreciate any help with lowering the sample count.
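For what it's worth, one plausible reading of the `0it [00:00, ?it/s]` lines and the `torch.cat()` error (an assumption based on how api.py batches sampling, not a verified diagnosis): if the preset's num_autoregressive_samples is smaller than the autoregressive batch size, the integer division that computes the batch count yields zero, the sampling loop never runs, and clip_results stays empty. A minimal sketch of that failure mode:

```python
# Hypothetical simplification of the sampling loop in tortoise/api.py.
# The names mirror the real ones, but the logic is reduced to the
# arithmetic that matters; the batch size value is an assumption.
num_autoregressive_samples = 4   # a too-aggressive custom preset
autoregressive_batch_size = 16   # tortoise's default batch size (assumed)

num_batches = num_autoregressive_samples // autoregressive_batch_size
clip_results = []
for _ in range(num_batches):     # 0 iterations -> "0it [00:00, ?it/s]"
    clip_results.append("sample batch")

print(num_batches, clip_results)  # 0 []  -> torch.cat() gets an empty list
```

If that is the cause, keeping num_autoregressive_samples at or above the batch size (or lowering the batch size to match) may avoid the empty-list error.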

ADD-eNavarro commented 1 year ago

What about this? Some Polish students have improved Tortoise's speed by distilling the models into one and then distilling that one even further, but I can't find anything related to it. Could be interesting to contact them and try their distilled model, maybe?

SinanAkkoyun commented 1 year ago

That's awesome, thank you so much! I will try to contact them and hope to attempt distillation myself, but I am not proficient enough to do it on my own, so I would still love to have some easy hyperparameter tuning for speeding Tortoise up even more :)

SpaceCowboy850 commented 1 year ago

Please update this thread if you find a way to improve inference speed. Quality is great, but speed is definitely a problem.

MarkMLCode commented 10 months ago

If you want to improve inference speed a bit, I created a fork that allows a slight speedup in exchange for using more memory. It probably won't bring you under 1 second, but I've seen speeds of about 1.3 seconds on ultra_fast (though I do have a 4090). You just need to pass 'device_only=True' when creating the TTS object.

https://github.com/neonbjb/tortoise-tts/pull/628

manmay-nakhashi commented 10 months ago

Hey, did you check out the new api_fast?

ekarmazin commented 10 months ago

@manmay-nakhashi I have tried it and got this error:

ValueError: The following `model_kwargs` are not used by the model: ['cond_free_k', 'diffusion_temperature', 'diffusion_iterations'] (note: typos in the generate arguments will also show up in this list)

Any suggestions?

manmay-nakhashi commented 10 months ago

Don't pass diffusion-related args; api_fast doesn't use diffusion.
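A hedged illustration of the advice above (the helper name is made up; the keys are copied from the ValueError earlier in the thread): if you build your own settings dict instead of using a preset, you can drop the diffusion keys before forwarding it to the model.

```python
# Keys rejected by api_fast, taken from the ValueError above.
DIFFUSION_KEYS = {'cond_free_k', 'diffusion_temperature', 'diffusion_iterations'}

def strip_diffusion_args(settings):
    """Return a copy of settings without diffusion-related kwargs."""
    return {k: v for k, v in settings.items() if k not in DIFFUSION_KEYS}

preset = {'num_autoregressive_samples': 16,
          'diffusion_iterations': 30,
          'cond_free_k': 2.0}
print(strip_diffusion_args(preset))  # {'num_autoregressive_samples': 16}
```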

ekarmazin commented 10 months ago

I am not passing those, but I am using tts_with_preset, which seems to do that: https://github.com/neonbjb/tortoise-tts/blob/80f89987a5abda5e2b082618cd74f9c7411141dc/tortoise/api_fast.py#L257C9-L257C24

So I am using it like:

import io
import logging
import wave

import numpy as np

from tortoise.api_fast import TextToSpeech

# Initialize the TextToSpeech object
tts = TextToSpeech(kv_cache=True, use_deepspeed=True, half=True)

# Create an in-memory buffer to hold the WAV file data
buffer = io.BytesIO()

# Initialize the WAV file
wf = wave.open(buffer, 'wb')
wf.setnchannels(1)  # Mono
wf.setsampwidth(2)  # 16-bit audio
wf.setframerate(24000)  # Sample rate

# text_chunk and voice_samples are defined elsewhere
for audio_frame in tts.tts_with_preset(
        text_chunk,
        voice_samples=voice_samples,
        preset="ultra_fast",
):
    if audio_frame is not None:
        # Convert float frames to 16-bit PCM and append them to the buffer
        audio_np = audio_frame.cpu().detach().numpy()
        audio_int16 = (audio_np * 32767).astype(np.int16)
        wf.writeframes(audio_int16.tobytes())
    else:
        logging.warning("No audio generated for the text chunk.")

wf.close()
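One detail worth noting in the conversion above: `(audio_np * 32767).astype(np.int16)` will wrap around if a frame ever exceeds ±1.0. A clipped, stdlib-only variant (a sketch, independent of tortoise):

```python
import struct

def float_to_pcm16(samples):
    """Convert float samples in [-1.0, 1.0] to little-endian 16-bit PCM bytes,
    clipping out-of-range values instead of letting them wrap around."""
    clipped = (max(-1.0, min(1.0, s)) for s in samples)
    return struct.pack('<%dh' % len(samples), *(int(s * 32767) for s in clipped))

print(float_to_pcm16([0.0, 1.0, -2.0]))  # -2.0 is clipped to -1.0
```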

manmay-nakhashi commented 10 months ago

No need to use presets here, as api_fast has its own configurations for the speed vs. quality balance.

ekarmazin commented 10 months ago

Got it. Yeah, I switched to tts_stream and it is super fast! Thank you!

eschmidbauer commented 10 months ago

I'm still not able to get deepspeed to work. I get this error

[8/9] c++ -MMD -MF pt_binding.o.d -DTORCH_EXTENSION_NAME=transformer_inference -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/transformer/inference/includes -I/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/includes -isystem /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include -isystem /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/TH -isystem /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/THC -isystem /home/user/miniconda3/envs/tortoise/include -isystem /home/user/miniconda3/envs/tortoise/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -std=c++14 -g -Wno-reorder -c /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/pt_binding.cpp -o pt_binding.o
FAILED: pt_binding.o
c++ -MMD -MF pt_binding.o.d -DTORCH_EXTENSION_NAME=transformer_inference -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/transformer/inference/includes -I/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/includes -isystem /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include -isystem /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/TH -isystem /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/THC -isystem /home/user/miniconda3/envs/tortoise/include -isystem /home/user/miniconda3/envs/tortoise/include/python3.11 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -std=c++14 -g -Wno-reorder -c /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/pt_binding.cpp -o pt_binding.o
In file included from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/c10/util/string_view.h:4,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/c10/util/StringUtil.h:6,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/c10/util/Exception.h:5,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/c10/core/Device.h:5,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/c10/core/impl/InlineDeviceGuard.h:6,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/c10/core/DeviceGuard.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/c10/cuda/CUDAStream.h:8,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/pt_binding.cpp:5:
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/c10/util/C++17.h:27:2: error: #error You need C++17 to compile PyTorch
   27 | #error You need C++17 to compile PyTorch
      |  ^~~~~
In file included from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/extension.h:5,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/pt_binding.cpp:6:
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/all.h:4:2: error: #error C++17 or later compatible compiler is required to use PyTorch.
    4 | #error C++17 or later compatible compiler is required to use PyTorch.
      |  ^~~~~
In file included from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:4,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/all.h:9,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/extension.h:5,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/pt_binding.cpp:6:
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/ATen.h:4:2: error: #error C++17 or later compatible compiler is required to use ATen.
    4 | #error C++17 or later compatible compiler is required to use ATen.
      |  ^~~~~
In file included from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/ivalue.h:1499,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/List_inl.h:4,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/List.h:490,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/IListRef_inl.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/IListRef.h:632,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/WrapDimUtils.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/TensorNames.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/NamedTensorUtils.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/autograd/variable.h:11,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/autograd/autograd.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/autograd.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/all.h:7,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/extension.h:5,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/pt_binding.cpp:6:
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/ivalue_inl.h: In lambda function:
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/ivalue_inl.h:1061:30: error: ‘is_convertible_v’ is not a member of ‘std’; did you mean ‘is_convertible’?
 1061 |         if constexpr (::std::is_convertible_v<typename c10::invoke_result_t<T &&, Future&>, IValueWithStorages>) {
      |                              ^~~~~~~~~~~~~~~~
      |                              is_convertible
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/ivalue_inl.h:1061:91: error: expected ‘(’ before ‘,’ token
 1061 |         if constexpr (::std::is_convertible_v<typename c10::invoke_result_t<T &&, Future&>, IValueWithStorages>) {
      |                                                                                           ^
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/ivalue_inl.h:1061:111: error: expected primary-expression before ‘>’ token
 1061 |         if constexpr (::std::is_convertible_v<typename c10::invoke_result_t<T &&, Future&>, IValueWithStorages>) {
      |                                                                                                               ^
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/ivalue_inl.h:1061:112: error: expected primary-expression before ‘)’ token
 1061 |         if constexpr (::std::is_convertible_v<typename c10::invoke_result_t<T &&, Future&>, IValueWithStorages>) {
      |                                                                                                                ^
In file included from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/KernelFunction_impl.h:1,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/KernelFunction.h:251,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/op_registration/op_registration.h:11,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/library.h:68,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/autograd/autograd_not_implemented_fallback.h:3,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/autograd.h:4,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/csrc/api/include/torch/all.h:7,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/torch/extension.h:5,
                 from /home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/pt_binding.cpp:6:
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/impl/boxing.h: In static member function ‘static Result c10::impl::BoxedKernelWrapper<Result(Args ...), typename std::enable_if<((c10::guts::conjunction<c10::guts::disjunction<std::is_constructible<c10::IValue, typename std::decay<Args>::type>, std::is_same<c10::TensorOptions, typename std::decay<Args>::type> >...>::value && c10::guts::conjunction<c10::guts::disjunction<c10::impl::has_ivalue_to<T, void>, std::is_same<void, ReturnType> >, c10::guts::negation<std::is_lvalue_reference<_Tp> > >::value) && (! c10::impl::is_tuple_of_mutable_tensor_refs<Result>::value)), void>::type>::call(const c10::BoxedKernel&, const c10::OperatorHandle&, c10::DispatchKeySet, Args ...)’:
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/impl/boxing.h:229:25: error: ‘is_same_v’ is not a member of ‘std’; did you mean ‘is_same’?
  229 |     if constexpr (!std::is_same_v<void, Result>) {
      |                         ^~~~~~~~~
      |                         is_same
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/impl/boxing.h:229:35: error: expected primary-expression before ‘void’
  229 |     if constexpr (!std::is_same_v<void, Result>) {
      |                                   ^~~~
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/impl/boxing.h:229:35: error: expected ‘)’ before ‘void’
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/impl/boxing.h:229:18: note: to match this ‘(’
  229 |     if constexpr (!std::is_same_v<void, Result>) {
      |                  ^
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build
    subprocess.run(
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/tortoise-tts/tortoise/do_tts.py", line 31, in <module>
    tts = TextToSpeech(models_dir=args.model_dir, use_deepspeed=args.use_deepspeed, kv_cache=args.kv_cache, half=args.half)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/tortoise-tts/tortoise/api.py", line 218, in __init__
    self.autoregressive.post_init_gpt2_config(use_deepspeed=use_deepspeed, kv_cache=kv_cache, half=self.half)
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/tortoise_tts-3.0.0-py3.11.egg/tortoise/models/autoregressive.py", line 381, in post_init_gpt2_config
    self.ds_engine = deepspeed.init_inference(model=self.inference_model,
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/__init__.py", line 311, in init_inference
    engine = InferenceEngine(model, config=ds_inference_config)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/inference/engine.py", line 136, in __init__
    self._apply_injection_policy(config)
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/inference/engine.py", line 363, in _apply_injection_policy
    replace_transformer_layer(client_module,
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 534, in replace_transformer_layer
    replaced_module = replace_module(model=model,
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 799, in replace_module
    replaced_module, _ = _replace_module(model, policy)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 826, in _replace_module
    _, layer_id = _replace_module(child, policies, layer_id=layer_id)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 826, in _replace_module
    _, layer_id = _replace_module(child, policies, layer_id=layer_id)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 816, in _replace_module
    replaced_module = policies[child.__class__][0](child,
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 524, in replace_fn
    new_module = replace_with_policy(child,
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 385, in replace_with_policy
    _container.create_module()
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/containers/gpt2.py", line 16, in create_module
    self.module = DeepSpeedGPTInference(_config, mp_group=self.mp_group)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/model_implementations/transformers/ds_gpt.py", line 18, in __init__
    super().__init__(config,
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/model_implementations/transformers/ds_transformer.py", line 53, in __init__
    inference_cuda_module = builder.load()
                            ^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/op_builder/builder.py", line 485, in load
    return self.jit_load(verbose)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/op_builder/builder.py", line 520, in jit_load
    op_module = load(
                ^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1308, in load
    return _jit_compile(
           ^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1710, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1823, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 2116, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'transformer_inference'

These are the exact steps I'm following:

conda create --name tortoise python=3.11
conda activate tortoise

conda install -c "nvidia/label/cuda-12.1.0" \
    cuda cuda-toolkit cuda-compiler

git clone https://github.com/neonbjb/tortoise-tts.git
cd tortoise-tts
pip install -r requirements.txt
python setup.py install

python tortoise/do_tts.py \
    --text "This is a test of the initial setup. This is only a test." \
    --use_deepspeed true \
    --voice random --preset fast
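A possible cause worth checking (an assumption from reading the failed ninja command above, not verified on this setup): the compile line contains both `-std=c++17` and, later, `-std=c++14`; compilers honor the last `-std` flag, so the extension gets built as C++14 and PyTorch's C++17-only headers abort, which matches every error in the log. The "last flag wins" rule, sketched:

```python
def effective_std(flags):
    """Return the language standard a compiler would apply:
    the last -std= flag on the command line wins."""
    std = None
    for flag in flags:
        if flag.startswith('-std='):
            std = flag[len('-std='):]
    return std

# Flags excerpted from the failing command in the log above.
print(effective_std(['-std=c++17', '-O3', '-std=c++14', '-g']))  # c++14
```

If that is the cause, upgrading deepspeed (newer releases no longer force `-std=c++14` in the op builder) may fix the build.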

UltramanKuz commented 8 months ago

I'm still not able to get deepspeed to work. I get this error

(Same `transformer_inference` build log as in the previous comment.)
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/impl/boxing.h:229:25: error: ‘is_same_v’ is not a member of ‘std’; did you mean ‘is_same’?
  229 |     if constexpr (!std::is_same_v<void, Result>) {
      |                         ^~~~~~~~~
      |                         is_same
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/impl/boxing.h:229:35: error: expected primary-expression before ‘void’
  229 |     if constexpr (!std::is_same_v<void, Result>) {
      |                                   ^~~~
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/impl/boxing.h:229:35: error: expected ‘)’ before ‘void’
/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/include/ATen/core/boxing/impl/boxing.h:229:18: note: to match this ‘(’
  229 |     if constexpr (!std::is_same_v<void, Result>) {
      |                  ^
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build
    subprocess.run(
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/tortoise-tts/tortoise/do_tts.py", line 31, in <module>
    tts = TextToSpeech(models_dir=args.model_dir, use_deepspeed=args.use_deepspeed, kv_cache=args.kv_cache, half=args.half)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/tortoise-tts/tortoise/api.py", line 218, in __init__
    self.autoregressive.post_init_gpt2_config(use_deepspeed=use_deepspeed, kv_cache=kv_cache, half=self.half)
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/tortoise_tts-3.0.0-py3.11.egg/tortoise/models/autoregressive.py", line 381, in post_init_gpt2_config
    self.ds_engine = deepspeed.init_inference(model=self.inference_model,
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/__init__.py", line 311, in init_inference
    engine = InferenceEngine(model, config=ds_inference_config)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/inference/engine.py", line 136, in __init__
    self._apply_injection_policy(config)
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/inference/engine.py", line 363, in _apply_injection_policy
    replace_transformer_layer(client_module,
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 534, in replace_transformer_layer
    replaced_module = replace_module(model=model,
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 799, in replace_module
    replaced_module, _ = _replace_module(model, policy)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 826, in _replace_module
    _, layer_id = _replace_module(child, policies, layer_id=layer_id)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 826, in _replace_module
    _, layer_id = _replace_module(child, policies, layer_id=layer_id)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 816, in _replace_module
    replaced_module = policies[child.__class__][0](child,
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 524, in replace_fn
    new_module = replace_with_policy(child,
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/replace_module.py", line 385, in replace_with_policy
    _container.create_module()
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/module_inject/containers/gpt2.py", line 16, in create_module
    self.module = DeepSpeedGPTInference(_config, mp_group=self.mp_group)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/model_implementations/transformers/ds_gpt.py", line 18, in __init__
    super().__init__(config,
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/model_implementations/transformers/ds_transformer.py", line 53, in __init__
    inference_cuda_module = builder.load()
                            ^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/op_builder/builder.py", line 485, in load
    return self.jit_load(verbose)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/op_builder/builder.py", line 520, in jit_load
    op_module = load(
                ^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1308, in load
    return _jit_compile(
           ^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1710, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1823, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 2116, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'transformer_inference'

These are the exact steps I'm following:

conda create --name tortoise python=3.11
conda activate tortoise

conda install -c "nvidia/label/cuda-12.1.0" \
  cuda cuda-toolkit cuda-compiler

git clone https://github.com/neonbjb/tortoise-tts.git
cd tortoise-tts
pip install -r requirements.txt
python setup.py install

python tortoise/do_tts.py \
  --text "This is a test of the initial setup. This is only a test." \
  --use_deepspeed true \
  --voice random --preset fast

I'm hitting the same issue.

manmay-nakhashi commented 8 months ago

Run it without DeepSpeed for now. DeepSpeed works well with CUDA 11.8, and the CUDA ops need to be compiled with nvcc.
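In case it helps, disabling DeepSpeed is just a flag flip on the repro command above (a sketch: `--kv_cache` is the flag name suggested by `args.kv_cache` in the traceback, so treat it as an assumption; kv_cache and half precision still account for most of the speedup):

```shell
# Same invocation as the repro steps, with DeepSpeed turned off.
# --kv_cache is inferred from args.kv_cache in the traceback (assumption).
python tortoise/do_tts.py \
  --text "This is a test of the initial setup. This is only a test." \
  --use_deepspeed false \
  --kv_cache true \
  --voice random --preset fast
```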

jason-shen commented 3 months ago

Interested to know: using tts_stream, what sort of results did you get? What's the processing ratio? Thanks in advance.