RuntimeError when running text-to-speech translation (T2ST) in demo/app.py

Description

After launching demo/app.py, I opened the local URL http://127.0.0.1:7860 in a web browser and selected T2ST as the task, with English as the source language and French as the target language. I entered "hello world" in the input text box and clicked the Translate button. Instead of a translation, the webpage displayed an error message, and a runtime exception was raised in the terminal with the following traceback:

(seamless_communication) ✔ ~/git/seamless_communication/demo [main ↓·1|✚ 3]
09:27 $ python app.py
Using the cached checkpoint of the model 'seamlessM4T_large'. Set `force=True` to download again.
Using the cached tokenizer of the model 'seamlessM4T_large'. Set `force=True` to download again.
Using the cached checkpoint of the model 'vocoder_36langs'. Set `force=True` to download again.
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Computations are being performed on: cuda:0
Traceback (most recent call last):
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/gradio/queueing.py", line 407, in call_prediction
    output = await route_utils.call_process_api(
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/gradio/route_utils.py", line 226, in call_process_api
    output = await app.get_blocks().process_api(
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/gradio/blocks.py", line 1550, in process_api
    result = await self.call_function(
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/gradio/blocks.py", line 1185, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/gradio/utils.py", line 661, in wrapper
    response = f(*args, **kwargs)
  File "/home/egasdad/git/seamless_communication/demo/app.py", line 349, in predict
    text_out, wav, sr = translator.predict(
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/seamless_communication/models/inference/translator.py", line 246, in predict
    wav_out = self.vocoder(units, tgt_lang, spkr, dur_prediction=True)
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/seamless_communication/models/vocoder/vocoder.py", line 39, in forward
    return self.code_generator(x, dur_prediction)
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/seamless_communication/models/vocoder/codehifigan.py", line 122, in forward
    log_dur_pred = self.dur_predictor(x.transpose(1, 2))
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/seamless_communication/models/vocoder/codehifigan.py", line 50, in forward
    x = self.conv1(x.transpose(1, 2)).transpose(1, 2)
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/torch/nn/modules/container.py", line 217, in forward
    input = module(input)
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 313, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/egasdad/miniconda3/envs/seamless_communication/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 309, in _conv_forward
    return F.conv1d(input, weight, bias, self.stride,
RuntimeError: GET was unable to find an engine to execute this computation
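
The traceback ends in F.conv1d, so one way to check whether the failure is specific to the demo is to call F.conv1d directly on the same device. The following is a minimal sketch, not a confirmed diagnosis: the function name run_conv1d_check, the tensor shapes, and the idea of comparing runs with cuDNN enabled and disabled are assumptions made for illustration.

```python
def run_conv1d_check(device=None, use_cudnn=True):
    """Run a tiny conv1d and report whether it succeeds (hypothetical helper)."""
    try:
        import torch
        import torch.nn.functional as F
    except ImportError:
        return "torch not installed"

    # Default to the GPU when one is visible, as in the failing run (cuda:0).
    if device is None:
        device = "cuda:0" if torch.cuda.is_available() else "cpu"

    try:
        # Arbitrary small shapes: (batch, channels, length) and (out, in, kernel).
        x = torch.randn(1, 8, 16, device=device)
        w = torch.randn(4, 8, 3, device=device)
        # Temporarily enable or disable the cuDNN backend for this call only.
        with torch.backends.cudnn.flags(enabled=use_cudnn):
            y = F.conv1d(x, w, padding=1)
    except RuntimeError as exc:
        return f"conv1d failed on {device} (cudnn={use_cudnn}): {exc}"
    return f"conv1d ok on {device} (cudnn={use_cudnn}): shape {tuple(y.shape)}"


if __name__ == "__main__":
    print(run_conv1d_check(use_cudnn=True))
    print(run_conv1d_check(use_cudnn=False))
```

If the first call fails with the same message and the second succeeds, that would suggest the problem lies in the cuDNN setup of this environment rather than in demo/app.py itself.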

Steps to Reproduce

Clone the git repository: git clone <repository-url>
Create a conda environment: conda create --name seamless_communication python=3.9 and activate it.
Change the Gradio version to 3.48.0 in the demo/requirements.txt file, as the latest version breaks demo/app.py.
Install PyTorch: pip3 install torch==2.0.1 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
Install the CUDA toolkit: conda install -c nvidia/label/cuda-11.7.0 cuda-toolkit
Install the repo: cd ~/git/seamless_communication/ && pip install .
Install the dependencies for demo/app.py: pip install -r demo/requirements.txt
Launch the app: python demo/app.py

Additional information:


C:\Users\15144>wsl -l -v
NAME            STATE           VERSION
* Ubuntu-22.04    Running         2

(seamless_communication) ✔ ~/git/seamless_communication [main ↓·1|✚ 3]
09:51 $ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0

(seamless_communication) ✘-1 ~/git/seamless_communication [main ↓·1|✚ 3]
09:57 $ python -c "import torch; print(f'PyTorch Version: {torch.__version__}, CUDA Version: {torch.version.cuda if torch.cuda.is_available() else \"CUDA not available\"}')"
PyTorch Version: 2.0.1+cu117, CUDA Version: 11.7

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A        23      G   /Xwayland                                 N/A      |
+---------------------------------------------------------------------------------------+
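
The version details above were gathered by hand. A small script along the following lines could collect the same information in one step. It is a sketch under assumptions: the helper name collect_env_report and the choice of fields are illustrative, and it only uses standard torch attributes (torch.__version__, torch.version.cuda, torch.cuda.is_available, torch.backends.cudnn).

```python
import platform


def collect_env_report():
    """Return a dict describing the Python/PyTorch/CUDA setup (hypothetical helper)."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "torch": None,
    }
    try:
        import torch
    except ImportError:
        # Without torch there is nothing further to report.
        return report

    report["torch"] = torch.__version__
    # CUDA version PyTorch was built against (None for CPU-only builds).
    report["cuda_build"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    report["cudnn_available"] = torch.backends.cudnn.is_available()
    # cuDNN version as an integer, or None when cuDNN is not usable.
    report["cudnn_version"] = torch.backends.cudnn.version()
    if report["cuda_available"]:
        report["gpu_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in collect_env_report().items():
        print(f"{key}: {value}")
```

PyTorch also ships its own, more complete collector, which can be run with python -m torch.utils.collect_env.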
