collabora / WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper.
https://collabora.github.io/WhisperSpeech/
MIT License

Colab notebook inference error #43

Closed · regstuff closed this issue 10 months ago

regstuff commented 10 months ago

Hello, I'm using the demo Colab notebook (with a T4 GPU) linked from the main page, and I get this error when running this cell:

# this is very slow right now since our inference code is not very optimized
# but even without this crucial optimization it is still better than real-time on an RTX 4090
pipe.generate_to_notebook("""
This is the first demo of Whisper Speech, a fully open source text-to-speech model trained by Collabora and Lion on the Juwels supercomputer.
""")

The error I get:

0.00% [0/749 00:00<?]
[2024-01-18 10:49:50,490] [0/1] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
---------------------------------------------------------------------------
BackendCompilerFailed                     Traceback (most recent call last)
<ipython-input-10-8f3d1d1ad737> in <cell line: 3>()
      1 # this is very slow right now since our inference code is not very optimized
      2 # but even without this crucial optimization it is still better than real-time on an RTX 4090
----> 3 pipe.generate_to_notebook("""
      4 This is the first demo of Whisper Speech, a fully open source text-to-speech model trained by Collabora and Lion on the Juwels supercomputer.
      5 """)

52 frames
/usr/lib/python3.10/concurrent/futures/_base.py in __get_result(self)
    401         if self._exception:
    402             try:
--> 403                 raise self._exception
    404             finally:
    405                 # Break a reference cycle with the exception in self._exception

BackendCompilerFailed: backend='inductor' raised:
AssertionError: libcuda.so cannot found!

Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information

You can suppress this exception and fall back to eager by setting:
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True
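
For reference, the fallback the error message suggests amounts to running a cell like this before the generate call; it suppresses the Inductor failure and drops to eager execution rather than fixing the missing library:

import torch._dynamo
# Compile failures will now fall back to eager mode instead of raising.
torch._dynamo.config.suppress_errors = True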
grefael1 commented 10 months ago

Add this to solve the issue: !ldconfig /usr/lib64-nvidia
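
For context: Colab ships the NVIDIA driver libraries (including libcuda.so) under /usr/lib64-nvidia, which is not on the default loader search path, so Triton/Inductor cannot find the library when torch.compile builds its kernels. A minimal sketch of where the fix fits (the cell placement is my assumption, not part of the official notebook):

# Run in a cell before constructing the pipeline: this adds
# /usr/lib64-nvidia to the dynamic linker cache so that
# torch.compile/Triton can locate libcuda.so.
!ldconfig /usr/lib64-nvidia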

235 commented 10 months ago

> Add this to solve the issue: !ldconfig /usr/lib64-nvidia

This helps, ty! Should it be committed into the demo file?

zoq commented 10 months ago

Thanks for opening the issue, we will update the notebook.

HazySkies commented 10 months ago

I attempted this fix for my (duplicate) issue (https://github.com/collabora/WhisperSpeech/issues/46), and it resulted in a separate error during the inference steps, which I'll also share here.

RuntimeError                              Traceback (most recent call last)

<ipython-input-8-33b66061c992> in <cell line: 3>()
      1 # this is very slow right now since our inference code is not very optimized
      2 # but even without this crucial optimization it is still better than real-time on an RTX 4090
----> 3 pipe.generate_to_notebook("""
      4 This example uses the default model of Whisper Speech. More specifically S 2 A hyphen Q 4 hyphen tiny, english and polish model. This will then be passed into RVC to achieve a potentially cleaner result.
      5 """)

9 frames

/usr/local/lib/python3.10/dist-packages/whisperspeech/modules.py in unembed(self, embs)
    317     def unembed(self, embs):
    318         if not self.training and self.merged_out is not None:
--> 319             return F.linear(embs, self.merged_out, self.bias_out) # embs @ self.merged_out + self.bias_out
    320 
    321         orig_embs = embs

RuntimeError: self and mat2 must have the same dtype, but got Float and Half

This also occurs in the voice cloning cells.
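
For anyone hitting the same trace, here is a standalone reproduction of the failure mode (generic tensors, not WhisperSpeech's actual weights): F.linear requires its input and weight to share a dtype, and here fp16 merged weights meet fp32 activations. The exact message can vary by device and PyTorch version.

import torch
import torch.nn.functional as F

embs = torch.randn(2, 4, dtype=torch.float32)    # activations arrive as Float
weight = torch.randn(8, 4, dtype=torch.float16)  # merged weights stored as Half
# F.linear(embs, weight)  # RuntimeError: mismatched dtypes (Float vs Half)
out = F.linear(embs, weight.float())             # casting either side avoids it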

jpc commented 10 months ago

Hey, thanks for noticing this and reaching out. I released an updated version without torch.compile that should work in Colab.
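
For readers wondering what the change implies: without torch.compile the model runs eagerly, so Inductor never generates Triton kernels and never needs libcuda.so at build time. A generic illustration of the difference (not WhisperSpeech's actual code):

import torch

model = torch.nn.Linear(4, 4)
x = torch.randn(1, 4)

# compiled = torch.compile(model, backend="inductor")  # needs Triton + libcuda.so on GPU
out = model(x)  # eager execution: no compile step, no Triton dependency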

HazySkies commented 10 months ago

It's working perfectly now, thanks.