rsxdalv / tts-generation-webui

TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS)
https://rsxdalv.github.io/tts-generation-webui/
MIT License
1.46k stars, 160 forks

Torch version mixup? #269

Closed tilllt closed 4 months ago

tilllt commented 5 months ago

Hey,

I am trying to get tts-generation-webui up and running, preferably as a Docker container, but a lot of things don't seem to work. From my research, there seems to be a mixup between the Torch / PyTorch version some of the tools expect (2.1.0) and the version that is in fact installed by default (2.0.0).

Am I doing something wrong, or is this project a little "broken" at the moment?

Cheers

rsxdalv commented 5 months ago

I'm not sure that a different environment would cause tested torch versions to not work. You can install torch 2.1.0 if you want Audiocraft to not complain, but every other project specifies torch 2.0.0. Meanwhile, MPS/CPU folks are generally recommended to install the latest/nightly versions.

Are you doing something specific? I only tested the "normal" setup to see whether Audiocraft is still compatible with torch 2.0.0: an older NVIDIA GPU, Windows, and no extreme parameters (temp 0, etc., although most models already prohibit them).

As for why a tested setup still prints an error: pip is just not friendly. Probably a few years down the road there will be a hundred "helpful errors".
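
For anyone who wants to see which torch version actually ended up in their environment, here is a minimal sketch using only the standard library. The helper names `installed_version` and `matches_expected` are made up for illustration, and the two version strings are just the ones mentioned in this thread; this is not something the webui itself ships.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def matches_expected(installed: Optional[str], expected: str) -> bool:
    """Compare release numbers, ignoring a local build tag such as '+cu118'."""
    if installed is None:
        return False
    return installed.split("+", 1)[0] == expected


if __name__ == "__main__":
    found = installed_version("torch")
    # 2.0.0 and 2.1.0 are the two versions discussed in this thread.
    for expected in ("2.0.0", "2.1.0"):
        print(f"torch installed={found} expected={expected} "
              f"match={matches_expected(found, expected)}")
```

Run it with the same Python interpreter that the webui uses (inside the container, if you are on Docker), otherwise it reports on a different environment.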

tilllt commented 4 months ago

Hey,

I don't mind if anything complains ;) but unfortunately the entire MusicGen / AudioGen part is not working at all in my Docker install. TortoiseTTS is not working either, and I am unable to install the Bark Voice Cloning in Docker, etc.

tts-generation-webui  | Traceback (most recent call last):
tts-generation-webui  |   File "/venv/lib/python3.10/site-packages/gradio/queueing.py", line 407, in call_prediction
tts-generation-webui  |     output = await route_utils.call_process_api(
tts-generation-webui  |   File "/venv/lib/python3.10/site-packages/gradio/route_utils.py", line 226, in call_process_api
tts-generation-webui  |     output = await app.get_blocks().process_api(
tts-generation-webui  |   File "/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1550, in process_api
tts-generation-webui  |     result = await self.call_function(
tts-generation-webui  |   File "/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1185, in call_function
tts-generation-webui  |     prediction = await anyio.to_thread.run_sync(
tts-generation-webui  |   File "/venv/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
tts-generation-webui  |     return await get_asynclib().run_sync_in_worker_thread(
tts-generation-webui  |   File "/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
tts-generation-webui  |     return await future
tts-generation-webui  |   File "/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
tts-generation-webui  |     result = context.run(func, *args)
tts-generation-webui  |   File "/venv/lib/python3.10/site-packages/gradio/utils.py", line 661, in wrapper
tts-generation-webui  |     response = f(*args, **kwargs)
tts-generation-webui  |   File "/app/tts-generation-webui/src/musicgen/musicgen_tab.py", line 148, in generate
tts-generation-webui  |     MODEL = load_model(model)
tts-generation-webui  |   File "/app/tts-generation-webui/src/musicgen/musicgen_tab.py", line 127, in load_model
tts-generation-webui  |     return MusicGen.get_pretrained(version)
tts-generation-webui  |   File "/venv/lib/python3.10/site-packages/audiocraft/models/musicgen.py", line 91, in get_pretrained
tts-generation-webui  |     return MusicGen(name, compression_model, lm)
tts-generation-webui  |   File "/venv/lib/python3.10/site-packages/audiocraft/models/musicgen.py", line 52, in __init__
tts-generation-webui  |     super().__init__(name, compression_model, lm, max_duration)
tts-generation-webui  |   File "/venv/lib/python3.10/site-packages/audiocraft/models/genmodel.py", line 55, in __init__
tts-generation-webui  |     self.compression_model = get_wrapped_compression_model(self.compression_model, self.cfg)
tts-generation-webui  |   File "/venv/lib/python3.10/site-packages/audiocraft/models/builders.py", line 254, in get_wrapped_compression_model
tts-generation-webui  |     if cfg.interleave_stereo_codebooks.use:
tts-generation-webui  | AttributeError: 'NoneType' object has no attribute 'use'

rsxdalv commented 4 months ago

Thanks for the additional information. I have seen a user complain about the .use error, and I think I have seen it myself. On a regular install it resolved itself after a reinstall, but clearly that won't happen with a Docker container. I'm assuming you are using the Docker image built by the repo. I'm not sure how soon I can debug it; if I'm lucky, this week or next week.

rsxdalv commented 4 months ago

To give a tiny bit of context about this particular .use error: it happens because Audiocraft was installed with some version mismatch, so its dependencies don't meet the expectations of the script.
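
To illustrate the failure mode in the traceback (not as a patch to Audiocraft), here is a small sketch. It assumes only what the traceback shows: `cfg.interleave_stereo_codebooks` evaluated to `None`, so reading `.use` on it raised `AttributeError`. The `broken_cfg` object and the `stereo_interleaving_enabled` helper are hypothetical stand-ins, not Audiocraft code.

```python
from types import SimpleNamespace


def stereo_interleaving_enabled(cfg) -> bool:
    """Defensive variant of the check: a missing or None section counts as disabled."""
    section = getattr(cfg, "interleave_stereo_codebooks", None)
    return section is not None and bool(getattr(section, "use", False))


# Hypothetical config in which the section resolved to None.
broken_cfg = SimpleNamespace(interleave_stereo_codebooks=None)

try:
    # Same attribute access as the last frame of the traceback.
    broken_cfg.interleave_stereo_codebooks.use
except AttributeError as err:
    message = str(err)
    print(message)

# The defensive variant returns False instead of raising.
print(stereo_interleaving_enabled(broken_cfg))
```

A guard like this would only hide the symptom; the underlying problem is still the mismatched install.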

rsxdalv commented 4 months ago

This issue has been resolved with the new Docker image. Also, I fixed the build process, so the ghcr.io/rsxdalv/tts-generation-webui:main images are now up to date.
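
If you already pulled the old image, you will need to fetch the updated one. A rough sketch of doing that from Python follows, assuming the `docker` CLI is on your PATH; the helper names `build_pull_command` and `pull_image` are illustrative only.

```python
import subprocess
from typing import List

# Image name as given above.
IMAGE = "ghcr.io/rsxdalv/tts-generation-webui:main"


def build_pull_command(image: str = IMAGE) -> List[str]:
    """Build the argument list for `docker pull <image>`."""
    return ["docker", "pull", image]


def pull_image(image: str = IMAGE) -> int:
    """Run `docker pull` and return its exit code (0 means success)."""
    return subprocess.run(build_pull_command(image), check=False).returncode


if __name__ == "__main__":
    raise SystemExit(pull_image())
```

An existing container keeps using the old image until it is recreated from the newly pulled one.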

rsxdalv commented 4 months ago

OK, I believe the issue should be resolved. If something is still not resolved or you have another concern, feel free to reopen the issue or create an additional one.