gitmylo / audio-webui

A webui for different audio related Neural Networks
MIT License

[QUESTION] Local musicgen models go where? #186

Open kronkinatorix opened 7 months ago

kronkinatorix commented 7 months ago

I want to use this with some large models that don't appear to be coded into audiocraft.py. I was able to download, install, and use the large model via audiocraft.py, but I'm not sure where the webui puts the files.

I'd also like to use facebook's newer stereo models but seem to be having an issue loading them. I'm guessing this may be due to incompatible format / coding - would they be throwing up errors if I tried to load them through the webui?

Lastly, it would be great to have an option not to use huggingface for model downloads/storage. A 'models dir' would be amazing. I've run into this a few times: when it works, great, but when you don't have an internet connection, or you've spent hours meticulously organizing your models locally, having to go through the rigmarole of editing code so a script points directly at your local files is a headache. That, and I frankly have 0 faith in the longevity of any file hosted online.

Thanks for all your hard work!

Quick update - I'm getting errors trying to load any of the melody models:

Traceback (most recent call last):
  File "/mnt/models/Audio/audio-webui/webui/modules/implementations/audiocraft.py", line 26, in create_model
    model = MusicGen.get_pretrained(pretrained, device=map_device) if pretrained not in audiogen_models else AudioGen.get_pretrained(pretrained, device=map_device)
  File "/mnt/models/Audio/audio-webui/venv/lib/python3.10/site-packages/audiocraft/models/musicgen.py", line 126, in get_pretrained
    lm = load_lm_model(name, device=device)
  File "/mnt/models/Audio/audio-webui/venv/lib/python3.10/site-packages/audiocraft/models/loaders.py", line 114, in load_lm_model
    model = builders.get_lm_model(cfg)
  File "/mnt/models/Audio/audio-webui/venv/lib/python3.10/site-packages/audiocraft/models/builders.py", line 97, in get_lm_model
    condition_provider = get_conditioner_provider(kwargs["dim"], cfg).to(cfg.device)
  File "/mnt/models/Audio/audio-webui/venv/lib/python3.10/site-packages/audiocraft/models/builders.py", line 141, in get_conditioner_provider
    conditioners[str(cond)] = ChromaStemConditioner(
  File "/mnt/models/Audio/audio-webui/venv/lib/python3.10/site-packages/audiocraft/modules/conditioners.py", line 546, in __init__
    self.chroma_len = self._get_chroma_len()
  File "/mnt/models/Audio/audio-webui/venv/lib/python3.10/site-packages/audiocraft/modules/conditioners.py", line 596, in _get_chroma_len
    dummy_chr = self.chroma(dummy_wav)
  File "/mnt/models/Audio/audio-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/models/Audio/audio-webui/venv/lib/python3.10/site-packages/audiocraft/modules/chroma.py", line 56, in forward
    spec = self.spec(wav).squeeze(1)
  File "/mnt/models/Audio/audio-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/models/Audio/audio-webui/venv/lib/python3.10/site-packages/torchaudio/transforms/_transforms.py", line 110, in forward
    return F.spectrogram(
  File "/mnt/models/Audio/audio-webui/venv/lib/python3.10/site-packages/torchaudio/functional/functional.py", line 126, in spectrogram
    spec_f = torch.stft(
  File "/mnt/models/Audio/audio-webui/venv/lib/python3.10/site-packages/torch/functional.py", line 641, in stft
    return _VF.stft(input, n_fft, hop_length, win_length, window,  # type: ignore[attr-defined]
RuntimeError: cuFFT error: CUFFT_INTERNAL_ERROR

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/models/Audio/audio-webui/venv/lib/python3.10/site-packages/gradio/routes.py", line 437, in run_predict
    output = await app.get_blocks().process_api(
  File "/mnt/models/Audio/audio-webui/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1352, in process_api
    result = await self.call_function(
  File "/mnt/models/Audio/audio-webui/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1077, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/mnt/models/Audio/audio-webui/venv/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/mnt/models/Audio/audio-webui/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/mnt/models/Audio/audio-webui/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/mnt/models/Audio/audio-webui/webui/ui/tabs/audiocraft.py", line 23, in load_model
    acrft.create_model(model)
  File "/mnt/models/Audio/audio-webui/webui/modules/implementations/audiocraft.py", line 31, in create_model
    raise gradio.Error('Could not load model!')
gradio.exceptions.Error: 'Could not load model!'

Think it's due to a lack of forward compatibility for CUDA 11.7 on the RTX 4090, as discussed in the tortoise-tts thread linked below (was using CUDA 11.8).

Going to give 11.7 a shot.
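For anyone comparing setups against the cuFFT error above, here is a minimal sketch that reports which CUDA version the installed PyTorch wheel was built against and what the GPU identifies as. The helper name `describe_cuda_environment` is made up for illustration; the `torch` calls are standard ones, and the function just returns an empty-ish report if PyTorch isn't installed.

```python
def describe_cuda_environment():
    """Collect the PyTorch / CUDA details that matter for a cuFFT failure.

    Hypothetical helper for illustration; not part of audio-webui.
    """
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return {"torch": None}

    info = {
        "torch": torch.__version__,
        # CUDA version the wheel was built with (None for CPU-only builds).
        "cuda_build": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        info["device"] = torch.cuda.get_device_name(0)
        # (major, minor) compute capability reported by the first GPU.
        info["capability"] = torch.cuda.get_device_capability(0)
    return info


if __name__ == "__main__":
    for key, value in describe_cuda_environment().items():
        print(f"{key}: {value}")
```

Run it inside the webui's venv so it sees the same PyTorch install that the webui uses.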

And for posterity: looks like everything is stored in the huggingface cache and I'm a dumbass.
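In case it helps someone else locate the files, a rough sketch of how the hub cache directory is usually resolved and how to list what is in it. This assumes the standard `huggingface_hub` conventions (the `HF_HUB_CACHE`, `HUGGINGFACE_HUB_CACHE`, `HF_HOME` and `XDG_CACHE_HOME` environment variables, and `models--org--name` folder names); the function names are made up for illustration and nothing here is specific to audio-webui.

```python
import os
from pathlib import Path


def hf_hub_cache_dir(env=None):
    """Best-effort guess at the Hugging Face hub cache directory."""
    env = os.environ if env is None else env
    # An explicit cache override wins (newer and older variable names).
    for var in ("HF_HUB_CACHE", "HUGGINGFACE_HUB_CACHE"):
        if env.get(var):
            return Path(env[var]).expanduser()
    # Otherwise the cache lives in a "hub" folder under HF_HOME.
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "hub"
    # Fall back to the default location under the user's cache directory.
    base = env.get("XDG_CACHE_HOME") or "~/.cache"
    return (Path(base) / "huggingface" / "hub").expanduser()


def list_cached_models(cache_dir):
    """Turn 'models--org--name' folder names back into 'org/name' ids."""
    if not cache_dir.is_dir():
        return []
    prefix = "models--"
    return sorted(
        p.name[len(prefix):].replace("--", "/")
        for p in cache_dir.glob(prefix + "*")
        if p.is_dir()
    )


if __name__ == "__main__":
    cache = hf_hub_cache_dir()
    print("cache dir:", cache)
    for model_id in list_cached_models(cache):
        print(" ", model_id)
```

Setting `HF_HOME` before launching the webui should move the cache to a drive of your choosing, which gets part of the way to a 'models dir' without editing code.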

torch==2.0.1 torchvision torchaudio

https://github.com/neonbjb/tortoise-tts/discussions/597

kronkinatorix commented 7 months ago

Reinstalled with a fresh venv and yeah, looks like melody for audiocraft is broken.

Seems like a conflict with hydra-core. Audiocraft requires 1.1, but installing 1.1 means the dependencies of fairseq and omegaconf aren't satisfied.
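To see the conflict in a given venv, something like the following sketch can print the installed versions next to what each package declares it needs. It only uses the standard library's `importlib.metadata`; the package list and helper names are assumptions for illustration, and packages that aren't installed simply show up as `None`.

```python
from importlib import metadata

# Packages named in this thread; adjust as needed.
PACKAGES = ("audiocraft", "hydra-core", "omegaconf", "fairseq")


def installed_versions(names=PACKAGES):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def declared_requirements(name):
    """Return the requirement strings a package declares (empty if absent)."""
    try:
        return metadata.requires(name) or []
    except metadata.PackageNotFoundError:
        return []


if __name__ == "__main__":
    for pkg, version in installed_versions().items():
        print(f"{pkg}: {version}")
        # Show only the requirement lines that mention hydra or omegaconf.
        for req in declared_requirements(pkg):
            if "hydra" in req.lower() or "omegaconf" in req.lower():
                print(f"    requires {req}")
```

`pip check` run in the same venv reports unsatisfied requirements as well.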

phr00t commented 6 months ago

@kronkinatorix did you ever find a solution for using the newer, larger stereo melody models? I couldn't get it working in audio-webui either...