gitmylo / audio-webui

A webui for different audio related Neural Networks
MIT License
1.03k stars 99 forks source link

Model Management #226

Open d8ahazard opened 5 months ago

d8ahazard commented 5 months ago

Add system for universal management of model downloads, versus relying on the HF cache.

d8ahazard commented 5 months ago

I believe it's a better idea to use a callback instead of the all_models.json file. As it would be an easier way for extensions to add new models. (And it would allow associated models to be referenced inside of the related files). If you want, I can write the documentation.

Optionally support for an object containing the model repo, name, etc. Like you currently have in the json. As it would prevent typos between the download list and load functions. Because there would only be one instance of the repository name and file name.

And last, a new load_codec_model function is added in bark_custom_voices.py. But you could instead use the implementation from webui/modules/implementations/patches/bark_generation.py.

Besides that, the PR looks great, and once some changes are made to fix the mentioned issues I'll merge it. If you don't feel like making the changes, just tell me, and I'll see if I can work on them myself. Although that might not be that soon. If you still have any questions, (for example, about the callbacks, although the docs might help with that already) feel free to ask.

OK, I've used the inbuilt function for the bark loader.

The json file was intended to be a record of all the models specifically used by the app and for internal purposes (updates, new model adds, etc). My thought for extensions was the same as yours - add a callback that extensions can use to register other models to download on install/load?

d8ahazard commented 5 months ago

There's one more change that I forgot to mention, my apologies. I don't think the default models should contain RVC models. I'd rather keep that part up to the user. Can you remove the RVC models? Or just give an okay that I can remove them?

Also, another question I had, what exactly are the || used for inside the default models list? I can't find anywhere where you split a url on those characters, and looking at the code, I'm a little confused about it. What is it for, exactly? When the filename is already specified in the single_file_name field. (although, now that I look at it, some have a single_file_name set, and others don't?)

What if we make the RVC downloads optional? For my own selfish purposes, it would still be useful, as I prefer the "kitchen sink" approach, and I know from working on Automatic1111 that users really like some kind of "offline' mode for an app.

The || in the default models list is, I think, my way of ensuring that the keys are unique, as I think some models have multiple versions and stuff? To download them all, we don't really need to split the key, as everything is in each object.

gitmylo commented 5 months ago

What if we make the RVC downloads optional? For my own selfish purposes, it would still be useful, as I prefer the "kitchen sink" approach, and I know from working on Automatic1111 that users really like some kind of "offline' mode for an app.

With RVC requiring a different model for each voice, it doesn't seem like a good idea. People want to use RVC with the models they already have, they don't need models to be downloaded. It's kind of like having a text editor that comes preloaded with text. You probably don't want that. The RVC downloading part still requires an internet connection during the initial setup, so I would think it's a better idea to just let users install the models manually.

The || in the default models list is, I think, my way of ensuring that the keys are unique, as I think some models have multiple versions and stuff? To download them all, we don't really need to split the key, as everything is in each object.

I was more confused about if it'll still work correctly, as looking at the code, it appears to use the url including the ||. As it would change the URL. I can't find any instance where the URL is split on the ||. What I see is this:

  1. download_all_models() is called
  2. default_models.json is read, and loaded from json into a dict
  3. every key is given to the get_model_path function as the model_url and the values are given as keyword args
  4. now inside of the function, model_url is checked to see if the download is a huggingface or direct download. But model_url is never overwritten.
  5. depending on if the url is a huggingface or direct download, the corresponding download function is called with the model_url as the url.

Now, if I had an entry with the URL suno/bark||text_2.pt, it would go through all the steps unchanged. Now on step 5, it uses suno/bark||text_2.pt as the URL for the call to hf_hub_download() call.

Is there something I'm missing? Because this doesn't seem correct. I think || will affect the URL, and it will now try downloading from the wrong URL.

d8ahazard commented 5 months ago

What if we make the RVC downloads optional? For my own selfish purposes, it would still be useful, as I prefer the "kitchen sink" approach, and I know from working on Automatic1111 that users really like some kind of "offline' mode for an app.

With RVC requiring a different model for each voice, it doesn't seem like a good idea. People want to use RVC with the models they already have, they don't need models to be downloaded. It's kind of like having a text editor that comes preloaded with text. You probably don't want that. The RVC downloading part still requires an internet connection during the initial setup, so I would think it's a better idea to just let users install the models manually.

The || in the default models list is, I think, my way of ensuring that the keys are unique, as I think some models have multiple versions and stuff? To download them all, we don't really need to split the key, as everything is in each object.

I was more confused about if it'll still work correctly, as looking at the code, it appears to use the url including the ||. As it would change the URL. I can't find any instance where the URL is split on the ||. What I see is this:

  1. download_all_models() is called
  2. default_models.json is read, and loaded from json into a dict
  3. every key is given to the get_model_path function as the model_url and the values are given as keyword args
  4. now inside of the function, model_url is checked to see if the download is a huggingface or direct download. But model_url is never overwritten.
  5. depending on if the url is a huggingface or direct download, the corresponding download function is called with the model_url as the url.

Now, if I had an entry with the URL suno/bark||text_2.pt, it would go through all the steps unchanged. Now on step 5, it uses suno/bark||text_2.pt as the URL for the call to hf_hub_download() call.

Is there something I'm missing? Because this doesn't seem correct. I think || will affect the URL, and it will now try downloading from the wrong URL.

I'm a bit swamped the next few days, so IDK if I'll have time to implement the above - but I think you may be correct. I may have overlooked putting all the required info in the json. Feel free to revise however you like, including the RVC stuff. Those aren't the models that were giving me headaches anyway, and if the user wants to download them later, I'm good with that.

gitmylo commented 5 months ago

Alright, and don't worry, no rush.