Harmonai-org / sample-generator

Tools to train a generative model on arbitrary audio samples
MIT License

Where to find the model for the finetune notebook and can we use other models from huggingface spaces? #20

Open ysig opened 1 year ago

ysig commented 1 year ago

For the finetuning notebook:

  1. I can't find where to download jmann-small-190k.ckpt from, and
  2. since jmann-small-190k.ckpt gives poor-quality outputs on Hugging Face Spaces, I was curious whether I could use jmann-large-580k directly with the same code (assuming it's the .bin file).

Thank you so much for releasing this research effort online!

twobob commented 1 year ago

They are .ckpt files. Generally the URL format is: https://model-server.zqevans2.workers.dev/ABBREVIATION-STEPSk.ckpt

For example: https://model-server.zqevans2.workers.dev/gwf-440k.ckpt

Yes, you can use the other ckpt files for training, including ones from larger models that use the same architecture.
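
The URL pattern above can be sketched in Python. This is only a minimal illustration, assuming the ABBREVIATION-STEPSk.ckpt naming holds for the checkpoint you want. The helper names `checkpoint_url` and `download_checkpoint` are hypothetical and not part of sample-generator, and whether a given file is actually hosted at the resulting URL is not confirmed here.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Base URL taken from the format described in this thread.
MODEL_SERVER = "https://model-server.zqevans2.workers.dev"


def checkpoint_url(abbreviation: str, steps_k: int) -> str:
    """Build a URL of the form ABBREVIATION-STEPSk.ckpt (hypothetical helper)."""
    if not abbreviation:
        raise ValueError("abbreviation must be a non-empty string")
    if steps_k <= 0:
        raise ValueError("steps_k must be a positive number of thousands of steps")
    return f"{MODEL_SERVER}/{abbreviation}-{steps_k}k.ckpt"


def download_checkpoint(abbreviation: str, steps_k: int, target_dir: str = ".") -> Path:
    """Download the checkpoint into target_dir and return the local path.

    Assumes the file exists on the server; urlretrieve raises an error otherwise.
    """
    url = checkpoint_url(abbreviation, steps_k)
    target = Path(target_dir) / url.rsplit("/", 1)[-1]
    urlretrieve(url, str(target))
    return target


if __name__ == "__main__":
    # Reproduces the example URL given above; no download is performed here.
    print(checkpoint_url("gwf", 440))
```

Calling `download_checkpoint("gwf", 440)` would then attempt to fetch the file into the current directory; the local path it returns is what you would point the finetune notebook at.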

Harushii18 commented 1 year ago

So we can train our own ckpt files? I wanted to know if it's possible to share the training file too, not just the finetuning one. And what machine specs did you need for training to get the results you did?

twobob commented 1 year ago

Me? I trained on the Colab free tier, Kaggle, and one day on an A100 for the really huge sets.

Harushii18 commented 1 year ago

Is there a Colab notebook with the training code (not the fine-tuning one) that was used to create models such as the glitch model? Or will that not be released?

twobob commented 1 year ago

You would have to ask Zach. AFAIK the glitch model was trained by a set of individuals, each working on their own. Basically they started from scratch and just trained on a bunch of glitchy noises. Not sure what you are after.