teticio / audio-diffusion

Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead of images.
GNU General Public License v3.0

Training on own music samples? #33

Closed · timtensor closed 1 year ago

timtensor commented 1 year ago

Hi, first of all, I loved the remixes and loops you shared, great work! I am curious to train a model on my own music samples, and I am trying to follow a methodology similar to the one you described in the Medium article.

So far I have downloaded 30 s samples (from the preview URLs provided by Spotify) and stored them in /content/playlist, and I have also generated spectrograms of the downloaded samples with librosa under /content/playlist_spectrograms, but I don't think this is quite what you described.
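For reference, my download step looks roughly like this (the preview_urls list is just a placeholder for the preview_url values returned by the Spotify API, and the output directory is simply where I chose to put the files):

```python
import os
import requests

# Placeholder: fill with the 30-second preview_url values returned by the Spotify API
preview_urls = []

output_dir = "/content/playlist"
os.makedirs(output_dir, exist_ok=True)

for i, url in enumerate(preview_urls):
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    # Spotify previews are served as MP3 files; save them unchanged
    with open(os.path.join(output_dir, f"track_{i:03d}.mp3"), "wb") as f:
        f.write(response.content)
```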

I am wondering if the next step is to follow this notebook: https://github.com/teticio/audio-diffusion/blob/main/notebooks/train_model.ipynb (this produces a model, right?)

and then to use that model with https://github.com/teticio/audio-diffusion/blob/main/notebooks/audio_diffusion_pipeline.ipynb?

Is this the right way to move forward, or are there some steps that I am missing?

Regards, Tim

teticio commented 1 year ago

Yes, you can do this. But you don't need to create the spectrograms yourself; in fact, it is better if you don't. If you follow the steps in the train_model.ipynb notebook, you will see that it runs a script, audio_to_images.py, before running the training script. You can generate samples in the same notebook, or load a previously trained model in the audio_diffusion_pipeline.ipynb notebook.
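For example, loading a previously trained model and generating a sample looks roughly like this (shown here with the published teticio/audio-diffusion-256 checkpoint; substitute the path or Hub ID of your own model). This is a sketch assuming a diffusers version whose AudioDiffusionPipeline output exposes images and audios; see audio_diffusion_pipeline.ipynb for the exact output handling:

```python
import soundfile as sf
import torch
from diffusers import DiffusionPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load a trained audio diffusion model from the Hugging Face Hub (or a local path)
pipe = DiffusionPipeline.from_pretrained("teticio/audio-diffusion-256").to(device)

# Generate one sample: the pipeline produces a mel spectrogram image and the decoded audio
output = pipe()
image = output.images[0]   # PIL image of the generated mel spectrogram
audio = output.audios[0]   # numpy array of audio samples

# Save the audio; 22050 Hz is the default sample rate in this repo's mel configuration,
# so adjust it if your model was trained with a different rate
sf.write("generated.wav", audio.squeeze(), 22050)
```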