magenta / magenta-studio

Magenta Studio is a collection of music plugins built on Magenta’s open source tools and models
https://g.co/magenta/studio
Apache License 2.0

Suggestion: Adding a tutorial for using your own models in the plugins #54

Open rjstange opened 3 years ago

rjstange commented 3 years ago

I've been digging around in the code, and my understanding is that I need to convert a satisfactory checkpoint into one suitable for magenta.js. However, since I have no experience working in JS, what is the process for using a locally hosted model in one of the plugins? My end goal is to use Drumify with a model trained on data of my own choosing, so the output better suits my style of music.

adarob commented 3 years ago

This would be a great contribution for someone to add! Here are the high-level instructions:

1. Convert your TF Python checkpoint to one compatible with TFJS using the instructions here: https://github.com/magenta/magenta-js/tree/master/music#your-own-checkpoints
2. Clone the magenta-studio repo and update the model URLs to point to your models. For example: https://github.com/magenta/magenta-studio/blob/aaf3282b139a6e76ce21c1d2ff2feb5358dcca28/continue/Model.js#L30
3. Follow the instructions in the README to rebuild the apps: https://github.com/magenta/magenta-studio
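
For step 2, the change is essentially swapping a checkpoint URL in the plugin's Model.js. A minimal sketch, assuming the converted checkpoint is served from a local static file server; the constant name and URLs below are illustrative, not the exact contents of magenta-studio's code:

```js
// Illustrative only: the real constant name and surrounding code in
// continue/Model.js (or drumify/Model.js) may differ.

// Before: the plugin loads a checkpoint hosted by Magenta, e.g.
// const checkpointURL = 'https://storage.googleapis.com/.../groovae_4bar'

// After: point it at your own TFJS-converted checkpoint instead.
const checkpointURL = 'http://localhost:8080/checkpoints/my_groovae_4bar_tap'
```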

Hope this helps!

rjstange commented 3 years ago

Thank you for your response, Adam! I dug around but could not find the exact configurations of the models used by Drumify. I trained a MusicVAE model using a configuration based on "groovae_2bar_hits_control_tfds", except with a maximum sequence length of 16 bars (my data has many performances of that length and I want to retain their complexity), no split_bars, and tapify=True. I also created my own drum pitch class mapping to match the data; it is not as simplified as the Roland drum pitch classes, but not as extensive as the full drum pitch classes either.

I trained it on over 8000 MIDI files ranging from 2/4 bars all the way up to 16 bars, producing a NoteSequence file of over 18 MB. The loss went from 3000 all the way down to almost 84 overnight. I tried to create samples with the MusicVAE generation script, but it gave me "ValueError: Cannot feed value of shape () for Tensor 'Placeholder_2:0', which has shape '(?, 43)'", an error that I have seen elsewhere happens when you try to generate with a multi-track model in MusicVAE.

Will a model like this still work if I plug it into the Drumify plugin? Should I also train a few more models with maximum sequence lengths of 1, 2, 3, 4 bars, etc., or will a single model trained on sequences of up to 16 bars be sufficient?

rjstange commented 3 years ago

This is about as far as I think I can go, given that this is my first foray into Node.js:

I tried building the plugins, but there is no "dist" folder to be found anywhere. I hope this text file containing the full output of my npm install run helps: magenta-studio-build-attempt-log-dump.txt. Interestingly, I had to run the post-install commands that download the models and node modules for magenta4live.amxd separately, after initially getting an error that the script "download:models;" could not be found.

I really hope to get this working someday with my own data. I appreciate what the plugin can do for me as a guitarist who wants to spend more time coming up with melodies and riffs instead of programming drum patterns or scouring through 8000 MIDI files to find one that suits a particular part I am playing.

rjstange commented 3 years ago

After many challenges, here's the furthest I've gotten after getting my trained model into the plugin, building it, and attempting to use it in Ableton Live Suite.

Every other part of Magenta Studio works fine, since I am not changing any of the other models. For testing, I stripped things down to my best guess at the groovae_tap2drum_4bar model and disabled the other three models. First, here is the config map entry for music_vae:

```python
CONFIG_MAP['groovae_4bar_tap_hits_as_controls_toontrack'] = Config(
    model=MusicVAE(lstm_models.BidirectionalLstmEncoder(),
                   lstm_models.GrooveLstmDecoder()),
    hparams=merge_hparams(
        lstm_models.get_default_hparams(),
        HParams(
            batch_size=512,
            max_seq_len=16 * 4,  # 4 bars w/ 16 steps per bar
            z_size=256,
            enc_rnn_size=[512],
            dec_rnn_size=[256, 256],
            max_beta=0.2,
            free_bits=48,
            dropout_keep_prob=0.3,
        )),
    note_sequence_augmenter=None,
    data_converter=data.GrooveConverter(
        split_bars=4,
        steps_per_quarter=4,
        quarters_per_bar=4,
        max_tensors_per_notesequence=20,
        tapify=True,
        hits_as_controls=True,
        pitch_classes=data.TOONTRACK_REDUCED_DRUM_PITCH_CLASSES,
        inference_pitch_classes=data.REDUCED_DRUM_PITCH_CLASSES),
    tfds_name='groove/2bar-midionly'
)
```

TOONTRACK_REDUCED_DRUM_PITCH_CLASSES is simply an expanded copy of ROLAND_DRUM_PITCH_CLASSES, adapted to work with my dataset; the network is still only reading 9 drum classes out of the 59 different hits. I also use this config.json file in the model's folder:

{ "type": "MusicVAE", "dataConverter": { "type": "GrooveConverter", "args": { "numSteps": 64, "tapify": true, "splitInstruments": false } } }

After building the plugin in Max 8 and trying to run it in Ableton Live Suite 11, I get this error message:

```
Error in matMul: inner shapes (539) and (548) of Tensors with shapes 1,539 and 548,2048 and transposeA=false and transposeB=false must match.
```

Am I configuring the network incorrectly prior to training? I got the same error when I tried to train with the full 59 separate drum hits (which produced a slightly larger checkpoint), so I went back to what was done originally. It would be a great help to know the exact configuration used for the Drumify and Groove models, and even better, what could be done to allow Drumify to use a more advanced network that is trained to recognize all 59 different hits and generate them during inference.
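
One way to narrow down where the mismatch comes from, without rebuilding the whole Max device each time, is to load the converted checkpoint directly with @magenta/music and try to initialize and sample from it. A minimal sketch, assuming the checkpoint folder (config.json plus weights) is served from a local static file server and the script is bundled for the browser (or adapted to the Node build of @magenta/music); the URL is hypothetical:

```js
// Smoke-test a converted checkpoint outside the plugin.
// If initialize() or sample() already fails here, the problem is in the
// checkpoint/config.json pair rather than in the magenta-studio build.
const mm = require('@magenta/music');

async function smokeTest() {
  const model = new mm.MusicVAE('http://localhost:8080/my_groovae_4bar_tap');
  await model.initialize();               // loads config.json and the weights
  const samples = await model.sample(1);  // should return one NoteSequence
  console.log(`sampled ${samples[0].notes.length} notes`);
  model.dispose();
}

smokeTest().catch(console.error);
```

If that succeeds but the plugin still throws the matMul error, the mismatch is more likely between what the plugin expects (for example numSteps or the number of drum classes) and the settings used when converting the checkpoint.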

Anyone's help would be greatly appreciated! This project means a lot to me.

jrgillick commented 2 years ago

Hey Riley, cool to hear that you're working on this! A few things:

Hope that helps!

Jon

rjstange commented 2 years ago

@jrgillick, thank you so much for your help. I was able to get my models working with the plugin. However, here are some issues I encountered while following the instructions Adam provided:

  1. The instructions should state that only Node version 13.x is supported, since node-sass failed to install until I used a Node version in that range. I went with 13.6.0, since it was the latest version at the time of the last human commit.

  2. I would suggest looking into the automated dependabot commits. The version jump for Electron from 2.0.11 to 7.2.4 caused an error whenever I opened any of the plugins (Drumify, Groove, Generate, etc.): `A Javascript error occurred in the main process Uncaught Exception: TypeError: Invalid template for MenuItem: must have at least one of label, role, or type at Function../lib/browser/api/menus.js.Menu.buildFromTemplate(electron/js2c/browser_init.js:1918:11) at App. (C:\Users\rjstr\AppData\Local\Temp\MaxPlug64\magenta4live.amdx_coll-u102000143...:295) at App.emit (events.js:208:15)`. Thankfully, this error appears to be a red herring and does nothing to limit the functionality of each app. I resolved it by using the package.json and package-lock.json files from the last human commit from Adam (a generic Electron sketch of what changed between those versions follows after this list).

  3. To do any of the Node-based building of the app, I had to use WSL (Windows Subsystem for Linux) with Ubuntu 20.04 LTS; when I tried to use Node directly on Windows, nothing happened when I ran "npm run build windows-ableton". I can confirm that everything works with WSL2 on Windows 11 as well. For training, I used TensorFlow 2.4 on Windows with CUDA 11 and cuDNN 8; I had to try that version bump because I have an Ampere GPU, and anything below CUDA 11 causes problems such as hanging for 10-20 minutes before training even starts.

  4. When converting the checkpoint to TFJS format, I had to modify the conversion script to remove everything related to QUANTIZATION_BYTES_TO_DTYPES for the script to run; otherwise I got an error about an unexpected argument passed to '--quantization_bytes'.

  5. Finally, I attempted a second experiment training the same four sets of models (1, 2, 3, and 4 bars). This time I had the model treat all 50+ types of drum hits as distinct, instead of reducing them down to only 9 equivalent hits. The models trained successfully, starting from losses ranging from ~1800 for the 1-bar model all the way up to over 3800 for the 4-bar model. All of them reached their lowest loss around 100 steps before starting to rise and fluctuate, and I could generate samples from them. However, they would not work in Drumify. I would like to know where the code needs to be modified for this to work, as my data has a lot of variety that the models need to express.
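
Regarding the MenuItem error in point 2: Electron 7 validates menu templates more strictly than Electron 2, so any template entry that has none of label, role, or type now throws. A minimal sketch of what the newer versions expect (this is generic Electron usage, not magenta-studio's actual menu code):

```js
// Generic Electron 7+ example, not taken from magenta-studio.
// Every entry passed to Menu.buildFromTemplate must carry at least one of
// `label`, `role`, or `type`; an entry without any of them raises the
// "Invalid template for MenuItem" error quoted above.
const { app, Menu } = require('electron');

app.whenReady().then(() => {
  const template = [
    { role: 'editMenu' },  // ok: has a role
    {
      label: 'View',       // ok: has a label
      submenu: [{ role: 'reload' }, { type: 'separator' }, { role: 'toggleDevTools' }],
    },
    // {}  // an entry with none of label/role/type would throw under Electron 7+
  ];
  Menu.setApplicationMenu(Menu.buildFromTemplate(template));
});
```

Pinning the dependencies back to the pre-dependabot package.json, as described above, avoids the stricter validation entirely.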

I am pleased with what the model can do at the moment. I hope my journey will be helpful for continuing the development of this project and evolving it into something that finds more and more use in the workflows of many musicians. Thank you all for your continued help!

jrgillick commented 2 years ago

Hey Riley, glad to hear you got your models working with the plugin, and thanks for the installation tips.