Using `torch.compile` and `torch.set_float32_matmul_precision("high")` has halved the time each epoch takes for me on a 4090.
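For anyone else who wants to try this, here is a minimal sketch of the two speed-ups, assuming PyTorch 2.x and a CUDA GPU; the `nn.TransformerEncoder` below is just a stand-in for illustration, not this repo's model class:

```python
import torch
import torch.nn as nn

# Allow TF32 matmuls on Ampere+ GPUs (such as a 4090); this trades a tiny
# amount of precision for a large matmul throughput gain.
torch.set_float32_matmul_precision("high")

# Stand-in model for illustration -- substitute the repo's actual model here.
model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=256, nhead=8, batch_first=True),
    num_layers=4,
)
if torch.cuda.is_available():
    model = model.cuda()

# torch.compile (new in PyTorch 2.0) JIT-compiles the forward pass into
# fused kernels; expect the first few batches to be slow while it compiles.
model = torch.compile(model)
```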
THIS IS EPIC! I'm happy you found it useful. I'll review your changes and test them out myself soon (sorry, I'm a little busy over the next few days, but I'll aim to be done by next weekend), and I'm almost certain that I'll merge this PR. Thanks so much for your work.
Oh right, to address your question: I found a sequence length of 2000 worked well for me. HOWEVER, that was due to the memory constraints of the GPU I had available (I think it was a T100). Ideally, your sequence length should be as long as possible. Pieces of music vary widely in length, and you want to be able to handle the starting, development, and ending sections of the music, for which you need really long sequence lengths. However, a longer sequence length means more data per batch, which means you need more RAM on your GPU. Alternatively, you could compromise by lowering the batch size, which makes training slower.
I have not experimented with better GPUs that might have allowed longer sequence lengths, so I do not quite know how to strike the best balance between efficient training and the length of the music trained on. Nevertheless, any sequence length you settle on will be fine: the model will learn to write music, although it may not learn how to start and end a song.
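As a rough back-of-envelope sketch of that tradeoff (the function and numbers below are illustrative assumptions, not this repo's actual configuration), the attention-score activations alone scale with the square of the sequence length:

```python
# Memory for one layer's attention-score matrix:
# batch_size x num_heads x seq_len x seq_len floats.
def attention_scores_mib(batch_size, num_heads, seq_len, bytes_per_float=4):
    return batch_size * num_heads * seq_len**2 * bytes_per_float / 2**20

# Doubling the sequence length quadruples this term, so the batch size
# has to drop by 4x to fit in the same GPU RAM.
print(attention_scores_mib(batch_size=16, num_heads=8, seq_len=2000))  # ~1953 MiB
print(attention_scores_mib(batch_size=4,  num_heads=8, seq_len=4000))  # ~1953 MiB
```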
Have a look at what I have done when you have a chance, though I will be pushing a new PR to fix a bug here: https://github.com/spectraldoy/music-transformer/blob/d27cfda3dd980a827c7460bd23527fd3b9e1cc68/preprocessing.py#L164 The dataset I am using is bigger than my GPU's memory and keeps crashing at that line, so I am looking to batch it up.
If you are interested, I am using this dataset: https://github.com/lucasnfe/adl-piano-midi
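For the batching, I am thinking of something along the lines of this sketch: process the MIDI files in fixed-size chunks and save each chunk to disk, so the whole dataset is never held in memory at once (`tokenize_midi`, the paths, and the chunk size are hypothetical stand-ins, not the actual code in preprocessing.py):

```python
import glob
import os
import torch

CHUNK_SIZE = 500  # files per chunk; tune this to your available memory

def tokenize_midi(path):
    # Hypothetical stand-in for the repo's MIDI-to-token conversion;
    # replace with the real tokenization from preprocessing.py.
    return torch.zeros(1, dtype=torch.long)

os.makedirs("processed", exist_ok=True)
midi_files = glob.glob("data/**/*.mid", recursive=True)
for i in range(0, len(midi_files), CHUNK_SIZE):
    chunk = midi_files[i : i + CHUNK_SIZE]
    tensors = [tokenize_midi(p) for p in chunk]
    # Save each chunk to disk instead of accumulating everything in memory,
    # then stream the chunks back in during training.
    torch.save(tensors, f"processed/chunk_{i // CHUNK_SIZE}.pt")
```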
Sounds great, I really appreciate your hard work on making this repo better. Thanks for the link to that dataset; it would have been so useful if I'd had it when I was initially developing this project.
Also, to get around GPU RAM constraints, you may consider using the `-t` (number of transpositions) and `-s` (number of time stretches) flags on `python3 preprocessing.py` if you're using the scripts. By default it performs 5 transpositions and 3 time stretches in order to augment the data, but you can make these 3 and 2 respectively for much less data. You could also use `-bs` to change the batch size for `python3 train.py`.
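For example, an invocation might look like this (the positional arguments are placeholders; check each script's help output for the real ones):

```
# Fewer augmentations -> much less preprocessed data
python3 preprocessing.py <midi_dir> <output_path> -t 3 -s 2

# Smaller batch size -> lower GPU RAM usage during training
python3 train.py <data_path> <checkpoint_path> -bs 8
```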
It will be a few days before I can address these concerns.
I see, that's alright. As long as you can clean up that one Exceptions list, I can do the rest. If you ever get time, just let me know here what the reduced Exception list should be.
This is awesome! Thanks for your hard work, and looking forward to your next PR!
Hello @spectraldoy, I have really liked this well-thought-out project. It's just what I need to muck around with to learn about transformers and produce some really interesting MIDI files.
I was wondering about one thing: the paper mentions that you use sample sizes of 10 ms, but your preprocessing code requires a sequence length. What length did you find works best for this?
These are some of the changes I have made to your code:
Updates to get PyTorch 2.1 working, plus speed-ups from PyTorch 2. I have also added glob for finding MIDI files and changed the logging a bit. I have also added a feature where you can give the model a MIDI file and it will generate music continuing from where the MIDI file ends.
I also modified how the MultiheadAttention works to be compatible with PyTorch 2's transformer module. These changes break the pretrained models you have included with this project, so you might not want to merge this PR. If you want, I could look at migrating those models over to the format the new model expects.
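For reference, the MIDI-continuation feature works roughly like this sketch: tokenize the input file, feed the tokens as a prompt, and sample autoregressively from there (`midi_to_tokens`, the model's output shape, and the sampling details are hypothetical stand-ins, not the exact code in this PR):

```python
import torch

def midi_to_tokens(path):
    # Hypothetical stand-in for the repo's MIDI tokenization.
    return torch.zeros(64, dtype=torch.long)

@torch.no_grad()
def continue_midi(model, midi_path, num_new_tokens=512, temperature=1.0):
    """Generate tokens that continue where the input MIDI leaves off."""
    tokens = midi_to_tokens(midi_path).unsqueeze(0)  # shape (1, prompt_len)
    for _ in range(num_new_tokens):
        # Assumes the model returns logits of shape (batch, seq_len, vocab).
        logits = model(tokens)[:, -1, :] / temperature
        probs = torch.softmax(logits, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)  # shape (1, 1)
        tokens = torch.cat([tokens, next_token], dim=1)
    return tokens.squeeze(0)  # prompt + continuation; decode back to MIDI
```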