Natooz / MidiTok

MIDI / symbolic music tokenizers for Deep Learning models 🎶
https://miditok.readthedocs.io/
MIT License
698 stars, 84 forks

Using pitch bend in Octuple tokenizer #196

Closed Neptune-S-777 closed 2 months ago

Neptune-S-777 commented 2 months ago

Hi, thank you for your amazing work! I was wondering if there is a way to add the pitch bend attribute to the Octuple tokenizer. Is it necessary to create a new tokenizer for this, or is there an existing option for it?

Natooz commented 2 months ago

Hi, unfortunately I don’t think pitch bends can be integrated into Octuple. Octuple works by having one token per note. We can embed tempo and time signatures into that token because they are "global" features, but pitch bends aren’t, and they would break the "continuity". Maybe they could be integrated into MuMIDI. I would actually recommend using the other tokenizers (REMI, TSD, MIDILike) to get all the additional tokens that you want, then training the tokenizer with BPE/Unigram to get the best results.
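To make the "one token per note" point concrete, here is a minimal plain-Python sketch. It is not MidiTok's API or its internal representation: `Note`, `value_at`, `note_tuples` and `event_stream` are hypothetical names, and the tick values are made up. It only illustrates that a global feature such as tempo can be copied onto every note, while a pitch bend that happens between two note onsets has no note tuple to live in and needs an event-style stream instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    time: int      # onset, in ticks
    pitch: int
    velocity: int
    duration: int  # in ticks


def value_at(changes, time):
    """Latest value from sorted (time, value) changes at or before `time`.

    Assumes the first change is at time 0.
    """
    current = changes[0][1]
    for t, v in changes:
        if t > time:
            break
        current = v
    return current


def note_tuples(notes, tempo_changes):
    """Octuple-like view: exactly one tuple per note, tempo copied onto it."""
    return [
        (n.time, n.pitch, n.velocity, n.duration, value_at(tempo_changes, n.time))
        for n in notes
    ]


def event_stream(notes, pitch_bends):
    """REMI/TSD-like view: a flat, time-ordered list that can hold any event."""
    events = [(n.time, "Note", n.pitch) for n in notes]
    events += [(t, "PitchBend", v) for t, v in pitch_bends]
    return sorted(events)


notes = [Note(0, 60, 90, 480), Note(480, 62, 90, 480)]
tempo_changes = [(0, 120), (480, 100)]
pitch_bends = [(240, 2048)]  # falls between the two note onsets

per_note = note_tuples(notes, tempo_changes)
stream = event_stream(notes, pitch_bends)
```

In `per_note` there are still only two entries, one per note, and nowhere to put the bend at tick 240. In `stream` the bend is simply a third event between the two notes.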

Neptune-S-777 commented 2 months ago

Cool, thanks for your recommendation! I think the best option for me is to encode with REMI and then organize the attributes in the Octuple token format.
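A rough sketch of what that regrouping could look like, in plain Python. Everything here is an assumption for illustration: `regroup` is a hypothetical helper, not part of MidiTok, and the token strings only imitate a `Type_value` naming pattern, so a real REMI vocabulary may differ. Notes are folded into one tuple each, and pitch bends are kept in a separate list because they do not belong to any single note.

```python
def regroup(tokens):
    """Fold a flat REMI-like token list into per-note tuples.

    Returns (notes, bends):
      notes: (bar, position, pitch, velocity, duration) per note
      bends: (bar, position, value) per pitch bend, kept aside
    """
    bar = -1          # incremented on every Bar token
    position = 0      # position inside the current bar
    pitch = None      # pitch of the note currently being assembled
    velocity = None
    notes, bends = [], []

    for tok in tokens:
        kind, _, value = tok.partition("_")
        if kind == "Bar":
            bar += 1
            position = 0
        elif kind == "Position":
            position = int(value)
        elif kind == "Pitch":
            pitch = int(value)
        elif kind == "Velocity" and pitch is not None:
            velocity = int(value)
        elif kind == "Duration" and pitch is not None:
            # Duration closes the note; its value is kept as an opaque string.
            notes.append((bar, position, pitch, velocity, value))
            pitch, velocity = None, None
        elif kind == "PitchBend":
            bends.append((bar, position, int(value)))
    return notes, bends


# Made-up token sequence for illustration only.
tokens = [
    "Bar_None",
    "Position_0", "Pitch_60", "Velocity_90", "Duration_1.0.8",
    "Position_4", "PitchBend_2048",
    "Position_8", "Pitch_62", "Velocity_90", "Duration_0.4.8",
]
notes, bends = regroup(tokens)
```

The side list of bends is the part that does not fit the one-token-per-note layout, so a model consuming the regrouped notes would need a separate way to receive it.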