Natooz / MidiTok

MIDI / symbolic music tokenizers for Deep Learning models 🎶
https://miditok.readthedocs.io/
MIT License
692 stars 82 forks source link

Trailing Bars are not appended into REMI token sequence if last note is multibar length #200

Closed Mintas closed 1 week ago

Mintas commented 1 month ago

Trailing Bars are not appended into REMI token sequence if last note is multibar length.

example score has 80 real bars, but only 78 Bar tokens: image

Last Bar in resulting tokenization for example: ['TimeSig_4/4', 'Position_24', 'Program_29', 'Pitch_52', 'Velocity_76', 'Duration_8.0.24', 'Pitch_59', 'Velocity_76', 'Duration_8.0.24', 'Pitch_64', 'Velocity_76', 'Duration_8.0.24', 'Program_33', 'Pitch_28', 'Velocity_76', 'Duration8.0.24', 'Program-1', 'PitchDrum_36', 'Velocity_76', 'Duration_0.3.24', 'PitchDrum_38', 'Velocity_76', 'Duration_0.3.24', 'PitchDrum_52', 'Velocity_76', 'Duration_0.12.24']

Therefore, I assume it would be nice to have an option like 'add_trailing_bars' that will add corresponding 'Bar_None','TimeSig_4/4','Bar_None','TimeSig_4/4' into the end of the sequence. Thus making both sequences of real bar count and produced bar token to have same length.

This may not always be the case, that one really needs those bars, but their presence can help in some applications like detecting anacrusis (pickup measures) and possible problems with their interpretation as Bar or position shift .

Natooz commented 1 month ago

Thank you for the suggestion! Indeed, this could be a nice addition for bar-based tokenizers. I don't think this should be hard to implement. Would you be up to do it, or should we add it to the backlog?

github-actions[bot] commented 1 month ago

This issue is stale because it has been open for 30 days with no activity.

Mintas commented 3 weeks ago

@Natooz Yes, i think I would implement this feature, gonna dig through https://github.com/Natooz/MidiTok/blob/main/CONTRIBUTING.md =) It is really not hard to implement. The real question is whether this option should be implemented for non-REMI tokenizers or not?

Natooz commented 3 weeks ago

Great! I don't think it's necessary to implement it for others, as they handle bars / times and notes altogether already so there is no real way to add trailing bars

Mintas commented 1 week ago

Closing as implemented in https://github.com/Natooz/MidiTok/pull/204