Closed MikeMpapa closed 5 months ago
Hi, thank you for these nice words.
The tokenizer.add_to_vocab method should be what you are looking for. It lets you add custom tokens to the vocabulary. The tokenizers implemented in MidiTok will not use these tokens on their own, however: it is up to you to insert them at the appropriate indexes in the token sequences they produce.
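To make the "add to the vocabulary, then insert it yourself" idea concrete, here is a minimal sketch in plain Python. It deliberately does not call MidiTok: the dict `vocab`, the helpers `add_custom_token` and `insert_tokens`, and token names such as `Genre_Jazz` are all hypothetical stand-ins, so adapt them to whatever your tokenizer actually exposes.

```python
# Stand-in vocabulary mapping token strings to ids (NOT MidiTok's internal structure).
vocab = {
    "PAD_None": 0,
    "BOS_None": 1,
    "Pitch_60": 2,
    "Velocity_100": 3,
    "Duration_1.0.8": 4,
}


def add_custom_token(vocab: dict, token: str) -> int:
    """Register `token` with the next free id if it is new, and return its id."""
    if token not in vocab:
        vocab[token] = len(vocab)
    return vocab[token]


def insert_tokens(ids: list, index: int, new_ids: list) -> list:
    """Return a copy of `ids` with `new_ids` inserted before position `index`."""
    return ids[:index] + list(new_ids) + ids[index:]


# A sequence as a tokenizer might produce it: BOS, pitch, velocity, duration.
ids = [1, 2, 3, 4]

# Add a hypothetical metadata token, then place it right after BOS.
genre_id = add_custom_token(vocab, "Genre_Jazz")
ids_with_metadata = insert_tokens(ids, 1, [genre_id])
```

Placing metadata right after the beginning-of-sequence token is only one possible choice; where the tokens belong depends on how your model is meant to consume them.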
Alternatively, you can subclass one of the tokenizer classes and override the required methods, which may make it easier to inject your custom tokens at the right places.
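The subclassing route could look roughly like the sketch below. `ToyTokenizer` is a hypothetical stand-in base class written only for this example, not a MidiTok class, and its `encode` method and the `metadata` argument are assumptions; with a real tokenizer you would override whichever methods it actually uses to build token sequences.

```python
class ToyTokenizer:
    """Hypothetical stand-in for a tokenizer class (not MidiTok's API)."""

    def __init__(self):
        self.vocab = {"BOS_None": 0, "Pitch_60": 1, "Pitch_64": 2}

    def encode(self, pitches):
        # Produce BOS followed by one pitch token id per note.
        return [self.vocab["BOS_None"]] + [self.vocab[f"Pitch_{p}"] for p in pitches]


class MetadataTokenizer(ToyTokenizer):
    """Subclass that registers extra tokens and injects them while encoding."""

    def __init__(self, metadata_tokens):
        super().__init__()
        # Give each new metadata token the next free id.
        for token in metadata_tokens:
            if token not in self.vocab:
                self.vocab[token] = len(self.vocab)

    def encode(self, pitches, metadata=()):
        ids = super().encode(pitches)
        meta_ids = [self.vocab[token] for token in metadata]
        # Keep BOS first, then the metadata tokens, then the original sequence.
        return ids[:1] + meta_ids + ids[1:]


tokenizer = MetadataTokenizer(["Genre_Jazz", "Tempo_120"])
encoded = tokenizer.encode([60, 64], metadata=["Tempo_120"])
```

Overriding the encoding step keeps the injection logic in one place, so callers do not have to patch every sequence by hand.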
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.
Hi, and thanks for the awesome work! If I want to modify a tokenization scheme and add custom tokens related to MIDI metadata to the vocabulary, how could I do that?