How do I train my own MIDI

dedededefo commented 2 years ago

How do I preprocess MIDI data?

netpi commented 2 years ago

@dedededefo FYI https://github.com/YatingMusic/compound-word-transformer/blob/main/dataset/Dataset.md

dedededefo commented 2 years ago

Thank you, I got it. I'd also like to ask how to train a good model. The results I got after training several times were not satisfactory.

dedededefo commented 2 years ago

Do I need to adjust some parameters?

netpi commented 2 years ago

@dedededefo

To my ear, the music generated by the STEP28500 checkpoint sounds good, but some other people say the music generated by STEP70000 is better.

You can adjust the learning rate (lr) and the number of training steps based on the training and evaluation curves in tf-logs.

dedededefo commented 2 years ago

Thank you for your reply! Thank you.

dedededefo commented 2 years ago

Sorry to bother you again. It seems that the dataset produced by the compound-word-transformer preprocessing cannot be used to train your model.

I'm just getting started with deep learning. Do I need to do further processing to get a JSON dataset suitable as input to your model?

It would help if you could provide a conversion between the two dataset formats, because it looks like compound-word-transformer tokens have 7 fields, while yours have 9.

netpi commented 2 years ago

The official implementation has no Rest, so its tokens have 7 fields.

My version includes Rest, so there are 8.

You can look at the tokenizer here.
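As a quick sanity check on the field-count mismatch, the sketch below counts how many fields each compound token has in a dataset. It assumes the data has already been loaded as a list of sequences, where each sequence is a list of tokens and each token is a list of integer ids. That layout and the name `field_width_histogram` are illustrative assumptions, not this repo's actual JSON schema.

```python
from collections import Counter


def field_width_histogram(sequences):
    """Count how many fields each compound token has across all sequences.

    `sequences` is assumed to be a list of sequences, each a list of tokens,
    each token a list of ids (hypothetical layout for illustration).
    """
    return Counter(len(token) for seq in sequences for token in seq)


if __name__ == "__main__":
    # Toy data: two tokens with 7 fields and one token with 8 fields.
    toy = [
        [[0] * 7, [1] * 7],
        [[2] * 8],
    ]
    for width, count in sorted(field_width_histogram(toy).items()):
        print(f"{count} token(s) with {width} fields")
```

If every token in a dataset reports 7 fields while the model expects 8, the data was most likely produced by a tokenizer without Rest and would need to be re-tokenized rather than fed in directly.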

dedededefo commented 2 years ago

Thank you, teacher!!! Thank you for your patience and for your excellent work! /(ㄒoㄒ)/~~

dedededefo commented 2 years ago

Hello! I ran into the following problems when generating MIDI. I tried modifying the file in miditok, but the error could not be completely resolved.

I also found that your local demo ipynb has a similar error.

dedededefo commented 2 years ago

Hi! I reported the error to miditok, and the author's reply is shown in the figure above. But I'm curious why your pretrained checkpoint ckpt28500 can generate MIDI. When I switched to miditok 1.2.6, I found that the extracted tokens changed.

netpi commented 2 years ago

@dedededefo I think a well-trained model can generate MIDI without errors. A poorly trained model may generate invalid token sequences that MidiTok cannot decode.
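One rough way to put a number on this is to measure what fraction of generated sequences decode without raising an error. The sketch below is library-agnostic: `decode` is any callable you supply (for example, a wrapper around your tokenizer's tokens-to-MIDI call), and `toy_decode` is a hypothetical stand-in for illustration, not a MidiTok API.

```python
def decode_success_rate(sequences, decode):
    """Return the fraction of sequences that `decode` accepts without raising."""
    if not sequences:
        return 0.0
    ok = 0
    for seq in sequences:
        try:
            decode(seq)
        except Exception:
            # Treat any decoding error as an invalid generation.
            continue
        ok += 1
    return ok / len(sequences)


def toy_decode(seq):
    """Hypothetical decoder: rejects any sequence containing a negative id."""
    if any(tok < 0 for tok in seq):
        raise ValueError("invalid token id")
    return seq


if __name__ == "__main__":
    generated = [[1, 2, 3], [4, -1, 6], [7, 8, 9]]
    print(decode_success_rate(generated, toy_decode))
```

A rate that rises across checkpoints suggests the model is learning valid token structure, although it says nothing about how good the music sounds.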

dedededefo commented 2 years ago

Thank you for your reply and patience! So how can I judge whether my training is effective? Is it just a matter of whether MIDI can be generated? I also think the model I trained is not good. This is the first transformer model I have tried to train!

netpi commented 2 years ago

You can find the wrong tokens and mask them when generating, like this.
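A minimal sketch of that idea, not necessarily the code referred to above: set the logits of tokens known to be invalid to negative infinity before sampling, so they receive zero probability. The name `sample_with_mask` and the toy logits are assumptions for illustration. A real compound-word model would apply this separately to each token field.

```python
import math
import random


def sample_with_mask(logits, banned_ids, rng=random):
    """Sample a token id from `logits`, never returning an id in `banned_ids`."""
    neg_inf = float("-inf")
    masked = [neg_inf if i in banned_ids else x for i, x in enumerate(logits)]
    top = max(masked)
    if top == neg_inf:
        raise ValueError("every token is banned")
    # Softmax numerator, shifted by the max for stability; exp(-inf) is 0.0,
    # so banned ids get zero weight.
    weights = [math.exp(x - top) for x in masked]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [2.0, 5.0, 1.0, 0.5]  # id 1 is the most likely token
    banned = {1}                   # pretend id 1 cannot be decoded here
    print([sample_with_mask(logits, banned, rng) for _ in range(10)])
```

Masking only hides the symptom. If the model frequently puts high probability on invalid tokens, more training or a lower learning rate is probably still needed.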

dedededefo commented 2 years ago

Hello! The model I got after setting the number of training epochs to 5 can generate MIDI, which I think is related to the accuracy and loss. Thank you for your reply and help!

dedededefo commented 2 years ago

But the quality of the generated music is not very good! Hahaha

netpi commented 2 years ago

@dedededefo

Congratulations! You did it.

I'll close this issue. You can open a new one if needed.

netpi / compound-word-transformer-tensorflow

How do I train my own MIDI #2