adrienchaton opened this issue 4 years ago
Hello, try this checkpoint which was trained for 500,000 iterations on the MAESTRO dataset.
I haven't tried fine-tuning on these models, but it should be theoretically possible. I expect that it'd require a careful hyperparameter search (epochs, learning rate and betas of Adam, etc.)
Regarding velocity, I believe it's the same as the MIDI scale; just that most MIDI files in the dataset don't have too high velocity. AFAIK the original paper tried some heuristic for normalizing the velocity but I didn't reimplement that. You can check out MIDI preprocessing code here.
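For illustration only (this is not the paper's heuristic, which is not reimplemented in this repo), a trivial way to normalize MIDI velocities would be:

```python
# Minimal sketch: MIDI velocities are in the range 0-127, so dividing
# by 127 maps them to [0, 1]. This is NOT the normalization heuristic
# from the original paper.
def normalize_velocity(velocity, max_velocity=127):
    return velocity / max_velocity
```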
Thank you very much. The checkpoint is helpful, as I want to compare retraining from scratch against fine-tuning from the MAESTRO checkpoint! And indeed, your MIDI code seems to follow the standard MIDI range. Your data loader is nicely done too; I could easily adapt it to another dataset based on the MAPS class.
Hi Jongwook, may I know whether there is any requirement on the torch version to use this checkpoint? I ran into an issue when trying it out: torch.nn.modules.module.ModuleAttributeError: 'LSTM' object has no attribute '_flat_weights'
My torch version is 1.7.1
Thanks
I am running the code with torch==1.3.0. There would be some (minor) things to update for running on higher versions.
Thanks. I think torch==1.3.0 is not available for Python 3.7. I got many more errors with Python 3.6 and torch==1.3.0. May I know what the minor things are that need updating to run on a higher version?
You'd have to update error by error, but it should not be a lot, since PyTorch has not changed much in the basic functions used here. One more complicated issue you may have is loading the pretrained model weights into the updated class.
I could install both 1.2.0 and 1.3.0 for Python 3.7.9; I think pip still lets you install older versions, or you'd have to compile it yourself.
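As a sketch of that weight transfer (TinyModel below is a stand-in for the repo's actual model class): the released checkpoint is a pickled nn.Module, and unpickling it on a newer torch can hit attribute mismatches such as the '_flat_weights' LSTM error. A common workaround is to load it once, keep only the state_dict, and load that into a model constructed with the current code.

```python
import torch
import torch.nn as nn

# TinyModel is a stand-in for the repo's model class.
class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(4, 8, batch_first=True)

torch.save(TinyModel(), 'pickled_module.pt')  # how the checkpoint was saved

# weights_only=False is needed on recent torch to unpickle a full module;
# versions before 1.13 do not accept this argument.
old = torch.load('pickled_module.pt', map_location='cpu', weights_only=False)
torch.save(old.state_dict(), 'state_only.pt')  # version-robust form

fresh = TinyModel()  # constructed with the current code
fresh.load_state_dict(torch.load('state_only.pt', weights_only=False))
```

The state_dict route avoids depending on the pickled module's internal attributes, which is what changes between torch versions.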
Noted, thanks for the info!
Have you come across this error while loading the MAPS dataset during evaluation?
RuntimeError: data/MAPS/flac/MAPS_MUS-bk_xmas1_ENSTDkAm.pt is a zip archive (did you mean to use torch.jit.load()?)
No, I did not use either MAESTRO or MAPS.
I wanted to fine-tune on a custom dataset, so I made a class similar to MAPS(PianoRollAudioDataset) which reads flac audio and tsv annotations put in the same format as the MAPS dataset.
The tsv annotations use the column format 'onset, offset, note, velocity', so it's easy to convert any dataset you'd get in order to train with that dataset class.
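For illustration, a .tsv in that 'onset, offset, note, velocity' column format can be written with the standard library (the note events below are made up):

```python
import csv

# Hypothetical note events: (onset sec, offset sec, MIDI note, velocity).
notes = [(0.50, 1.20, 60, 80),
         (1.30, 2.10, 64, 72)]

# One row per note, tab-separated, matching the MAPS-style annotations.
with open('example.tsv', 'w', newline='') as f:
    writer = csv.writer(f, delimiter='\t')
    for onset, offset, pitch, velocity in notes:
        writer.writerow([onset, offset, pitch, velocity])
```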
If you cannot load the prepared MAPS data (which I did not use), you could download it from elsewhere, format it as flac/tsv, and load it with the dataset class.
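The "zip archive" RuntimeError above typically means the cached .pt files were written by torch >= 1.6 (which defaults to a zip container) and then read by torch 1.3. A sketch of one workaround, run on the newer torch (the file path here is illustrative):

```python
import torch

# Re-save a cached tensor file in the legacy serialization format so an
# older torch (e.g. 1.3) can read it. Alternatively, just delete the
# cached .pt files and let the dataset class regenerate them.
path = 'cached_example.pt'
torch.save(torch.randn(3), path)               # zip format (torch >= 1.6 default)
data = torch.load(path, weights_only=False)    # readable on the newer torch
torch.save(data, path, _use_new_zipfile_serialization=False)  # legacy format
```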
Thanks @adrienchaton.
@jongwook Could you provide the checkpoint after 1 million steps as well?
The metrics of this model are not very good:
note precision                          : 0.809 ± 0.111
note recall                             : 0.760 ± 0.110
note f1                                 : 0.782 ± 0.106
note overlap                            : 0.554 ± 0.105
note-with-offsets precision             : 0.378 ± 0.135
note-with-offsets recall                : 0.356 ± 0.133
note-with-offsets f1                    : 0.366 ± 0.133
note-with-offsets overlap               : 0.817 ± 0.084
note-with-velocity precision            : 0.738 ± 0.112
note-with-velocity recall               : 0.694 ± 0.113
note-with-velocity f1                   : 0.714 ± 0.109
note-with-velocity overlap              : 0.557 ± 0.107
note-with-offsets-and-velocity precision: 0.350 ± 0.130
note-with-offsets-and-velocity recall   : 0.329 ± 0.129
note-with-offsets-and-velocity f1       : 0.339 ± 0.129
note-with-offsets-and-velocity overlap  : 0.816 ± 0.084
frame f1                                : 0.651 ± 0.112
frame precision                         : 0.639 ± 0.166
frame recall                            : 0.694 ± 0.087
frame accuracy                          : 0.492 ± 0.121
frame substitution_error                : 0.102 ± 0.052
frame miss_error                        : 0.204 ± 0.085
frame false_alarm_error                 : 0.393 ± 0.386
frame total_error                       : 0.698 ± 0.386
frame chroma_precision                  : 0.673 ± 0.162
frame chroma_recall                     : 0.735 ± 0.083
frame chroma_accuracy                   : 0.532 ± 0.112
frame chroma_substitution_error         : 0.061 ± 0.029
frame chroma_miss_error                 : 0.204 ± 0.085
frame chroma_false_alarm_error          : 0.393 ± 0.386
Hello Jong Wook,
I would like to experiment fine-tuning Onsets and Frames on a custom dataset with your PyTorch implementation.
For that, may I ask: is there a pretrained model checkpoint available for your implementation?
Then I would format the custom dataset I would like to fine-tune on like the MAPS example:
- one folder of .flac audio inputs, 16 kHz mono
- one folder of matched .tsv annotation targets (col 1: onset in sec., col 2: offset in sec., col 3: note, col 4: velocity)

so that it can be read with PianoRollAudioDataset and used to continue training from a previous checkpoint.
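A sketch of that adaptation follows. PianoRollAudioDataset here is a minimal stand-in, since the real base class in the repo also handles audio and label loading; the flac/tsv folder layout is an assumption:

```python
import os
from glob import glob

# Stand-in for the repo's PianoRollAudioDataset; only the pieces needed
# to illustrate files() are reproduced here.
class PianoRollAudioDataset:
    def __init__(self, path, groups):
        self.path = path
        self.groups = groups

class CustomDataset(PianoRollAudioDataset):
    @classmethod
    def available_groups(cls):
        return ['train', 'validation']

    def files(self, group):
        # Pair each .flac with its same-named .tsv annotation file.
        flacs = sorted(glob(os.path.join(self.path, 'flac', '*.flac')))
        tsvs = sorted(glob(os.path.join(self.path, 'tsv', '*.tsv')))
        assert len(flacs) == len(tsvs), 'each .flac needs a matching .tsv'
        return list(zip(flacs, tsvs))
```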
One more thing I would ask for confirmation on regarding the annotation files: the 3rd (note) column should be MIDI pitch (in the range of the 88 piano keys), and the 4th (velocity) column should be scaled in which range? (It doesn't seem to go up to 127 like a MIDI velocity.)
Thanks for porting the model to PyTorch!