Open anthonio9 opened 9 months ago
Well, the new config is almost ready for training. A few problems are left for tomorrow: there are still no metrics, and the offset is calculated on the CPU instead of the GPU. Fix it!
So far I've been trying to teach the network discrete MIDI notes. It seems to work rather well; the results so far are below:
It seems that a lower number of logits does not necessarily mean better accuracy.
Take a look here to resolve the problem with images not being logged to wandb: https://github.com/wandb/wandb/issues/1252
This issue is about the introduction of a new network model, where the logits layer has 6*60*2 MIDI bins instead of 6*1440 pitch bins. Each of the 60 values in the first column corresponds to a separate MIDI note. The second column tells how far the pitch deviates from that MIDI note, with 0 being -50 cents and 59 being +50 cents.
Conversion to this new format is done with
penn.data.preprocess.core.note_dict_to_pitch_dict60()
where 60 obviously stands for the number of MIDI notes.
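To make the two-column format concrete, here is a minimal sketch of how a single pitch could be mapped to a (note bin, deviation bin) pair and back. This is not the code of `note_dict_to_pitch_dict60()`; the helper names `encode_pitch` and `decode_pitch`, the lowest note `MIDI_MIN = 40`, and the linear spacing of the 60 deviation bins between -50 and +50 cents are all assumptions made for illustration.

```python
import math

NUM_NOTES = 60     # number of MIDI notes, as stated in the issue
NUM_DEV_BINS = 60  # deviation bins 0..59, as stated in the issue
MIDI_MIN = 40      # ASSUMPTION: lowest covered MIDI note (hypothetical value)


def encode_pitch(freq_hz):
    """Map a frequency in Hz to (note_bin, dev_bin). Hypothetical helper."""
    # Standard Hz -> fractional MIDI conversion (A4 = 440 Hz = MIDI 69).
    midi = 69.0 + 12.0 * math.log2(freq_hz / 440.0)
    nearest = math.floor(midi + 0.5)   # nearest MIDI note
    cents = (midi - nearest) * 100.0   # deviation in [-50, +50) cents
    note_bin = nearest - MIDI_MIN
    if not 0 <= note_bin < NUM_NOTES:
        raise ValueError(f"{freq_hz} Hz is outside the covered note range")
    # ASSUMPTION: bins are spaced linearly, bin 0 = -50 cents, bin 59 = +50 cents.
    dev_bin = round((cents + 50.0) / 100.0 * (NUM_DEV_BINS - 1))
    return note_bin, dev_bin


def decode_pitch(note_bin, dev_bin):
    """Inverse of encode_pitch, up to quantization. Hypothetical helper."""
    cents = dev_bin / (NUM_DEV_BINS - 1) * 100.0 - 50.0
    midi = MIDI_MIN + note_bin + cents / 100.0
    return 440.0 * 2.0 ** ((midi - 69.0) / 12.0)


if __name__ == "__main__":
    for f in (82.41, 196.0, 440.0):
        n, d = encode_pitch(f)
        print(f, "->", (n, d), "->", round(decode_pitch(n, d), 2))
```

Under these assumptions the deviation bins are about 1.7 cents apart, so a round trip through the format should change the pitch by less than one cent.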