bmcfee / ismir2017_chords

ISMIR 2017: structured training for large vocab chord recognition
BSD 2-Clause "Simplified" License

Inconsistent confusion matrix #3

Open instr3 opened 5 years ago

instr3 commented 5 years ago

We ran through the file 'notebooks/04 - Confusions.ipynb' without any modification except for the absolute data file path, and we got the following confusion matrix for the pretrained CR2+S+A model on the 1217 dataset:

[confusion matrix screenshot from our run]

The diagonal classification accuracy for aug, dim7, and maj7 chords is much lower than the values reported in the paper:

[confusion matrix screenshot from the paper]

We executed the notebook up to this line to get the confusion matrix:

[notebook cell screenshot]
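(For reference, a row-normalized quality confusion matrix of this kind can be computed roughly along these lines; this is a minimal sketch, not the notebook's actual code, and `ref_qualities`, `est_qualities`, and the quality list are placeholder names:)

```python
# Minimal sketch (not the notebook's code) of a row-normalized quality
# confusion matrix. `ref_qualities` / `est_qualities` would be per-frame
# chord-quality labels; the quality list here is only illustrative.
import numpy as np
from sklearn.metrics import confusion_matrix

QUALITIES = ['maj', 'min', 'dim', 'aug', 'min6', 'maj6', 'min7',
             'maj7', '7', 'dim7', 'hdim7', 'minmaj7', 'sus2', 'sus4', 'N']

def quality_confusion(ref_qualities, est_qualities, labels=QUALITIES):
    # Rows: reference quality, columns: estimated quality.
    # Each row is normalized so the diagonal gives per-class recall.
    C = confusion_matrix(ref_qualities, est_qualities, labels=labels)
    return C / np.maximum(C.sum(axis=1, keepdims=True), 1)
```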

Thank you very much!

bmcfee commented 5 years ago

Thanks for looking into that. I'm not sure what's going on there, but I think it's not a problem with the plotting code. The confusion matrix you're showing doesn't seem random in the sense that the types of errors are consistent with what you would generally expect: bias toward simplification.

As a sanity check, you may also want to double-check the root confusion plot, to make sure it's not doing anything obviously wrong.

It might help if you could explain in more detail where exactly your data is coming from. By "pre-trained model", are you referring to the crema package, or the saved weights per-fold contained in this repo?

instr3 commented 5 years ago

Hi Brian,

Thank you for your fast response!

  1. By pre-trained model I mean the saved weights per-fold contained in this repo.
  2. The 1217 dataset was acquired from you and Juan, originally from NYU HPC.
  3. The root confusion matrix seems okay to me: [root confusion matrix screenshot]

It looks like the same pattern to me: the model output is biased toward simple chords more than expected.

I noticed that there is a file 02 - Chord model prototype - weighted.ipynb that performs weighted training to address the class imbalance problem, but no pre-trained model is provided and I have not yet tried to train one. Could it be the case that the confusion matrix actually comes from the model that uses weighted classes for training? (It would make sense to me, as we also adopted a similar solution.)
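(For context, a generic sketch of class-weighted training in Keras, not necessarily what that notebook does; `y_train`, `X_train`, and `model` below are placeholder names:)

```python
# Generic sketch of class-weighted training in Keras (not necessarily the
# approach taken in the weighted notebook). `y_train` holds integer class
# ids and is a placeholder name.
import numpy as np

def inverse_frequency_weights(y_train):
    counts = np.bincount(y_train)
    # Rarer classes receive proportionally larger weights.
    return {c: counts.sum() / (len(counts) * max(n, 1))
            for c, n in enumerate(counts)}

# class_weight = inverse_frequency_weights(y_train)
# model.fit(X_train, y_train, class_weight=class_weight)
```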

bmcfee commented 5 years ago

Thanks for checking up on it. I'm really not sure what's going on there-- the results should be identical. Are you sure this is loading the +augmentation results?

> Could it be the case that the confusion matrix actually comes from the model that uses weighted classes for training?

No -- that notebook was a prototype that we eventually abandoned prior to the final paper.

instr3 commented 5 years ago

Hi Brian,

Sorry for the late reply.

I reran the notebook in a clean environment and the problem persists. I double-checked that the model is the one with +augmentation.

I will provide more detailed information below. In the notebook 04 - Confusions.ipynb, I checked the diagonal sums of the T_CR2 and T_CR2s matrices, which correspond to model_deep_aug and model_deep_struct_aug respectively.

These are the values I got by rerunning the notebook:

[screenshot: diagonal sums from our rerun]

These are pre-calculated values left in the notebook:

[screenshot: pre-calculated diagonal sums]

The diagonal sum of the T_CR2 matrix is largely consistent with the pre-calculated one, but the diagonal sum of T_CR2s is very different.
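(For clarity, the diagonal-sum check above amounts to taking the trace of each confusion matrix; a trivial sketch, where T_CR2 / T_CR2s are the matrices produced earlier in the notebook:)

```python
# The diagonal-sum check is just the trace of each confusion matrix.
# T_CR2 and T_CR2s are the matrices computed earlier in the notebook.
import numpy as np

def diagonal_sum(T):
    return np.trace(np.asarray(T))

# print(diagonal_sum(T_CR2), diagonal_sum(T_CR2s))
```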

instr3 commented 5 years ago

Hi Brian,

After some investigation, we found an earlier version of this repo whose model is consistent with the results in the paper:

https://github.com/bmcfee/ismir2017_chords/tree/c9c2bfd0e7e5feba7ad2605522487666fa610a43/data/model_deep_struct_aug

We are using this version of the model now. Everything seems to work fine, and the confusion matrix looks great:

[quality confusion matrix screenshot]

However, it is still unclear to me what the difference is between the up-to-date version of the model and the earlier version.
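(One way to pin down the difference would be to diff the saved weights between the two commits directly; a sketch assuming both versions are Keras HDF5 weight files, with hypothetical paths:)

```python
# Sketch: compare saved weights between the old and current commits.
# Assumes Keras HDF5 weight files; the paths below are hypothetical.
import h5py
import numpy as np

def collect_arrays(path):
    arrays = {}
    def visit(name, obj):
        if isinstance(obj, h5py.Dataset):
            arrays[name] = obj[()]
    with h5py.File(path, 'r') as f:
        f.visititems(visit)
    return arrays

old = collect_arrays('old/model_deep_struct_aug/weights.h5')  # hypothetical path
new = collect_arrays('new/model_deep_struct_aug/weights.h5')  # hypothetical path

for name in sorted(set(old) | set(new)):
    if name not in old or name not in new:
        print('only in one version:', name)
    elif old[name].shape != new[name].shape or not np.allclose(old[name], new[name]):
        print('differs:', name)
```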

bmcfee commented 5 years ago

Ok, that's a relief! I'm not sure what the difference is here either, but I'm glad you got it sorted out for now. I'll have time to dig into this in more detail in a few weeks, once the deadlines settle down. I hope this stuff is still useful to you!