Closed — Asciotti closed this issue 5 years ago
Thanks for the suggestion and the detailed PR description with images etc.
I can see the advantage of using the normalization. It is a good idea and I learned something here. But the downside is that the normalized plot now makes it look like there is huge confusion. I would probably have to update the surrounding text to explain this, and then I would probably have to update the other tutorials that also use a confusion matrix. So it's maybe not as simple as just changing a few lines of code.
So I'm thinking about just leaving it as it is. I have closed this issue for now. But thanks again.
That is fair, given the other contextual information that would need to change: YouTube videos and other future usages of the confusion matrix. I plan on going through all of the examples/videos, so possibly I can expand the PR in the future to incorporate those contexts and explicitly state that the normalization is for visual guidance only.
I think the confusion matrix is designed as a tool to help scientists debug their classifiers, so the ability to easily spot poorly performing classes (normalized) should be able to coexist with a way to easily see the magnitude of those errors (non-normalized). It's just a flaw of trying to compress so much information into so few dimensions.
I think we should just leave it as it is. The main focus of the tutorials is on TensorFlow, so we should be careful not to include too many complicated topics. It will just confuse people even more. But I do appreciate your suggestion and eagerness to help out.
Without normalizing the data prior to plotting, the differences are difficult to see. The only caveat of normalizing is that the colormap labels then obfuscate the true counts being plotted. There may be some tricks that can be done by manually changing the colormap object, similar example here.
[Image: confusion matrix before normalization]
[Image: confusion matrix after normalization]
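For reference, the row normalization discussed above can be sketched in a few lines of NumPy. This is a hedged sketch, not the tutorial's actual code; the helper name `normalize_confusion_matrix` is hypothetical. Each row (true class) is divided by its total, so every cell becomes the fraction of that class's samples assigned to each predicted class, which makes a rare class's errors as visible as a frequent class's:

```python
import numpy as np

def normalize_confusion_matrix(cm):
    """Row-normalize a confusion matrix so each row sums to 1.

    Rows are true classes. Hypothetical helper, not from the tutorial.
    """
    cm = np.asarray(cm, dtype=float)
    row_sums = cm.sum(axis=1, keepdims=True)
    # Guard against division by zero for classes with no samples.
    return np.divide(cm, row_sums,
                     out=np.zeros_like(cm), where=row_sums != 0)

# Raw counts: the frequent class (row 0) visually drowns out the rare one.
cm = np.array([[950, 50],
               [ 10, 40]])
cm_norm = normalize_confusion_matrix(cm)
# cm_norm rows now sum to 1, so color scales compare classes fairly.
```

One way to let both views coexist, as suggested above, is to use the normalized matrix for the heatmap colors while annotating each cell with the raw count (e.g. `plt.imshow(cm_norm)` followed by a `plt.text` call per cell with `cm[i, j]`), so the magnitudes are not lost.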