Closed — Asciotti closed this issue 5 years ago
Thanks for the suggestion and the detailed PR description with images etc.
I can see the advantage of using the normalization. It is a good idea and I learned something here. But the downside is that the normalized plot now makes it look like there is huge confusion. I would probably have to update the surrounding text to explain this, and then I would probably have to update the other tutorials that also use a confusion matrix. So it's maybe not as simple as just changing a few lines of code.
So I'm thinking about just leaving it as it is. I have closed this issue for now. But thanks again.
That is fair, given the other contextual information that would need to change: YouTube videos and other future usages of the confusion matrix. I plan on going through all of the examples/videos, so possibly I can expand the PR in the future to incorporate those contexts and explicitly state that the normalization is for visual guidance only.
I think the confusion matrix is designed as a tool to help scientists debug their classifiers, so the ability to easily spot poorly performing classes (normalized) should be able to coexist with a way to easily see the magnitude of those errors (non-normalized). It's just a flaw of trying to compress so much information into so few dimensions.
I think we should just leave it as it is. The main focus of the tutorials is on TensorFlow, so we should be careful not to include too many complicated topics. It will just confuse people even more. But I do appreciate your suggestion and eagerness to help out.
Without normalizing the data prior to plotting, the differences are difficult to see. The only caveat of normalizing is that the colormap labels then obfuscate the true counts being plotted. There may be some tricks that can be done by manually changing the colormap object, similar example here.
[Image: confusion matrix before normalization]
[Image: confusion matrix after normalization]
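For reference, the row normalization discussed above can be sketched in a few lines of NumPy. This is a hedged sketch, not the tutorial's actual code; the helper name `normalize_confusion_matrix` is hypothetical. Each row (true class) is divided by its total, so every cell becomes the fraction of that class's samples assigned to each predicted class, which makes a rare class's errors as visible as a frequent class's:

```python
import numpy as np

def normalize_confusion_matrix(cm):
    """Row-normalize a confusion matrix so each row sums to 1.

    Rows are true classes. Hypothetical helper, not from the tutorial.
    """
    cm = np.asarray(cm, dtype=float)
    row_sums = cm.sum(axis=1, keepdims=True)
    # Guard against division by zero for classes with no samples.
    return np.divide(cm, row_sums,
                     out=np.zeros_like(cm), where=row_sums != 0)

# Raw counts: the frequent class (row 0) visually drowns out the rare one.
cm = np.array([[950, 50],
               [ 10, 40]])
cm_norm = normalize_confusion_matrix(cm)
# cm_norm rows now sum to 1, so color scales compare classes fairly.
```

One way to let both views coexist, as suggested above, is to use the normalized matrix for the heatmap colors while annotating each cell with the raw count (e.g. `plt.imshow(cm_norm)` followed by a `plt.text` call per cell with `cm[i, j]`), so the magnitudes are not lost.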