Open thinkh opened 6 years ago
@NMO13 please add a sketch
also see sketch on slide 46 in https://docs.google.com/presentation/d/1G5E-HJ_PRDsJ90YH6vd2Kf4L6wR-_SjkRns3BUD8u5g/edit#slide=id.g3314dfedfb_0_0
@NMO13 You can start with dummy data until the API returns real softmax data. The API is defined in the swagger.yml.
@NMO13 If possible, use the Phovea heatmap. You can see the usage of the heatmap in TACO.
I'm aware that we limited the scope of malevo to the 10-class problem, but the softmax stamp approach won't scale to settings where we have (many) more than 10 classes. I doubt that we'll be able to see anything in such large stamps, so maybe we should already think about other marks.
Aside from the perceptual issues, we're already running into scalability problems on the backend. I set up a first database with the softmax values in long format for a single run and it is already around 2 GB (considering only every 5th epoch). With more epochs or more classes, the database size will explode.
The question now is: do we really need the softmax values, or can we get away with using only the loss value for each instance (e.g. cross entropy) and visualizing how the loss per instance changes over the training?
The big advantage would be that it's more general, as it will work with any type of loss function, and we would need far less data (only one tenth for CIFAR-10, and even less for datasets with more classes).
By looking directly at the loss we could even detect instances that mess up the learning during training.
Essentially, we should focus on encodings that allow us to detect outlier instances.
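To make the data reduction concrete, here is a minimal sketch of the idea, assuming the softmax output and the true label are available per instance. The function names, the dummy values, and the simple mean-plus-k-standard-deviations outlier rule are illustrative assumptions, not part of the malevo API.

```python
import math

def per_instance_cross_entropy(softmax_rows, labels, eps=1e-12):
    """Return one loss per instance: -log(probability of the true class)."""
    return [-math.log(max(row[label], eps))
            for row, label in zip(softmax_rows, labels)]

def flag_outliers(losses, k=1.5):
    """Return indices whose loss exceeds mean + k * (population) std.

    This threshold rule is only an assumption for illustration.
    """
    mean = sum(losses) / len(losses)
    std = math.sqrt(sum((l - mean) ** 2 for l in losses) / len(losses))
    return [i for i, l in enumerate(losses) if l > mean + k * std]

# Dummy softmax data for one epoch: 5 instances, 3 classes (made-up values).
softmax_rows = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.05, 0.05, 0.90],
    [0.85, 0.10, 0.05],
    [0.01, 0.98, 0.01],  # true class is 0, so this one is confidently wrong
]
labels = [0, 1, 2, 0, 0]

losses = per_instance_cross_entropy(softmax_rows, labels)
outliers = flag_outliers(losses)

# Values to store per epoch: instances * classes for softmax, instances for loss.
softmax_values = len(softmax_rows) * len(softmax_rows[0])
loss_values = len(losses)
```

With C classes, the loss needs one value per instance and epoch instead of C, which is where the "one tenth for CIFAR-10" estimate comes from.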
Any thoughts?
Looking at the loss value instead of the softmax sounds reasonable. @HendrikStrobelt, what do you think?
The softmax stamp won't work out of the box if we aggregate multiple classes into groups (it also doesn't make much sense for groups of classes).
It's also not immediately clear how we could aggregate the softmax stamps for groups. Some possibilities I thought of:
We could also pursue a multi-level approach: first visualize the "between-group softmax" and then zoom into the "within-group softmax".
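One possible reading of this multi-level approach, sketched under assumptions: the between-group value is the sum of the class probabilities in a group, and the within-group value is the class probabilities renormalized inside that group. The group names, class indices, and probabilities below are hypothetical.

```python
def between_group_softmax(probs, groups):
    """Sum the class probabilities of each group.

    The result sums to 1 over all groups if every class is in exactly one group.
    """
    return {name: sum(probs[i] for i in idx) for name, idx in groups.items()}

def within_group_softmax(probs, idx):
    """Renormalize the probabilities of one group's classes so they sum to 1."""
    total = sum(probs[i] for i in idx)
    if total == 0:
        # No probability mass in this group; nothing to distribute.
        return [0.0] * len(idx)
    return [probs[i] / total for i in idx]

# Hypothetical softmax output for one instance with 6 classes.
probs = [0.5, 0.2, 0.1, 0.1, 0.05, 0.05]
# Hypothetical grouping of class indices.
groups = {"animals": [0, 1, 2], "vehicles": [3, 4, 5]}

between = between_group_softmax(probs, groups)           # overview level
within_animals = within_group_softmax(probs, groups["animals"])  # zoomed level
```

The overview stamp would then have one cell per group, and zooming into a group would show a stamp with one cell per class in that group.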
But it would be better to come up with a visualization which supports groups and classes natively.
That's true, but I think we have the same kind of problem for the other detail views as well. We have to redesign the detail views for hierarchical classes and for problems with more than 10 classes. I will create an issue for that.
To discuss: How to change all detail views to apply the hierarchical structure?
Out of scope for the paper. Hence, moved to icebox.