TeamHG-Memex / eli5

A library for debugging/inspecting machine learning classifiers and explaining their predictions
http://eli5.readthedocs.io
MIT License

html features: preserve whitespaces #31

Closed kmike closed 7 years ago

kmike commented 7 years ago

Features with whitespaces in front get these whitespaces removed in HTML.

Compare:

+2.837  spa 
+2.805   spa

and

[screenshot: 2016-10-21 16 59 04]

I think whitespaces should be replaced with &nbsp; for HTML display. It could also make sense to use a different background for the text, in order to show whitespace at the end.
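A minimal sketch of the &nbsp; idea, for illustration only. The function name `format_feature_html` is hypothetical and is not taken from eli5; it assumes feature names are short enough that disabling line wrapping inside them is acceptable.

```python
import html

NBSP = "&nbsp;"


def format_feature_html(name: str) -> str:
    """Escape a feature name and keep its spaces visible in HTML.

    Hypothetical helper, not part of eli5.
    """
    # Escape first: html.escape() leaves plain spaces untouched,
    # so the "&" in "&nbsp;" added below is not escaped again.
    escaped = html.escape(name)
    # Browsers collapse runs of ordinary spaces and drop leading ones;
    # non-breaking spaces are rendered one by one.
    return escaped.replace(" ", NBSP)


for feature in ["  spa", "spa ", "a<b"]:
    print(repr(feature), "->", format_feature_html(feature))
```

This keeps leading whitespace, but a trailing &nbsp; is still invisible unless the background differs from the page, which is the second point raised above.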

lopuhin commented 7 years ago

I tried some approaches for making whitespace at the end visible: changing the background, adding a border, replacing it with a single or double underscore (https://en.wikipedia.org/wiki/Underscore), a white or black square, etc. In the end, the solution I like the most is to just put features that have a space at the start or at the end in double quotes. It's also possible to apply this only to features with a space at the end (a space at the start is visible from the indentation level), but for some reason applying the rule to spaces at both the start and the end feels more natural. Here is how it looks: [screenshot: 2016-10-21 19 34 37]

kmike commented 7 years ago

The main problem with " is that it can be a part of a feature name, so feature names become ambiguous. Are there unambiguous options which look slightly worse, or do they all look much worse?
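To make the ambiguity concrete, here is a small sketch. `quote_if_padded` is a hypothetical stand-in for the quoting rule proposed above, not code from the branch: two different feature names end up with the same displayed string.

```python
def quote_if_padded(name: str) -> str:
    """Wrap a feature name in double quotes if it starts or ends with a space.

    Hypothetical illustration of the quoting proposal.
    """
    if name != name.strip(" "):
        return '"{}"'.format(name)
    return name


padded = " spa"        # feature with a leading space
literal = '" spa"'     # feature whose name really contains the quotes

# Both features are displayed identically, so the reader cannot tell them apart.
print(quote_if_padded(padded))
print(quote_if_padded(literal))
print(quote_if_padded(padded) == quote_if_padded(literal))
```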

lopuhin commented 7 years ago

This was my second best, a double underscore; I almost went with it before thinking of quoting: [screenshot: 2016-10-21 19 23 38] But it's also not strictly unambiguous (we could also just use fancier quotes).

Let me think of something reasonably looking and really unambiguous.

lopuhin commented 7 years ago

Other options are adding a slight border or background. They don't look terribly bad, but what I dislike is that all feature names become affected (I also use an em space here because it is larger and makes the gap more obvious). [screenshot: 2016-10-21 20 14 49] [screenshot: 2016-10-21 20 16 26]

lopuhin commented 7 years ago

> I dislike is that all feature names become affected

I mean that it would look strange if we just apply these styles to items with space in them - it would seem that they are more important. But if we apply these styles only if there are some features with spaces, then it seems like a reasonable compromise.

kmike commented 7 years ago

what about highlighting only whitespaces, not features themselves?

kmike commented 7 years ago

Inspiration:

lopuhin commented 7 years ago

Thanks for the inspiration @kmike ! In the editor context I liked dots as spaces better, but in our case I think just highlighting spaces looks better. I use just two colors here: darker green for positive and darker red for negative. I tried adjusting highlighting color according to the current weight color (making a darker version of weight color instead of using a constant color), but it looked worse. Here is a notebook with new space highlighting scheme: https://github.com/TeamHG-Memex/eli5/blob/visible-space/notebooks/explain_text_prediction_char.ipynb
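A rough sketch of highlighting only the spaces, with one constant color per weight sign as described above. The helper name and the concrete color values are assumptions made for illustration; the actual implementation lives in the linked branch and may differ.

```python
import html
import re

# Hypothetical constant colors: a darker green and a darker red.
POS_SPACE_BG = "hsl(120, 70%, 40%)"
NEG_SPACE_BG = "hsl(0, 70%, 40%)"


def highlight_spaces(name: str, weight: float) -> str:
    """Wrap each run of spaces in a colored <span>; leave other text as is."""
    bg = POS_SPACE_BG if weight >= 0 else NEG_SPACE_BG

    def wrap(match):
        # One &nbsp; per original space so the width is preserved.
        spaces = "&nbsp;" * len(match.group(0))
        return '<span style="background-color: {}">{}</span>'.format(bg, spaces)

    # Escape first; html.escape() does not change spaces.
    return re.sub(r" +", wrap, html.escape(name))


print(highlight_spaces("spa ", 2.837))
print(highlight_spaces("  spa", -1.5))
```

Only the whitespace gets the extra styling, so features without spaces look the same as before.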

kmike commented 7 years ago

Yeah, I agree highlighting looks better. What do you think about making it less bright, e.g. using S=70% and L=70%? [screenshot: 2016-10-24 14 26 17]

If the letter next to a whitespace is l or another similar letter, it is a bit hard to distinguish with dark highlighting:

[screenshot: 2016-10-24 14 27 31]
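For experimenting with the S=70%, L=70% suggestion, a small sketch that builds the color both as a CSS `hsl()` string and as a hex value. The hues (120 for green, 0 for red) and the function name are assumptions; only the saturation and lightness values come from the comment above.

```python
import colorsys


def space_highlight_color(weight: float, saturation: float = 0.7,
                          lightness: float = 0.7):
    """Return (css_hsl, hex_rgb) for a whitespace highlight.

    Hypothetical helper: green hue for positive weights, red for negative.
    """
    hue = 120 if weight >= 0 else 0
    css = "hsl({}, {:.0f}%, {:.0f}%)".format(hue, saturation * 100, lightness * 100)
    # Note the argument order: colorsys uses HLS, not HSL.
    r, g, b = colorsys.hls_to_rgb(hue / 360.0, lightness, saturation)
    hex_rgb = "#{:02x}{:02x}{:02x}".format(*(round(c * 255) for c in (r, g, b)))
    return css, hex_rgb


print(space_highlight_color(2.8))
print(space_highlight_color(-1.2))
```

Raising `lightness` makes the highlight paler, which helps with thin letters such as l next to the space but reduces contrast against an already light cell background.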

kmike commented 7 years ago

What about using light shade instead of medium shade for whitespaces?

[screenshot: 2016-10-24 14 32 29]

[screenshot: 2016-10-24 14 33 11]

kmike commented 7 years ago

dots are nice, but sometimes they can look cryptic:

[screenshot: 2016-10-24 14 38 08]

lopuhin commented 7 years ago

Yeah, light shade looks much better for text!
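For the plain-text output, the shade characters discussed here are the Unicode block elements U+2591 (LIGHT SHADE) and U+2592 (MEDIUM SHADE). A minimal sketch, assuming only leading and trailing spaces are marked; `show_edge_spaces` is a hypothetical name, not the eli5 function.

```python
import re

LIGHT_SHADE = "\u2591"   # LIGHT SHADE block element
MEDIUM_SHADE = "\u2592"  # MEDIUM SHADE block element


def show_edge_spaces(name: str, marker: str = LIGHT_SHADE) -> str:
    """Replace leading and trailing spaces with a visible marker.

    Hypothetical helper; inner spaces are left untouched.
    """
    return re.sub(r"^ +| +$", lambda m: marker * len(m.group(0)), name)


for feature in ["  spa", "spa ", "a b"]:
    print(show_edge_spaces(feature))
    print(show_edge_spaces(feature, MEDIUM_SHADE))
```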

About the darkness of the color for HTML: I would rather fix it by adding a little margin, because the current darkness is barely visible over red of max intensity (there is already an additional margin here, but it does not look the way I would like): [screenshot: 2016-10-24 15 42 20]

Maybe lowering the max intensity of red and green could help, I'll check that.

lopuhin commented 7 years ago

Agreed, dots look nice with letters but not so nice with other dots :)

kmike commented 7 years ago

The margin looks good, I like it. Re max. density: text also is not very visible on the darkest red, maybe we should find a lighter color.

lopuhin commented 7 years ago

I raised the minimal lightness, here are the new colors: https://github.com/TeamHG-Memex/eli5/blob/visible-space/notebooks/explain_text_prediction_char.ipynb

kmike commented 7 years ago

Hm, now it became harder to see highlighting in text: for example, is this text classified as positive or negative, and what are important features?

[screenshot: 2016-10-24 15 12 27]

(it is positive)

lopuhin commented 7 years ago

Right, it became too pale. I partly corrected that problem in 6604b2c by using different color schemes for weights in the table and in the text, but I think this particular example is also an indicator of another problem: the longer the text, the paler the in-text features will be compared to the BIAS, which stays the same.