dwhswenson / contact_map

Contact map analysis for biomolecules; based on MDTraj
GNU Lesser General Public License v2.1

Automatically detect diverging color maps #110

Closed dwhswenson closed 3 years ago

dwhswenson commented 3 years ago

The vmin and vmax kwargs for the plotting functions give a lot of flexibility, but aren't actually good for user experience. Technically, they allow you to set any range as the range of your color scheme, but in practice it's always vmin=-1 by default (using diverging color maps) or vmin=0 if you use a sequential color map. And vmax=1 in all cases I'm aware of.

This feels particularly off when people use sequential maps without changing the default vmin. They end up using only half the color space, and the results are likely to be hard to interpret. And it's weird that the docs have to specify "do this extra thing with an obscure label for most of the colormaps you're familiar with from matplotlib."

This PR breaks the current plotting API by removing vmin and vmax and replacing them with a boolean option, diverging_cmap. If diverging_cmap is None (the default), it is set by checking the user-provided cmap against the list of diverging colormaps in the matplotlib documentation. The cmap is also checked against the sequential maps, and a warning is raised if it is neither. Then:

vmin, vmax = (-1, 1) if diverging_cmap else (0, 1)
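
A minimal sketch of that selection logic, for illustration only. The helper name `color_range`, the two name sets (a subset of the names in the matplotlib colormap documentation), and the choice to treat unknown maps as sequential are all assumptions, not the code in this PR. Reversed names such as `RdBu_r` are not handled here.

```python
import warnings

# Subsets of the colormap names listed in the matplotlib documentation.
DIVERGING = {"PiYG", "PRGn", "BrBG", "PuOr", "RdGy", "RdBu", "RdYlBu",
             "RdYlGn", "Spectral", "coolwarm", "bwr", "seismic"}
SEQUENTIAL = {"viridis", "plasma", "inferno", "magma", "cividis", "Greys",
              "Purples", "Blues", "Greens", "Oranges", "Reds"}


def color_range(cmap_name, diverging_cmap=None):
    """Hypothetical helper: return (vmin, vmax) for a colormap name."""
    if diverging_cmap is None:
        # Only guess (and possibly warn) when the user gave no value.
        diverging_cmap = cmap_name in DIVERGING
        if not diverging_cmap and cmap_name not in SEQUENTIAL:
            # Assumption: unknown maps fall back to sequential.
            warnings.warn(
                f"Unknown colormap {cmap_name!r}; treating it as "
                "sequential. Set diverging_cmap explicitly to avoid "
                "this warning.",
                UserWarning,
            )
    vmin, vmax = (-1, 1) if diverging_cmap else (0, 1)
    return vmin, vmax
```

With this sketch, `color_range("seismic")` gives the diverging range, `color_range("viridis")` the sequential one, and a custom name only warns when `diverging_cmap` is left as None.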

Here are the potential downsides I can think of:

  1. This will issue a warning for unknown color maps. That can be custom colormaps, which might annoy advanced users. We should make the warnings easy to filter for such users. [EDIT: It will only issue the warning if the user doesn't give a value to the diverging_cmap argument.]
  2. There might be users who really were using values other than vmin=-1 or vmin=0 and vmax=1. That could serve either to use a smaller subset of the color map (since the actual data is always between -1 and 1), or to "clip" information about larger/smaller values (i.e., anything over 0.9 is the same color). For these users, I think the matplotlib documentation on creating colormaps will give them the tools they need to reproduce this with custom color maps.
  3. If a user was previously using a custom diverging color map, the old defaults just worked. Now the user will have to explicitly set diverging_cmap=True when calling the plotting function.
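
On point 1, one way to keep the warning easy to filter is to give it its own category. This is a sketch under assumptions: the class `ColormapTypeWarning` and the function `warn_unknown_cmap` are hypothetical names, not part of contact_map or of this PR.

```python
import warnings


class ColormapTypeWarning(UserWarning):
    """Hypothetical category for colormaps of unknown type."""


def warn_unknown_cmap(cmap_name):
    # Stand-in for the check that would live in the plotting code.
    warnings.warn(
        f"Cannot tell whether {cmap_name!r} is diverging or sequential",
        ColormapTypeWarning,
    )


# An advanced user with custom colormaps opts out once, by category,
# without silencing other UserWarnings.
warnings.filterwarnings("ignore", category=ColormapTypeWarning)
warn_unknown_cmap("my_custom_map")  # silenced by the filter above
```

Passing diverging_cmap explicitly avoids the warning altogether, so the filter only matters for users who prefer to rely on the default.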

@sroet: Opening this early for any conceptual feedback.

codecov[bot] commented 3 years ago

Codecov Report

Merging #110 (dd4b79e) into master (3089eba) will not change coverage. The diff coverage is 100.00%.

Current head dd4b79e differs from pull request most recent head 8ce83f8. Consider uploading reports for the commit 8ce83f8 to get more accurate results.

@@            Coverage Diff            @@
##            master      #110   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           13        13           
  Lines         1127      1140   +13     
=========================================
+ Hits          1127      1140   +13     

Impacted Files                   Coverage Δ
contact_map/contact_count.py     100.00% <100.00%> (ø)
contact_map/plot_utils.py        100.00% <100.00%> (ø)


sroet commented 3 years ago

> @sroet: Opening this early for any conceptual feedback.

Conceptually I agree this is a good thing.

I am a bit unsure about the solution for point 2 (as in: I would not quickly know how to clip it without access to the Normalize function). However, adding the code for that to the custom plotting example should be sufficient to cover that issue.

dwhswenson commented 3 years ago

> I am a bit unsure about the solution for point 2 (as in: I would not quickly know how to clip it without access to the Normalize function).

As an example, say you want to clip at 0.9 in a sequential map (range is 0 to 1). This is the same as assigning a single color to the top 10% of the color map. There are a couple of ways to accomplish that, but this example using LinearSegmentedColormap.from_list shows what might be the easiest. Just get the desired cmap with a certain number of colors, append a copy of the last color, and set the nodes as list(np.linspace(0, 0.9, n_colors)) + [1.0].
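
A minimal sketch of that recipe, assuming matplotlib and numpy are installed. The helper name `clipped_cmap` and its parameters are illustrative, not part of contact_map. It relies on `from_list` accepting a list of (node, color) pairs.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap


def clipped_cmap(name, clip=0.9, n_colors=256):
    """Hypothetical helper: sequential map with one color above `clip`."""
    base = plt.get_cmap(name)
    # Sample the base map, then append a copy of the last color.
    colors = [tuple(c) for c in base(np.linspace(0, 1, n_colors))]
    colors.append(colors[-1])
    # Squeeze the original colors into [0, clip]; the duplicated last
    # color spans the flat stretch from clip to 1.0.
    nodes = list(np.linspace(0, clip, n_colors)) + [1.0]
    return LinearSegmentedColormap.from_list(
        f"{name}_clip", list(zip(nodes, colors))
    )


clipped = clipped_cmap("Blues", clip=0.9)
```

Because the result is a custom map, it would be passed to the plotting function together with an explicit diverging_cmap=False, so no warning is issued.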

dwhswenson commented 3 years ago

Note on progress: This is completed except for the changes to the notebooks. Changes to be made:

  1. Rewrite some stuff about sequential vs. diverging maps in custom plotting to better fit current status. Remove vmin mention.
  2. Possibly add a "clipping" example, so that anything with values below 0.1/above 0.9 shows as the same color as 0.0/1.0 (i.e., make and use a custom color map with explicit use of diverging_cmap).

dwhswenson commented 3 years ago

Ready for review. Includes

Also bumped our copyright year in docs, since I noticed that was out of date.