Closed dwhswenson closed 3 years ago
For discussion: is the query the x axis or the y axis?
Assuming query is smaller than haystack; I personally prefer small/wide figures more than tall and skinny ones. So I would vote for query as default y (with an option from the user to maybe give x
= all
, query
, or haystack
(and the same for y
).
What would you opinion be? (as you more often use these figures and probably intend to publish them)
Probably best to show a real-world example of each type:
This is a loop region interacting with the entire protein, with similarities and differences in two parts of a trajectory. I was originally using query ~as x~ [EDIT:] as y (similar logic to yours) but I realized that I was thinking about it like a function that takes a query residue as input and outputs locations with interactions. This is why I'm curious for other opinions.
So I would vote for query as default y (with an option from the user to maybe give
x
=all
,query
, orhaystack
(and the same fory
).
I don't want to add more parameters to the plotting function (in fact, I'd kind of like to remove some parameters; replace vmin
/vmax
with a bool diverging_cmap
). However, I do think it would be very reasonable for the contact count to have something like total_range
(or topology_range
), in addition to query_range
and haystack_range
. It's easy enough then for users to get the current behavior with:
fig, ax = residue_contacts.plot()
ax.set_xlim(*residue_contacts.total_range)
ax.set_ylim(*residue_contacts.total_range)
Switching axes from the default would be similarly easy. This whole issue is something that is already not too hard, but I think our defaults are not optimal. (Also, it is annoying that I get a dpi warning when I know I'm going to be zooming in enough).
Alright, looking at the real data, I find the top two more readable, so query on the x
axis now has my vote.
(I was somehow thinking about scaling the figure size, not the data within the same aspect ratio. So having query as x
gives the most 'short and wide' data representation.)
This whole issue is something that is already not too hard, but I think our defaults are not optimal. (Also, it is annoying that I get a dpi warning when I know I'm going to be zooming in enough).
That is fair.
One thing I just now noticed, we should probably off-center the labels slightly, so they overlap with the center of the corresponding block, instead of the right(?). What do you think about that?
One thing I just now noticed, we should probably off-center the labels slightly, so they overlap with the center of the corresponding block, instead of the right(?). What do you think about that?
Nope. I had though about changing that, but realized there's a reason not to: left side is resid, which is the only number guaranteed to be unique. But that means that right side is typically resSeq. So we plot between the two! 😉
Another option would be to rewrite the xticks so that they are the resSeq (as a string). I believe we could then have repeated resSeq, and then we could center on resSeq (which is what would be expected). Centering on resid is likely to lead to users reporting residues that are off by one.
Another option would be to rewrite the xticks so that they are the resSeq (as a string). I believe we could then have repeated resSeq, and then we could center on resSeq (which is what would be expected). Centering on resid is likely to lead to users reporting residues that are off by one.
That would definitely be a worthwhile addition (I think)
I think it could add confusion. Think of the KRas example, which jumps a bunch of residues. How do you handle the xticks there?
I'm also not keen on too much more matplotlib-specific stuff. I think moving toward holoviews (after we get to 1.0) would be interesting. What I really want, frequently, it to hover over a contact and get the name of the residues involved as well as the value of the frequency. I don't think matplotlib can do that, but bokeh can (and holoviews is kind of a backend-independent adapter).
That is a fair point, let's keep a (possible) x-tick remapping out of this issue. Also the holoviews stuff looks pretty sweet and suitable
I find that I'm frequently doing something like:
This zooms my plot into just the query-by-haystack region. This should be the default behavior.
To get there, I'm suggesting the following:
n_x
,n_y
inContactCounter
byquery_range
andhaystack_range
. The original idea of separaten_x
andn_y
was to do something like this, but the original implementation didn't work, and now everything just usesn_x = n_y = n_total
(wheretotal
can beresidues
oratoms
).autozoom=True
to the.plot
and.plot_axes
methods..plot
to know what the autozoomed size will be.ContactObject.query_residue_range
,ContactObject.haystack_residue_range
. They exist entirely for this purpose.For discussion: is the query the x axis or the y axis?