choderalab / ensembler-manuscripts

Manuscript for Ensembler v1

Figure: distribution of remodeled loop lengths #25

Closed danielparton closed 9 years ago

danielparton commented 9 years ago

Here is an updated version of the figure, as per John's suggestions (copied below):

Note that I removed the rug plots ("ticks"). A rug plot, as implemented in seaborn, simply draws a tick of fixed length on the x-axis for each data point. If several data points share the same value, their ticks coincide, so the tick does not become any longer or thicker.
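To illustrate the point about duplicate values, here is a minimal sketch (not the script used for the figure). The `loop_lengths` values are made up for illustration, the helper names are hypothetical, and the `rugplot(x=..., height=...)` call assumes seaborn 0.11 or later.

```python
from collections import Counter

# Hypothetical loop lengths for illustration; NOT the manuscript data.
loop_lengths = [1, 1, 1, 2, 2, 5, 9, 9, 14]


def distinct_tick_positions(values):
    """Return the sorted x positions at which a rug plot draws visible ticks."""
    return sorted(set(values))


def draw_rug(values, height=0.05):
    """Draw a rug plot: one tick of the same height per data point.

    Plotting imports are local so the helper above works without seaborn.
    Assumes seaborn >= 0.11 for the keyword-based rugplot signature.
    """
    import matplotlib
    matplotlib.use("Agg")  # render off-screen
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots()
    sns.rugplot(x=values, height=height, ax=ax)
    return fig


# Repeated values land on the same tick, so the rug hides their multiplicity.
counts = Counter(loop_lengths)
print(len(loop_lengths), "data points,",
      len(distinct_tick_positions(loop_lengths)), "distinct tick positions")
print("multiplicities hidden by the rug:", dict(counts))
```

With integer-valued data like this, many points share a value, which is why the ticks looked sparser than the data actually are.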

Previous version:

John's comments from the manuscript:

Some ideas for cleaning this up: Either the tick marks are being misrendered in that they are not taller if there are multiple data points with the same number or the data is really funky, since I would expect there to be a few examples in some bins. Also, is there a big drawback to making the top histogram bin size unity, since the values are integral? I don’t think transparency is needed for the histogram bars either. We can also ditch the semilogy axis. I would also make the x-axes for the top and bottom plots different, since the data ranges are different. I’d see if a histogram with unit bin size might be more appropriate for the bottom plot as well; the KDE just doesn’t feel right for this kind of data, since we are trying to report exact statistics from a specific example rather than estimate a general density for problems of this sort. Finally, I like “remodeled loop length” or “missing loop length” much better than “Number of missing residues per template gap”, which seems unnecessarily verbose.
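The suggestions above (unit bin size, opaque bars, linear y-axis, shorter axis label) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the code behind the figure: the data are invented, the helper names are hypothetical, and the y-axis label is a guess.

```python
from collections import Counter

# Hypothetical remodeled loop lengths (integers); NOT the manuscript data.
loop_lengths = [1, 1, 1, 2, 2, 5, 9, 9, 14]


def unit_bin_edges(values):
    """Edges of width-1 bins centred on each integer from min to max."""
    lo, hi = min(values), max(values)
    return [k - 0.5 for k in range(lo, hi + 2)]


def unit_bin_counts(values):
    """Exact count for every integer from min to max (zeros included)."""
    lo, hi = min(values), max(values)
    counts = Counter(values)
    return [counts.get(k, 0) for k in range(lo, hi + 1)]


def plot_unit_histogram(values):
    """Histogram with unit bins, opaque bars, and a linear y-axis."""
    import matplotlib
    matplotlib.use("Agg")  # render off-screen
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.hist(values, bins=unit_bin_edges(values), alpha=1.0)
    ax.set_xlabel("Remodeled loop length")
    ax.set_ylabel("Number of loops")  # assumed label, for illustration only
    return fig


edges = unit_bin_edges(loop_lengths)
counts = unit_bin_counts(loop_lengths)
print(list(zip(edges[:-1], edges[1:], counts)))
```

Centring each bin on an integer means every bar reports the exact number of loops of that length, which is the kind of exact statistic a KDE would smooth away.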

danielparton commented 9 years ago

New version, following offline discussion with John:

jchodera commented 9 years ago

Do you think we need to label the y-axes?

danielparton commented 9 years ago

Yes, I think this helps. Updated:

jchodera commented 9 years ago

Thanks!