blekhmanlab / compendium_website

Website for the Human Microbiome Compendium
http://microbiomap.org/

Histogram of read totals #10

Closed vincerubinetti closed 1 year ago

vincerubinetti commented 1 year ago

Unfortunately, I only have vague notes from Slack for this, and I've forgotten the details we discussed in previous Zoom meetings. (In the future, let's create detailed issues on the fly as soon as we think of ideas.)

library size? sampling effort? sum of all columns for a given sample, histogram of reads totals

~And I don't know if this is related:~

~A more developed idea of how to present a stacked bar chart~

I'll need a lot more explanation/reminders to be able to implement this. Again, what would be helpful is a sketch (or maybe just a figure from the manuscript), marked up with explanations of how the data is laid out in the viz and how the data structures are processed to get the viz.

rabdill commented 1 year ago

The stacked bar chart is separate, and hopefully the histogram is the simpler of the two. Here's an example of one panel from the paper:

[Image: Screenshot 2023-07-12 at 4 50 09 PM]

In the dataset as-is, each sample is represented by an entire row of numbers, each one indicating how many reads were classified as a particular taxon. If we add all these numbers from a single row, that would be that sample's library size, which is the thing we'd be visualizing here. So we'd end up with one number for each of the 168,000+ samples, and this histogram would illustrate how many samples fall into each bin. The values have a minimum of 10,000 and a maximum of around 10 million, but the illustration above shows that values above 1 million are very rare.
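To make the row-sum idea concrete, here is a minimal sketch in Python. The sample names, the tiny `counts` table, and the `library_sizes` helper are all made up for illustration; the real dataset's layout and loading code are not shown in this thread.

```python
# Hypothetical toy table: one row per sample, one column per taxon.
# Each number is how many reads were classified as that taxon.
counts = {
    "sample_a": [12000, 3000, 500],
    "sample_b": [40000, 0, 2500],
    "sample_c": [9000, 1000, 0],
}


def library_sizes(table):
    """Return each sample's library size: the sum of its row of read counts."""
    return {sample: sum(row) for sample, row in table.items()}


sizes = library_sizes(counts)
for sample, total in sizes.items():
    print(sample, total)
```

The histogram would then be built from the values of `sizes` alone, one number per sample.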

Using a log scale on the x-axis, as above, would be helpful if it's practical. Otherwise, if the x-axis goes up to 10 million and the median value is 40,000, everything will end up smushed up along the left axis. Displaying the median value would also be handy. This is also one of the visualizations where filtering by region would probably be of particular interest.
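One possible way to get log-scale bins plus the median, sketched in plain Python. The `log_histogram` helper, the `bins_per_decade` parameter, and the example values are assumptions for illustration, not anything from the site's code; it also assumes every library size is positive, which holds here since the minimum is 10,000.

```python
import math
from statistics import median


def log_histogram(values, bins_per_decade=2):
    """Count values in bins that are equally wide on a log10 axis.

    Assumes all values are > 0. Returns (edges, counts), where edges has
    one more entry than counts.
    """
    lo = math.floor(math.log10(min(values)))
    hi = math.ceil(math.log10(max(values)))
    if hi == lo:  # all values share one power of ten; keep at least one decade
        hi = lo + 1
    n_bins = (hi - lo) * bins_per_decade
    counts = [0] * n_bins
    for v in values:
        idx = int((math.log10(v) - lo) * bins_per_decade)
        # A value exactly on the top edge goes into the last bin.
        counts[min(idx, n_bins - 1)] += 1
    edges = [10 ** (lo + i / bins_per_decade) for i in range(n_bins + 1)]
    return edges, counts


# Hypothetical library sizes, one per sample.
example_sizes = [15_000, 20_000, 40_000, 40_000, 1_500_000, 9_000_000]
edges, bin_counts = log_histogram(example_sizes)
mid = median(example_sizes)
print(bin_counts, mid)
```

Filtering by region would just mean passing in the library sizes for the samples in that region and recomputing both the bins and the median.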