FredHutch / glam-browser

Gene-Level Association of Microbiomes - Browser

Add single-CAG gene-level heatmap #1

Open sminot opened 4 years ago

sminot commented 4 years ago

Visual display of the abundance of genes within a single CAG, with the option to color specimens by metadata

sminot commented 4 years ago

Possibly order by KO or taxonomic annotation

sminot commented 4 years ago

FYI @jgolob, the reason I haven't implemented this feature yet is that I'm struggling with the decision on how to structure the input data. Currently GLAM is able to run using the "summary" HDF5 produced by geneshot. This is a subset of the total data which omits some of the largest data objects. While the "full" HDF5 may be 6 GB, the "summary" HDF5 may be only 600 MB, which is much more portable and convenient.

One option would be to include the gene-level abundance table in the "summary" HDF5. As a back-of-the-envelope calculation: with 2M genes across 200 samples and each value taking up 2 bytes, that table would be around 800 MB, which would more than double the size of the "summary" HDF5.
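For reference, the estimate above can be reproduced in a few lines of Python. This is only a sketch of the arithmetic: it assumes a dense, uncompressed table, and the function name `table_size_bytes` is made up for illustration. The 600 MB figure is the example "summary" size mentioned earlier in the thread.

```python
def table_size_bytes(n_genes: int, n_samples: int, bytes_per_value: int) -> int:
    """Size of a dense, uncompressed genes x samples table, in bytes."""
    return n_genes * n_samples * bytes_per_value


# Numbers taken from the comment above: 2M genes, 200 samples, 2 bytes per value.
gene_table_bytes = table_size_bytes(2_000_000, 200, 2)
gene_table_mb = gene_table_bytes / 1e6

# Example size of the existing "summary" HDF5, from the thread.
summary_mb = 600
combined_mb = summary_mb + gene_table_mb

print(f"gene-level table: {gene_table_mb:.0f} MB")
print(f"summary + gene-level table: {combined_mb:.0f} MB")
```

The real on-disk size could be smaller if the table is sparse and stored with compression, so this is probably best read as an upper bound.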

Another option would be to table this idea until a later date when the input data for the GLAM Browser is formatted in a more web-native way, with each individual element stored as a discrete data object, and not all stuffed together into a single HDF5.
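To make the "discrete data object" idea concrete, here is a rough sketch in which each CAG's gene abundances are written to their own small file, so a viewer only has to load the one CAG being displayed. Everything in it is an assumption for illustration: the function names, the `cag_<id>.json.gz` naming scheme, the choice of gzipped JSON, and the toy data. None of it reflects how geneshot or GLAM actually store data.

```python
import gzip
import json
import tempfile
from pathlib import Path


def write_cag_objects(abundances, gene_to_cag, out_dir):
    """Write one gzipped JSON file per CAG (hypothetical layout).

    abundances:  dict of gene name -> list of per-sample values
    gene_to_cag: dict of gene name -> CAG identifier
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group genes by the CAG they belong to.
    by_cag = {}
    for gene, values in abundances.items():
        by_cag.setdefault(gene_to_cag[gene], {})[gene] = values

    # Store each CAG as its own discrete object.
    paths = {}
    for cag, genes in by_cag.items():
        path = out_dir / f"cag_{cag}.json.gz"
        with gzip.open(path, "wt", encoding="utf-8") as handle:
            json.dump(genes, handle)
        paths[cag] = path
    return paths


def read_cag_object(out_dir, cag):
    """Load the gene abundances for a single CAG, touching no other file."""
    path = Path(out_dir) / f"cag_{cag}.json.gz"
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


# Toy data: three genes, two samples, two CAGs.
toy_abundances = {"geneA": [1, 0], "geneB": [3, 2], "geneC": [0, 5]}
toy_gene_to_cag = {"geneA": 0, "geneB": 0, "geneC": 1}

demo_dir = tempfile.mkdtemp()
written = write_cag_objects(toy_abundances, toy_gene_to_cag, demo_dir)
cag0 = read_cag_object(demo_dir, 0)
print(sorted(cag0))
```

With a layout like this, the single-CAG heatmap would only need to fetch one object per request instead of opening the full gene-level table.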

Would love to discuss this question in more depth.