maggimars / Tara-Phaeo


plot %mapped genes that are core v. unique or shared but not core at each location on the map #10

Open maggimars opened 3 years ago

maggimars commented 3 years ago

Re: half-baked, poorly articulated idea: for the upset plot, instead of plotting the number of genes in each group, I was thinking of potentially looking at the relative abundance in metaT / metaG space of each of the groupings (or perhaps abundance normalized by the number of genes) for a given site. As in, sum the counts from the various genes into the groupings based on the figure in #2.

Originally posted by @halexand in https://github.com/maggimars/Tara-Phaeo/issues/7#issuecomment-756376370

---> Based on the idea that the overrepresentation of metaG reads mapping to P. antarctica arises because the genomes contain genes that are not expressed at high enough levels to be included in low-coverage transcriptomes. If so, we expect to see a higher proportion of non-core or unique genes in the metaG mapping and a higher proportion of core genes in the metaT mapping.
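A rough sketch of the summing step, in case it helps pin down what is being compared. Everything here is an assumption for illustration: the category labels (`core`, `shared`, `unique`), the site name, the gene IDs, and the toy counts are made up, and the real inputs would come from the ortholog groupings and the per-site mapping counts.

```python
from collections import defaultdict


def category_fractions(counts, gene_category):
    """Sum read counts per ortholog category and convert to fractions.

    counts:        {(site, data_type, gene_id): mapped_read_count}
    gene_category: {gene_id: category label}, e.g. "core", "shared", "unique"
    Returns        {(site, data_type): {category: fraction of mapped reads}}
    """
    totals = defaultdict(lambda: defaultdict(float))
    for (site, data_type, gene_id), n_reads in counts.items():
        category = gene_category.get(gene_id)
        if category is None:
            continue  # gene without a category assignment is skipped
        totals[(site, data_type)][category] += n_reads

    fractions = {}
    for key, by_category in totals.items():
        total = sum(by_category.values())
        if total > 0:
            fractions[key] = {c: v / total for c, v in by_category.items()}
    return fractions


# Toy, made-up inputs (hypothetical gene IDs, site name, and counts).
gene_category = {"g1": "core", "g2": "unique", "g3": "shared"}
counts = {
    ("site1", "metaG", "g1"): 10,
    ("site1", "metaG", "g2"): 30,
    ("site1", "metaT", "g1"): 30,
    ("site1", "metaT", "g3"): 10,
}
fractions = category_fractions(counts, gene_category)
```

Under the hypothesis above, the `core` fraction would be expected to be larger for `metaT` than for `metaG` at a given site. Dividing each category total by the number of genes in that category before taking fractions would give the per-gene normalized version.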

maggimars commented 3 years ago

..... I'm pretty sure this is exactly the opposite of what we were expecting .....

[Figure: orthomaps]

One caveat: this only includes the 2 JGI genomes (globosa and antarctica) and the 3 transcriptomes I sequenced (globosa, jahnii, and cordata), because the MMETSP transcriptomes had very different FASTA headers for the peptide and nucleotide files, and lining up results from OrthoFinder and Salmon was :-/ . The MMETSP transcriptomes recruited a small number of reads compared to the others, so I think this still paints the overarching picture. But I am working on the header issue in the meantime.
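One possible way to approach the header issue, sketched under heavy assumptions: the actual MMETSP header formats are not shown in this thread, so the `_pep` / `_nt` suffixes and the example headers below are invented purely for illustration. The idea is to reduce each header to a shared key and then check which peptide and nucleotide records pair up.

```python
def sequence_id(header, strip_suffixes=("_pep", "_nt")):
    """Reduce a FASTA header to a key shared by peptide and nucleotide files.

    Takes the first whitespace-delimited token (minus any leading '>') and
    drops one trailing suffix. The suffixes are hypothetical placeholders,
    not the real MMETSP naming scheme.
    """
    fields = header.lstrip(">").split()
    if not fields:
        return ""
    token = fields[0]
    for suffix in strip_suffixes:
        if token.endswith(suffix):
            return token[: -len(suffix)]
    return token


def match_headers(peptide_headers, nucleotide_headers):
    """Pair headers by normalized ID; also report IDs found in only one file."""
    pep = {sequence_id(h): h for h in peptide_headers}
    nuc = {sequence_id(h): h for h in nucleotide_headers}
    matched = {k: (pep[k], nuc[k]) for k in sorted(pep.keys() & nuc.keys())}
    unmatched = sorted(pep.keys() ^ nuc.keys())
    return matched, unmatched


# Invented example headers, only to show the mechanics.
matched, unmatched = match_headers(
    [">seq0001_pep len=120", ">seq0002_pep"],
    [">seq0001_nt len=360", ">seq0003_nt"],
)
```

Checking the `unmatched` list first would show how much of each transcriptome is lost to the header mismatch before anything is merged with the mapping counts.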