jordenrabasco / Long_read_processing_tutorial


Ecoli- abundance counts #18

Open jordenrabasco opened 2 years ago

jordenrabasco commented 2 years ago

This section prints a table of abundance counts for the E. coli strains identified in a particular sample. I included it to show the user how to pull out exact counts, as I thought that may be useful.

https://github.com/jordenrabasco/Long_read_processing_tutorial/blob/afa1a962b305b79b0473a644cd9133a992bfa9ea/long%20read%20Tutorial.Rmd#L253-L256

benjjneb commented 2 years ago

How would a tutorial reader use this section to better understand their own data?

What would they need to know in order to interpret the results they would get from a totally different set of samples?

jordenrabasco commented 2 years ago

commit referencing this issue b3f0ed0656181d6071c0ff8c6b3c1481b97d1386

I would like to double-check that my language and understanding of the relative abundance presented in the table is accurate, and whether you think it's at a level the user could understand.

benjjneb commented 2 years ago

That commit seems to have entirely replaced and renamed the file, so it's hard to see what changed.

Can you copy/paste the current language into this Issue thread?

jordenrabasco commented 2 years ago

Totally my fault, I linked the wrong commit. b35a4202a3cfdbf64062d35c61582224f8342750 should be the right one.

benjjneb commented 2 years ago

The new text describes the plot, but still doesn't give any information on why "the plot comes out exactly as we predicted". What did we expect and why? What would be an unexpected plot? How can people decide whether a version of this plot from their own data was as expected or not?

jordenrabasco commented 2 years ago

Just a quick question: in a past tutorial on your GitHub you mentioned a "high abundance strain with a 4:1:1:1 full complement of Ec2:Ec13:Ec14" below a similar graph (https://benjjneb.github.io/LRASManuscript/LRASms_fecal.html). Does this mean that the E. coli strain present had that many operons, so that's how many ASVs we would expect to see? Meaning that there are four Ec2 loci for every one of each of the others?

benjjneb commented 2 years ago

We know E. coli has 7 copies of the 16S rRNA gene. So that text means that the 16S data shows that 4 of those copies are the same allele or ASV (Ec2), while the other 3 copies have different alleles (i.e. different full-length sequences). So there is a 4:1:1:1 ratio between the abundances of those different alleles.
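As a rough illustration of that reasoning, here is a minimal Python sketch that scales per-ASV read counts so they sum to the 7 known 16S copies and rounds them to integer copy numbers. The function name `copy_number_ratio`, the read counts, and the `unnamed_allele` label for the fourth allele are all made up for illustration; they are not from the tutorial or its data.

```python
def copy_number_ratio(asv_counts, total_copies=7):
    """Scale ASV read counts to sum to total_copies, then round to integers.

    asv_counts: dict mapping ASV name -> read count within one sample.
    total_copies: number of 16S rRNA gene copies in the genome (7 for E. coli).
    """
    total = sum(asv_counts.values())
    return {asv: round(total_copies * n / total) for asv, n in asv_counts.items()}


# Hypothetical read counts for one sample (illustrative only, not real data).
counts = {"Ec2": 4000, "Ec13": 1010, "Ec14": 980, "unnamed_allele": 1010}

ratio = copy_number_ratio(counts)
print(ratio)
```

If the rounded values do not sum to 7, or the scaled counts sit far from whole numbers, that would suggest the ASVs do not all come from a single strain at a single abundance.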

jordenrabasco commented 2 years ago

Updated with the relevant info about the graph and how to analyze it. From my understanding, it is also not uncommon for an E. coli to have multiple alleles within the same genome, right? Which would mean that exact duplicates of 16S alleles are less common?

benjjneb commented 2 years ago

Link to update?

jordenrabasco commented 2 years ago

The link should be above my previous comment, I think? If not, I have linked the relevant commit here as well: 213f965f8ab580e1527237917725ebb0c7c65365