dw1227 / bhatti_rrnanalysis_2024

Analyze utility of ASVs in 2024
MIT License

Measure specificity of ASVs for each taxonomic rank and region of 16S rRNA gene #30

Closed dw1227 closed 2 months ago

dw1227 commented 2 months ago

If I have an ASV, what is the probability that it is also found in another taxonomic group of the same rank? For example, if I have an ASV from Bacillus subtilis, what is the probability that it is also found in Bacillus cereus? Of course, a Bacillus subtilis ASV is more likely to turn up in a closely related organism such as Bacillus cereus than in E. coli. We may adjust or control for relatedness later, but for now let us answer the general question for any two taxa of the same rank.
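One possible way to frame the question above, as a rough sketch only: treat the data as (taxon, ASV) pairs at a single rank and report the fraction of distinct ASVs that occur in more than one taxon. The function name `shared_asv_fraction`, the input layout, and the toy ASV labels are all assumptions made for illustration; they are not taken from the repository.

```python
from collections import defaultdict


def shared_asv_fraction(records):
    """Fraction of distinct ASVs observed in more than one taxon.

    `records` is an iterable of (taxon, asv) pairs, all at the same
    taxonomic rank (hypothetical input layout).
    """
    taxa_per_asv = defaultdict(set)
    for taxon, asv in records:
        taxa_per_asv[asv].add(taxon)
    if not taxa_per_asv:
        return float("nan")  # no ASVs, so the fraction is undefined
    shared = sum(1 for taxa in taxa_per_asv.values() if len(taxa) > 1)
    return shared / len(taxa_per_asv)


# Toy data: asv_2 is the only ASV seen in two species.
toy = [
    ("Bacillus subtilis", "asv_1"),
    ("Bacillus subtilis", "asv_2"),
    ("Bacillus cereus", "asv_2"),
    ("Bacillus cereus", "asv_3"),
    ("Escherichia coli", "asv_4"),
]
print(shared_asv_fraction(toy))  # 0.25 (1 of 4 distinct ASVs is shared)
```

Running the same function on pairs built at genus, family, or another rank, and separately for each 16S region, would give one number per rank and region. Relatedness between taxa is deliberately ignored here, matching the general question posed above.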

dw1227 commented 2 months ago

We need to control for uneven sampling of species. Our previous analysis suggested that a decent number of species have 5 or more genome sequences. Let's see how the results change when we randomly sample 5 genomes from each species. It would be nice to be able to change that number in case we want to use something different (e.g., 2).
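A minimal sketch of the subsampling step, with the number of genomes exposed as a parameter. The name `subsample_genomes`, the `genome_to_species` mapping, and the decision to drop species with fewer than `n_genomes` genomes are assumptions for illustration; the comment above does not say how under-sampled species should be handled.

```python
import random
from collections import defaultdict


def subsample_genomes(genome_to_species, n_genomes=5, seed=None):
    """Pick `n_genomes` genomes at random, without replacement, per species.

    Assumption: species with fewer than `n_genomes` genomes are skipped.
    """
    rng = random.Random(seed)
    by_species = defaultdict(list)
    for genome, species in genome_to_species.items():
        by_species[species].append(genome)

    selected = []
    for species in sorted(by_species):       # sorted for reproducibility
        genomes = sorted(by_species[species])
        if len(genomes) >= n_genomes:
            selected.extend(rng.sample(genomes, n_genomes))
    return selected


# Toy mapping: species A has 7 genomes, B has 5, C has 3.
toy_map = {f"A{i}": "A" for i in range(7)}
toy_map.update({f"B{i}": "B" for i in range(5)})
toy_map.update({f"C{i}": "C" for i in range(3)})

picked = subsample_genomes(toy_map, n_genomes=5, seed=1)
print(len(picked))  # 10: five genomes each from A and B, none from C
```

Calling it with `n_genomes=2` keeps all three toy species, which shows how the threshold changes which species enter the analysis.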

dw1227 commented 2 months ago

We can get lucky or unlucky with the 5 genomes we pick in the analysis described above. Therefore, let us repeat the analysis 1000 or more times to get a more robust estimate of the overlap, along with 95% confidence intervals.
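The repeated subsampling could look roughly like the sketch below, which draws a fresh subsample on each iteration, recomputes the overlap, and summarizes the estimates with their mean and the 2.5th and 97.5th percentiles. Using a percentile interval is one assumed way to obtain the 95% interval. The function names, the two input mappings, and the choice of overlap statistic (fraction of ASVs shared between species) are also illustrative assumptions.

```python
import random
import statistics
from collections import defaultdict


def shared_fraction(genomes, genome_species, genome_asvs):
    """Fraction of distinct ASVs found in more than one species."""
    species_per_asv = defaultdict(set)
    for genome in genomes:
        for asv in genome_asvs[genome]:
            species_per_asv[asv].add(genome_species[genome])
    if not species_per_asv:
        return float("nan")
    shared = sum(1 for sp in species_per_asv.values() if len(sp) > 1)
    return shared / len(species_per_asv)


def resampled_overlap(genome_species, genome_asvs,
                      n_genomes=5, n_iter=1000, seed=0):
    """Return (mean, lower, upper) of the overlap across resamples.

    lower/upper are the 2.5th and 97.5th percentiles of the estimates.
    Assumption: species with fewer than `n_genomes` genomes are skipped.
    """
    rng = random.Random(seed)
    by_species = defaultdict(list)
    for genome, species in sorted(genome_species.items()):
        by_species[species].append(genome)
    eligible = [gs for gs in by_species.values() if len(gs) >= n_genomes]

    estimates = []
    for _ in range(n_iter):
        picked = [g for gs in eligible for g in rng.sample(gs, n_genomes)]
        estimates.append(shared_fraction(picked, genome_species, genome_asvs))

    # n=40 yields cut points every 2.5%; first and last bound the central 95%.
    cuts = statistics.quantiles(estimates, n=40, method="inclusive")
    return statistics.mean(estimates), cuts[0], cuts[-1]
```

Passing a different `n_genomes` (e.g., 2) or a larger `n_iter` changes the sampling depth and the number of repeats without touching the rest of the code; fixing `seed` keeps a run reproducible.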