Closed adobe7105 closed 10 months ago
Hi Zemin,
Just a couple of questions to make sure I understand:
1) How similar are the genomes in your reference genome database?
2) Are you mapping to all 18 species at the same time in all samples?
3) Are you using metaphlan4 in order to "prescreen" samples, so that you don't have to run inStrain on all samples?
Best, Matt
Hi @MrOlm, my research focuses on finding signature SNV markers in high-abundance strains of disease populations. According to my understanding, the number of SNVs detected in a gene is related to the relative abundance and sequencing depth of the strain. Therefore I used the following research process:
Hi Zemin,
OK- I now understand your research question.
With regard to your original question, I think your command is great and there's no need to adjust the min_read_ani. In your analysis you want to detect SNVs, and adjusting the min_read_ani will just hamper that goal.
The only other comment I have is that you may not want to standardize the number of SNVs detected in that way. The problem is that greater sequencing depth doesn't always lead to more SNVs being detected, so doing that will underestimate the number of SNVs detected in high-coverage genomes. If you set a minimum detection depth of 10x coverage, and only look at SNVs with at least 20% abundance, that should go a long way toward correcting for biases due to sequencing depth.
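As a rough illustration of that filtering, here is a minimal Python sketch. It assumes the SNV calls have already been loaded into dictionaries, and the field names `genome`, `coverage`, and `freq` are hypothetical placeholders rather than inStrain's actual column names. It also reads "20% abundance" as the allele frequency of the SNV, which is an interpretation.

```python
from collections import Counter


def count_snvs(snvs, min_coverage=10, min_freq=0.20):
    """Count SNVs per genome that pass both thresholds.

    snvs: iterable of dicts with hypothetical keys
          'genome', 'coverage' (read depth at the site) and
          'freq' (allele frequency of the SNV, 0 to 1).
    """
    counts = Counter()
    for snv in snvs:
        # Keep only sites with enough depth and a well-supported allele.
        if snv["coverage"] >= min_coverage and snv["freq"] >= min_freq:
            counts[snv["genome"]] += 1
    return dict(counts)


# Made-up records, for illustration only.
example = [
    {"genome": "genome_A", "coverage": 35, "freq": 0.42},
    {"genome": "genome_A", "coverage": 8, "freq": 0.50},    # below 10x depth
    {"genome": "genome_A", "coverage": 120, "freq": 0.05},  # below 20% frequency
    {"genome": "genome_B", "coverage": 10, "freq": 0.20},
]
print(count_snvs(example))
```

Applying the same two thresholds to every sample keeps the counts comparable without dividing by depth.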
Best, Matt
Hi @MrOlm, I am a bit confused about the parameter settings of inStrain. Any suggestions would be much appreciated. Here is my workflow: first, I calculate the relative abundance of strains using metaphlan4; then I select the strains with an average relative abundance greater than 0.5% for SNV annotation to ensure high quality. All selected reference genomes are downloaded from NCBI.
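The selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abundances are taken to be percentages already parsed into a dictionary, and the function name, taxon names, and values are all hypothetical.

```python
from statistics import mean


def select_abundant(abundance, threshold=0.5):
    """Return taxa whose mean relative abundance (in percent) exceeds threshold.

    abundance: dict mapping a taxon name to a list of per-sample
               relative abundances in percent (hypothetical layout).
    """
    return sorted(
        taxon
        for taxon, values in abundance.items()
        if values and mean(values) > threshold
    )


# Made-up values, for illustration only.
example = {
    "taxon_1": [1.2, 0.4, 0.9],
    "taxon_2": [0.1, 0.2, 0.0],
    "taxon_3": [2.0, 3.5, 1.0],
}
print(select_abundant(example))
```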
These are my parameters for inStrain profile: --min_genome_coverage 10 --min_read_ani 0.95 --skip_plot_generation --skip_mm_profiling
Since my research is based on a small reference genome database (including 18 species), should I adjust min_read_ani to 0.98?
Best, Zemin