liaoherui / StrainScan

High-resolution strain-level microbiome composition analysis tool based on reference genomes and k-mers
https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-023-01615-w
MIT License

Which column of final_report.txt should be used for strain abundance? #23

Open haihao999 opened 1 month ago

haihao999 commented 1 month ago

Hi, in this result, should I select Predicted_Depth?

Strain_ID  Strain_Name  Cluster_ID  Relative_Abundance_Inside_Cluster  Predicted_Depth  Coverage  Covered/Total_kmr

Why is there no Predicted_Depth (Ab*cls_depth) column in my final_report.txt result?

liaoherui commented 1 month ago

Hi, thanks for using StrainScan!

For the first question, yes. The Predicted_Depth and Coverage columns can be used to infer the abundance of the identified strains.

For the second question, the possible reason is that the tool only performed cluster-level identification, meaning all identified strains belong to clusters with a size of 1. In this case, the Predicted_Depth (Ab*cls_depth) column is not provided, as it is only calculated for identified strains from clusters with a size greater than 1.
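To check which depth column a given final_report.txt actually contains, something like the following sketch may help. It assumes the header is the first line of the file and that fields are tab-separated, neither of which is confirmed in this thread. The helper name `depth_columns` is hypothetical and is not part of StrainScan.

```python
def depth_columns(header_line, sep="\t"):
    """Return every header field whose name starts with 'Predicted_Depth'.

    Assumption: fields are separated by `sep` (tab by default); adjust if
    your report uses a different delimiter.
    """
    fields = [f.strip() for f in header_line.rstrip("\n").split(sep)]
    return [f for f in fields if f.startswith("Predicted_Depth")]


# Header as quoted in this thread (cluster-level identification only).
cluster_header = "\t".join([
    "Strain_ID", "Strain_Name", "Cluster_ID",
    "Relative_Abundance_Inside_Cluster", "Predicted_Depth",
    "Coverage", "Covered/Total_kmr",
])

# Illustrative header that carries the extra column name mentioned above.
strain_header = "\t".join([
    "Strain_ID", "Strain_Name", "Cluster_ID",
    "Predicted_Depth (Ab*cls_depth)", "Coverage",
])

print(depth_columns(cluster_header))
print(depth_columns(strain_header))
```

If only plain Predicted_Depth is listed, that is consistent with the cluster-level case described above.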

haihao999 commented 1 month ago

Thank you very much for your reply. Which column should I choose in a situation like the following?

Coverage  Predicted_Depth
0.98      26.66
0.93      9.9
0.72      7.73

liaoherui commented 1 month ago

You should choose "Predicted_Depth" if your goal is to estimate the abundance of identified strains. "Coverage" here roughly reflects the percentage of genomic regions covered by k-mers.
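As a rough illustration of using Predicted_Depth for abundance, the sketch below normalizes the depths so they sum to 1. Dividing each depth by the total is a common convention assumed here, not something StrainScan is stated to do. The tab-separated layout, the strain names, and the function name `relative_abundance` are also assumptions for illustration; the depth values are the three quoted above.

```python
import csv
import io


def relative_abundance(report_text, depth_col="Predicted_Depth",
                       name_col="Strain_Name"):
    """Map each strain name to its share of the total predicted depth.

    Assumption: the report is tab-separated with a header row.
    """
    reader = csv.DictReader(io.StringIO(report_text), delimiter="\t")
    depths = {row[name_col]: float(row[depth_col]) for row in reader}
    total = sum(depths.values())
    if total <= 0:
        raise ValueError("total predicted depth must be positive")
    return {name: depth / total for name, depth in depths.items()}


# Hypothetical strain names paired with the values from this thread.
sample = "\n".join([
    "Strain_Name\tCoverage\tPredicted_Depth",
    "strain_A\t0.98\t26.66",
    "strain_B\t0.93\t9.9",
    "strain_C\t0.72\t7.73",
]) + "\n"

abundance = relative_abundance(sample)

# Rank strains from most to least abundant.
ranked = sorted(abundance.items(), key=lambda kv: kv[1], reverse=True)
for name, share in ranked:
    print(f"{name}\t{share:.3f}")
```

Coverage is left out of the calculation on purpose: as noted above, it reflects how much of the genome is covered by k-mers, so it is better read as supporting evidence for a strain's presence than as an abundance value.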