kevsilva / StrainFLAIR

Strain-level abundances estimation in metagenomic samples using variation graphs
GNU Affero General Public License v3.0

Description of columns in strains_profile.csv #7

Open youyuh48 opened 1 year ago

youyuh48 commented 1 year ago

Thank you for developing a great tool.

What do the columns in strains_profile.csv mean?

detected_genes,mean_abund,mean_abund_nz,median_abund,median_abund_nz

What does "_nz" mean above? Thank you.

kevsilva commented 1 year ago

Hi,

Thank you for using StrainFLAIR.

Since the abundance of a colored path is computed from the abundances of the nodes the path is composed of, I tried several methods: the mean or the median, each with or without the nodes that have an abundance of zero (including those zero-abundance nodes could underestimate the abundance of the path if the depth is insufficient). Hence, "nz" stands for "non zeros", meaning the metric is computed using only the nodes with abundance > 0. For the paper I only used the column "mean_abund", but I left all the metrics in the output.

Regards
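
To make the difference between the plain and "_nz" columns concrete, here is a minimal Python sketch of the four metrics described above. It is not taken from the StrainFLAIR source: the function name `path_abundance_metrics`, the flat list of node abundances, and the choice to return 0.0 for an empty input are all assumptions made for illustration.

```python
from statistics import mean, median


def path_abundance_metrics(node_abundances):
    """Summarize a path's abundance from its node abundances.

    Hypothetical helper (not StrainFLAIR's own code). The "_nz" variants
    ignore nodes whose abundance is zero.
    """
    # Keep only nodes that received some abundance.
    nonzero = [a for a in node_abundances if a > 0]
    return {
        # Assumption: report 0.0 when there is nothing to average.
        "mean_abund": mean(node_abundances) if node_abundances else 0.0,
        "mean_abund_nz": mean(nonzero) if nonzero else 0.0,
        "median_abund": median(node_abundances) if node_abundances else 0.0,
        "median_abund_nz": median(nonzero) if nonzero else 0.0,
    }


# Made-up node abundances for one path: two nodes were not covered at all.
example_nodes = [0, 0, 4, 6, 10]
metrics = path_abundance_metrics(example_nodes)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these made-up values, the two uncovered nodes pull "mean_abund" down to 4, while "mean_abund_nz" averages only the three covered nodes (20 / 3), which shows how zero-abundance nodes can lower the estimate for a path.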