AdmiralenOla / Scoary

Pan-genome wide association studies
GNU General Public License v3.0

How to filter Scoary Results #82

Closed noahaus closed 3 years ago

noahaus commented 4 years ago

I've used Scoary to identify COGs that might be associated with different host species, and everything worked like a charm. Now I'm unsure which columns to use to extract the best observations. Sensitivity, specificity, odds ratio, and the several p-values in the output can point in different directions (for example, a high odds ratio with low sensitivity). If I want to prune the results, which column should I care about most and filter on?

VadimDu commented 4 years ago

Hi @noahaus,

I would filter on the odds ratio and the multiple-testing-corrected p-values (Benjamini_H_p, which controls the false discovery rate, or the more conservative Bonferroni_p). Depending on your metadata, I would also focus on the specific groups or host species of particular interest.
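
A minimal sketch of that kind of filter, in case it helps. It assumes the results file is comma-separated and has an `Odds_ratio` column alongside `Benjamini_H_p` and `Bonferroni_p`; please check the header of your own output. The cutoffs, the `filter_hits` helper, and the example rows are hypothetical and only there for illustration.

```python
import csv
import io

# Hypothetical excerpt of a results file; all values are made up.
EXAMPLE_CSV = """Gene,Odds_ratio,Benjamini_H_p,Bonferroni_p
geneA,12.5,0.001,0.004
geneB,1.2,0.0005,0.002
geneC,inf,0.03,0.2
geneD,0.05,0.002,0.01
geneE,8.0,0.2,1.0
"""


def filter_hits(rows, p_column="Benjamini_H_p", alpha=0.05, min_odds_ratio=2.0):
    """Keep rows that pass the corrected p-value cutoff and show a strong
    effect: enrichment (OR >= min_odds_ratio) or depletion
    (OR <= 1 / min_odds_ratio)."""
    kept = []
    for row in rows:
        p_value = float(row[p_column])
        odds_ratio = float(row["Odds_ratio"])  # float() also parses "inf"
        strong_effect = (
            odds_ratio >= min_odds_ratio or odds_ratio <= 1.0 / min_odds_ratio
        )
        if p_value <= alpha and strong_effect:
            kept.append(row)
    return kept


# Replace io.StringIO(...) with open("your_results.csv", newline="") for a real file.
rows = list(csv.DictReader(io.StringIO(EXAMPLE_CSV)))
hits = filter_hits(rows)
hit_genes = [row["Gene"] for row in hits]
print(hit_genes)
```

Passing `p_column="Bonferroni_p"` applies the stricter correction instead. Whether to keep depleted genes (odds ratio well below 1) depends on your question; drop that half of the condition if you only care about enrichment.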

AdmiralenOla commented 3 years ago

Hi @noahaus,

Unfortunately, it is hard to give a general answer to your question. Your specific research question matters here, as does knowledge of the underlying biology. My gut feeling is that for COG enrichment in particular species, you might be more interested in odds ratios, since they give you a measure of the degree of enrichment. P-values, on the other hand, relate specifically to the null hypothesis that your COGs are equally distributed across host species.
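
To illustrate how a high odds ratio can coexist with low sensitivity, here is a small sketch using the standard 2x2 definitions. The `two_by_two_metrics` helper and the counts are hypothetical, values are returned as fractions rather than percentages, and this is not meant to reproduce how Scoary computes its columns internally.

```python
def two_by_two_metrics(pos_present, pos_absent, neg_present, neg_absent):
    """Textbook 2x2 metrics for one gene.

    pos_* : isolates with the trait (e.g. from the host species of interest)
    neg_* : isolates without the trait
    *_present / *_absent : whether the gene is present in those isolates
    """
    sensitivity = pos_present / (pos_present + pos_absent)
    specificity = neg_absent / (neg_present + neg_absent)
    denominator = pos_absent * neg_present
    # Treated as infinite here when the denominator is zero (an assumption
    # of this sketch, not a statement about any particular tool).
    if denominator == 0:
        odds_ratio = float("inf")
    else:
        odds_ratio = (pos_present * neg_absent) / denominator
    return sensitivity, specificity, odds_ratio


# Made-up counts: the gene is in only 10 of 50 trait-positive isolates,
# but in just 1 of 100 trait-negative isolates.
sensitivity, specificity, odds_ratio = two_by_two_metrics(10, 40, 1, 99)
print(sensitivity, specificity, odds_ratio)
```

In this made-up case the gene is found in only a fifth of the trait-positive isolates (low sensitivity), yet it is almost never seen elsewhere, so the odds ratio is large. Which of those properties matters more depends on the biological question.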

I will close this issue since it does not relate directly to Scoary's functionality, but feel free to contact me if you need help.