LudoPoire closed this 2 months ago
Hi Ludo,
Thanks for your interest in our tool. Using multiple reference strains can better capture genetic diversity and improve the accuracy of the associations. However, in some cases a single well-chosen reference strain (often the reference lab strain) might be enough and can simplify the interpretation of the analyses. For populations with high genetic diversity, such as E. coli, multiple strains are recommended. We don't have a firm rule for what makes a good reference. We would suggest picking more than one, and choosing whichever makes the most sense for the phenotype being studied as the "main" reference.
A single reference strain (enrichment_reference) is used to perform the enrichment analyses for the associated genetic variants, while multiple strains are used to annotate (--focus-strain) and summarize (--reference) the hits.
Hi, and thanks for your answer! So you recommend having multiple strains from different origins/lineages to match the diversity present in a collection? Let's say one or a few lab strains, plus some environmental and/or clinical strains if the species is present in multiple habitats?
Yes 👍🏽
Hi, I'm looking into using your program for bacterial GWAS on multiple phenotypes. I was wondering if you had some advice regarding the choice of reference strains (for both associations and annotation). Should I use one or multiple strains, and how should I choose them?
I see you chose IAI39 as the "enrichment reference" and multiple E. coli strains with the flags --reference and --focus-strain for bootstrapping. Could you provide some details on what these parameters do? FYI, I'm working with a collection of A. baumannii strains.
Thanks for your answer and for developing this tool!