Closed yjiakang closed 4 years ago
Hello!
Great question. An ideal reference genome would be a historical isolate that is a common ancestor to your strains of interest. For example, if you are looking for antibiotic resistance mutations, an isolate that predates the widespread use of antibiotics would be ideal.
I would use a tool like FastANI to determine how closely related your 200 isolates are to a selection of possible reference genomes. If they are all above the 98.5% threshold, just use one reference genome for all 200 isolates. If there is more diversity than this, you may want to use a different reference genome for the different clades, but this would make it more difficult to compare the results across the five clades.
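As a rough illustration of that check, here is a minimal Python sketch that reads FastANI's tab-separated output (query, reference, ANI, matched fragments, total fragments) and lists the candidate references that every isolate matches at 98.5% or above. The function names and file names are hypothetical, and the output is assumed to come from a many-to-many run such as `fastANI --ql isolates.txt --rl references.txt -o fastani.tsv`. FastANI reports nothing for pairs far below about 80% ANI, so the sketch treats a missing pair as a failure.

```python
import csv
import io
from collections import defaultdict

ANI_THRESHOLD = 98.5  # minimum identity to the reference discussed in this thread


def min_ani_per_reference(fastani_tsv, isolates):
    """Return {reference: lowest ANI across all isolates}.

    fastani_tsv is the text of a FastANI output file. A pair that is
    absent from the output counts as 0.0, because FastANI omits pairs
    whose ANI is too low to estimate.
    """
    ani = defaultdict(dict)
    for row in csv.reader(io.StringIO(fastani_tsv), delimiter="\t"):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        query, reference, value = row[0], row[1], float(row[2])
        ani[reference][query] = value
    return {
        ref: min(hits.get(iso, 0.0) for iso in isolates)
        for ref, hits in ani.items()
    }


def usable_references(fastani_tsv, isolates, threshold=ANI_THRESHOLD):
    """List references that all isolates match at or above the threshold."""
    lowest = min_ani_per_reference(fastani_tsv, isolates)
    return sorted(ref for ref, value in lowest.items() if value >= threshold)
```

To use it, read the FastANI output with `open("fastani.tsv").read()` and pass the isolate paths exactly as they appear in the query list. If the returned list is empty, no single candidate covers all isolates, which is the case where per-clade references would come into play.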
I hope that helps! Let me know if you have any more questions.
Matt
Thanks for your valuable suggestions. FastANI showed that all 200 of my genomes are above the 98.5% similarity threshold. jk yin
Hello, thanks for the wonderful tool you have developed for exploring MGEs! As you mentioned in your manuscript, the choice of reference genome is important (isolates should share at least 98.5% nucleotide identity with the reference genome), so before running any analyses I would like to ask how I should choose my reference genome. I have downloaded ~200 genomes of one bacterial species, and the phylogenetic tree shows that they fall into five clades. I want to analyze the mobile genetic elements of the five clades and compare them.
I intend to perform the MGE analysis of the five clades separately using MGEfinder. However, I am a little confused about which genome should be used as the reference for each clade. Can you help me? Thanks in advance. Best, jk yin