veg / hyphy

HyPhy: Hypothesis testing using Phylogenies
http://www.hyphy.org

[GARD] Recombination on sliding windows #1534

Closed francicco closed 1 year ago

francicco commented 1 year ago

Hi,

I was wondering whether it makes sense to run GARD on non-overlapping sliding windows of a whole-genome alignment. The alignment includes both very closely related species and more distantly related ones.

I have never run these analyses before, so I'd appreciate some guidance.

Thanks a lot, F

spond commented 1 year ago

Dear @francicco,

How many sequences do you have and how long are the genomes?

Best, Sergei

francicco commented 1 year ago

It's about 60 species, but I can reduce that number, since the analysis seems very slow. The genome alignment should be around 450 Mb. F

spond commented 1 year ago

Dear @francicco,

GARD was developed with viral genomes / gene family situations in mind, so something like 1-50kb long. One way you can speed up computation for longer alignments is through the use of ENV="OPTIMIZE_SUMMATION_ORDER=0" as a command line argument, but that's not gonna make a major difference. I would also use --mode Faster.

Reducing the number of species to 20-30 is probably going to help as well (represent the closely related genomes with a few representatives), and using sliding windows of 10-20kb might be the way to go.
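A minimal sketch of the windowing idea, assuming the alignment is already loaded as a name-to-sequence mapping in which all sequences have the same length. The helper names (`window_ranges`, `slice_alignment`, `to_fasta`, `gard_command`) and the output file names are hypothetical. The `--mode Faster` and `ENV=OPTIMIZE_SUMMATION_ORDER=0` arguments are the ones mentioned above; the `hyphy gard --alignment <file>` form is an assumption about the local HyPhy installation and should be checked against its help output.

```python
from typing import Dict, Iterator, List, Tuple


def window_ranges(alignment_length: int, window_size: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) column ranges for non-overlapping windows."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    for start in range(0, alignment_length, window_size):
        # The last window may be shorter than window_size.
        yield start, min(start + window_size, alignment_length)


def slice_alignment(records: Dict[str, str], start: int, end: int) -> Dict[str, str]:
    """Cut the same column range out of every aligned sequence."""
    return {name: seq[start:end] for name, seq in records.items()}


def to_fasta(records: Dict[str, str]) -> str:
    """Serialize a name -> sequence mapping as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records.items())


def gard_command(alignment_path: str) -> List[str]:
    """Build an argument list for one GARD run (assumed CLI form, see lead-in)."""
    return [
        "hyphy", "gard",
        "--alignment", alignment_path,
        "--mode", "Faster",
        "ENV=OPTIMIZE_SUMMATION_ORDER=0",
    ]


if __name__ == "__main__":
    # Toy alignment for illustration only; real windows would be 10-20 kb.
    toy = {"sp1": "ACGTACGTAC", "sp2": "ACGTTCGTAC", "sp3": "ACGAACGTAA"}
    for start, end in window_ranges(10, 4):
        window = slice_alignment(toy, start, end)
        path = f"window_{start}_{end}.fasta"  # hypothetical file name
        print(path)
        print(to_fasta(window), end="")
        print(" ".join(gard_command(path)))
```

At 15 kb per window, a 450 Mb alignment yields 30,000 windows, so the individual runs would normally be distributed across a cluster rather than executed in a loop. A real script would also stream the alignment from disk instead of holding every genome in memory at once.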

Best, Sergei

francicco commented 1 year ago

Great tips! Thanks a lot!!! F
