francicco closed this issue 1 year ago
Dear @francicco,
How many sequences do you have and how long are the genomes?
Best, Sergei
It's about 60 species, but I can reduce that; it seems very, very slow. The genome alignment should be around 450 Mb. F
Dear @francicco,
GARD was developed with viral genomes / gene family situations in mind, so something like 1-50kb long. One way you can speed up computation for longer alignments is through the use of ENV="OPTIMIZE_SUMMATION_ORDER=0" as a command line argument, but that's not gonna make a major difference. I would also use --mode Faster.
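As a rough illustration of how those two options might be combined on one command line, here is a minimal Python sketch that only assembles the argument list. The helper name gard_command and the file name are hypothetical, and the --alignment keyword and the placement of the ENV argument after the analysis name are assumptions to check against your HyPhy version.

```python
from typing import List


def gard_command(alignment_path: str,
                 faster: bool = True,
                 disable_summation_order: bool = True) -> List[str]:
    """Build an argument list for a GARD run (hypothetical helper).

    Assumes a `hyphy` executable on PATH and that GARD accepts
    `--alignment`; verify both against your installation.
    """
    cmd = ["hyphy", "gard", "--alignment", alignment_path]
    if faster:
        # Search mode suggested in the thread.
        cmd += ["--mode", "Faster"]
    if disable_summation_order:
        # In a shell this is typed as ENV="OPTIMIZE_SUMMATION_ORDER=0";
        # no quotes are needed when passing an argument list directly.
        cmd.append("ENV=OPTIMIZE_SUMMATION_ORDER=0")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(gard_command("window_0001.fasta")))
```

The resulting list could be handed to subprocess.run(cmd, check=True) once the flags have been confirmed.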
Reducing the number of species to 20-30 is probably going to help as well (represent the closely related genomes with a few representatives), and using sliding windows of 10-20kb might be the way to go.
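The windowing idea could be sketched roughly as follows. This is a minimal, assumption-laden Python example: it assumes the alignment is in FASTA format with all sequences already aligned to the same length, and the function names (read_fasta, windows, to_fasta) and the 10 kb default are illustrative only, not part of HyPhy.

```python
from typing import Dict, Iterator, List, Tuple


def read_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {name: sequence}, keeping input order."""
    records: Dict[str, List[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # first token of the header
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def windows(alignment: Dict[str, str],
            size: int = 10_000) -> Iterator[Tuple[int, int, Dict[str, str]]]:
    """Yield (start, end, sub_alignment) for non-overlapping windows."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    total = lengths.pop()
    for start in range(0, total, size):
        end = min(start + size, total)  # last window may be shorter
        yield start, end, {n: s[start:end] for n, s in alignment.items()}


def to_fasta(alignment: Dict[str, str]) -> str:
    """Serialize {name: sequence} back to FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in alignment.items())


if __name__ == "__main__":
    toy = read_fasta(">sp1\nACGTACGT\n>sp2\nAC-TACGA\n")
    for start, end, sub in windows(toy, size=3):
        print(start, end, sub)
```

Each window can then be written to its own file and analyzed separately. Note that this sketch holds the whole alignment in memory, which is unlikely to be practical for a 450 Mb alignment of many species; a streaming or indexed approach would be needed at that scale.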
Best, Sergei
Great tips! Thanks a lot!!! F
Hi,
I was wondering if it makes any sense to run GARD on non-overlapping sliding windows of a whole-genome alignment. The alignment includes both very closely related species and more distantly related ones.
I have never run these analyses before, so I'd need some guidance.
Thanks a lot F