I had the same problem when I tried to compare 6000-to-6000 genomes. I ended up batching my analysis into 6000 x 1-to-6000 comparisons and then concatenating the results. If I remember correctly, each 1-to-6000 analysis (1 query, 6000 references) required ~80 GB of memory with default parameters. This also allowed me to parallelize the analysis on a big cluster (see the sketch below).
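A minimal sketch of that batching strategy, not taken from the fastANI repo; the list-file names, output directory, and thread count are just placeholders. Each run compares a single query against the full reference list, so each job only needs 1-vs-N memory, and the independent calls can be submitted as separate cluster jobs instead of run serially.

import subprocess
from pathlib import Path

ref_list = "reference.list"                       # one reference genome path per line
queries = Path("query.list").read_text().split()  # one query genome path per line
out_dir = Path("batched_out")
out_dir.mkdir(exist_ok=True)

for i, query in enumerate(queries):
    out_file = out_dir / f"ani_{i:05d}.tsv"
    # One query vs. all references per call; memory stays at the 1-vs-N level.
    subprocess.run(
        ["fastANI", "-q", query, "--refList", ref_list,
         "-t", "4", "-o", str(out_file)],
        check=True,
    )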
The resulting TSV is pretty easy to parse, so you can reconstruct the --matrix output afterwards.
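A rough sketch of that reconstruction step, assuming fastANI's standard five-column output (query, reference, ANI, mapped fragments, total fragments). It writes a plain tab-separated query x reference table rather than the exact PHYLIP-style layout that --matrix produces; pairs that fastANI dropped for falling below its reporting threshold are filled with NA.

import csv
import glob

ani = {}                       # (query, reference) -> ANI value
queries, refs = [], []
for tsv in sorted(glob.glob("batched_out/ani_*.tsv")):
    with open(tsv) as fh:
        for row in csv.reader(fh, delimiter="\t"):
            q, r, value = row[0], row[1], row[2]
            ani[(q, r)] = value
            if q not in queries:
                queries.append(q)
            if r not in refs:
                refs.append(r)

with open("ani_matrix.tsv", "w") as out:
    out.write("\t" + "\t".join(refs) + "\n")
    for q in queries:
        # NA marks pairs absent from the fastANI output.
        row = [ani.get((q, r), "NA") for r in refs]
        out.write(q + "\t" + "\t".join(row) + "\n")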
Thanks @nigiord, I think it will be possible. I will give it a try.
Please create a new issue if this was not resolved.
For a collection of 100,000 reference genomes, at least 700 GB of memory will be needed.
When I am comparing ~10,000 genomes on my server (24 cores, 128 GB RAM), the fastANI process gets aborted:
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)
What will the memory requirements be when I want to compare ~10,000 genomes? I am using fastANI v1.1. My command:
fastANI --refList reference.list --ql query.list -t 10 -o out --matrix