matsengrp / cft

Clonal family tree

Memory problems with dendropy implementation of minadcl_clusters.py #225

Closed by metasoarous 6 years ago

metasoarous commented 6 years ago

It seems that we're running out of memory as we build the distance matrix. I've improved the situation somewhat by taking more control over how jobs are submitted: setting memory requirements and requesting exclusive node access for these jobs. With more cluster tinkering, I may be able to get this to work (I've been in touch with scicomp about it).
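For a rough sense of scale, here is a back-of-the-envelope sketch of how a pairwise distance matrix grows with the number of leaves. It assumes 8-byte floats in a flat array, which is a lower bound; a matrix built from Python objects and dictionaries will typically cost far more per entry. The function names and leaf counts are made up for illustration and are not taken from minadcl_clusters.py.

```python
def dense_matrix_bytes(n_leaves, bytes_per_entry=8):
    """Bytes needed for a full n x n matrix of pairwise distances."""
    return n_leaves * n_leaves * bytes_per_entry


def condensed_matrix_bytes(n_leaves, bytes_per_entry=8):
    """Bytes needed when only the n*(n-1)/2 unordered pairs are stored."""
    return n_leaves * (n_leaves - 1) // 2 * bytes_per_entry


if __name__ == "__main__":
    # Illustrative leaf counts only; real cluster sizes will differ.
    for n in (1_000, 10_000, 50_000):
        dense_gb = dense_matrix_bytes(n) / 1e9
        condensed_gb = condensed_matrix_bytes(n) / 1e9
        print(f"{n:>6} leaves: dense {dense_gb:.3f} GB, condensed {condensed_gb:.3f} GB")
```

The quadratic growth is the point: going from 10,000 to 50,000 leaves multiplies the footprint by 25, so a memory request that works for most clusters can still fail on the largest ones.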

matsen commented 6 years ago

Ugh. Do you know about nw_distance from the Newick utilities? It would require an extra parsing step, but it might be worth trying out on your big trees.

metasoarous commented 6 years ago

Good idea! I'll give that a whirl and see how it does on memory. Thanks.

metasoarous commented 6 years ago

I may have mostly resolved this now by submitting the bigger jobs to the largenode partition. It turns out I wasn't doing this right initially: those nodes only become available when other constraints are added to the srun invocation. There are still a few jobs failing, but I'm having trouble finding them because the scons build log overflowed my tmux pane buffer :-/ So there's still some hunting to do before this is fully resolved, but we've at least got the laura-mb dataset built now.
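One way to hunt down the failing jobs without relying on the terminal scrollback is to capture the build output to a file and scan it afterwards. The sketch below assumes the log contains SCons's usual failure lines of the form `scons: *** [target] Error N`; the sample lines and target paths are hypothetical, not taken from the cft build.

```python
import re

# SCons reports a failed build step with a line such as:
#   scons: *** [path/to/target] Error 1
FAILURE_RE = re.compile(r"^scons: \*\*\* \[(?P<target>[^\]]+)\]\s*(?P<message>.*)$")


def failed_targets(log_lines):
    """Return (target, message) pairs for every SCons failure line found."""
    failures = []
    for line in log_lines:
        match = FAILURE_RE.match(line.strip())
        if match:
            failures.append((match.group("target"), match.group("message")))
    return failures


if __name__ == "__main__":
    # Hypothetical log excerpt; in practice, pass an open file handle instead.
    sample = [
        "scons: Building targets ...",
        "scons: *** [output/example-cluster/tree.nwk] Error 1",
        "scons: done building targets.",
    ]
    for target, message in failed_targets(sample):
        print(f"{target}\t{message}")
```

Since file objects iterate line by line, `failed_targets(open("build.log"))` would work on a log saved with something like `scons 2>&1 | tee build.log` (the file name here is just a placeholder).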

metasoarous commented 6 years ago

As clarified in #226, the last remaining issues aren't actually with minadcl_clusters.py but upstream of it, in the invocation of process_partis.py. All of the clusters that aren't hit by the process_partis.py issue now seem to be making it through minadcl_clusters.py as a result of 653b20b (push pending).