ekg / seqwish

alignment to variation graph inducer
MIT License

easy ways to speed up large parallel runs #46

Open ekg opened 4 years ago

ekg commented 4 years ago

There is still some low-hanging fruit.

This is from a merge of 25 assemblies of the same genome.

Overlap collection now seems to be working well, but other steps are still slow:

[seqwish::seqidx] 0.000 indexing sequences                                       
[seqwish::seqidx] 1117.704 index built                                           
[seqwish::alignments] 1117.704 processing alignments                             
[seqwish::alignments] 1632.284 indexing                                          
[seqwish::alignments] 7420.665 index built                                       
[seqwish::transclosure] 7420.690 computing transitive closures                   
[seqwish::transclosure] 7430.425 0.00% 0-100000000 overlap_collect               
[seqwish::transclosure] 7487.728 0.00% 0-100000000 overlaps_vector_merge         
[seqwish::transclosure] 7582.184 0.00% 0-100000000 rank_build                    
[seqwish::transclosure] 7765.759 0.00% 0-100000000 parallel_union_find           
[seqwish::transclosure] 7862.661 0.00% 0-100000000 dset_write                    
[seqwish::transclosure] 7890.029 0.00% 0-100000000 dset_compression              
[seqwish::transclosure] 7908.197 0.00% 0-100000000 dset_sort                     
[seqwish::transclosure] 7917.390 0.00% 0-100000000 dset_invert                   
[seqwish::transclosure] 7933.798 0.00% 0-100000000 graph_emission                
[seqwish::transclosure] 8967.516 3.44% 100000000-200517826 overlap_collect       
[seqwish::transclosure] 9040.887 3.44% 100000000-200517826 overlaps_vector_merge 
[seqwish::transclosure] 9140.550 3.44% 100000000-200517826 rank_build  
...
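
To make the slow steps easier to spot, here is a rough sketch that turns the excerpt above into per-step elapsed times. It is not part of seqwish. It assumes that each timestamp is in seconds and marks the moment the named step starts, so a step's duration is the gap to the next log line. The helper names (`parse`, `phase_label`, `durations`) are made up for illustration, and only the lines shown above are included; the elided remainder of the log is left out.

```python
import re
from collections import defaultdict

# Lines copied from the excerpt above (the elided "..." part is not reproduced).
LOG = """\
[seqwish::seqidx] 0.000 indexing sequences
[seqwish::seqidx] 1117.704 index built
[seqwish::alignments] 1117.704 processing alignments
[seqwish::alignments] 1632.284 indexing
[seqwish::alignments] 7420.665 index built
[seqwish::transclosure] 7420.690 computing transitive closures
[seqwish::transclosure] 7430.425 0.00% 0-100000000 overlap_collect
[seqwish::transclosure] 7487.728 0.00% 0-100000000 overlaps_vector_merge
[seqwish::transclosure] 7582.184 0.00% 0-100000000 rank_build
[seqwish::transclosure] 7765.759 0.00% 0-100000000 parallel_union_find
[seqwish::transclosure] 7862.661 0.00% 0-100000000 dset_write
[seqwish::transclosure] 7890.029 0.00% 0-100000000 dset_compression
[seqwish::transclosure] 7908.197 0.00% 0-100000000 dset_sort
[seqwish::transclosure] 7917.390 0.00% 0-100000000 dset_invert
[seqwish::transclosure] 7933.798 0.00% 0-100000000 graph_emission
[seqwish::transclosure] 8967.516 3.44% 100000000-200517826 overlap_collect
[seqwish::transclosure] 9040.887 3.44% 100000000-200517826 overlaps_vector_merge
[seqwish::transclosure] 9140.550 3.44% 100000000-200517826 rank_build
"""

# "[module] <seconds> <message>", tolerating trailing whitespace.
LINE_RE = re.compile(r"^\[(?P<module>[\w:]+)\]\s+(?P<t>\d+\.\d+)\s+(?P<msg>.*\S)\s*$")


def parse(log):
    """Return a list of (timestamp, module, message) tuples, in log order."""
    events = []
    for line in log.splitlines():
        m = LINE_RE.match(line)
        if m:
            events.append((float(m["t"]), m["module"], m["msg"]))
    return events


def phase_label(msg):
    """Reduce '0.00% 0-100000000 overlap_collect' to 'overlap_collect'."""
    parts = msg.split()
    if len(parts) == 3 and parts[0].endswith("%"):
        return parts[2]
    return msg


def durations(events):
    """Sum, per step, the gap between each line and the line that follows it.

    Assumption: a line marks the start of its step. The last line has no
    successor, so it contributes nothing.
    """
    totals = defaultdict(float)
    for (t0, module, msg), (t1, _, _) in zip(events, events[1:]):
        totals[f"{module} {phase_label(msg)}"] += t1 - t0
    return dict(totals)


totals = durations(parse(LOG))
for name, secs in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{secs:10.3f}  {name}")
```

Under that assumption, the excerpt attributes about 5788 s to the alignments `indexing` step, about 1118 s to sequence indexing, and about 1034 s to the first `graph_emission`, while the two `overlap_collect` passes together take about 131 s.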