bio-miga / miga

MiGA: Microbial Genomes Atlas
Artistic License 2.0

medoids tree takes too long #130

Open lmrodriguezr opened 3 years ago

lmrodriguezr commented 3 years ago

Building the medoid tree when externally querying a project takes too long because the entire AAI matrix is processed and then filtered for medoids. A better approach, when the number of medoids is much smaller than the number of reference datasets, would be to use the individual SQLite databases.

The relevant code is located at utils/distance/pipeline.rb (#build_medoids_tree). It would be great to build a matrix processor directly in lib/miga instead, which would also facilitate a CLI utility to extract (sub-)matrices.
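
To illustrate the idea, here is a minimal Ruby sketch of a medoid-only matrix extractor. It is not MiGA's actual code: `medoid_submatrix`, the `lookup` callable, and the toy AAI values are all hypothetical. `lookup` stands in for a query against one dataset's SQLite database, whose real schema is not described here. The sketch asks only for the k(k-1)/2 medoid pairs rather than scanning every pair in the full matrix.

```ruby
# Hypothetical sketch: build an AAI sub-matrix restricted to the medoids by
# requesting each medoid pair individually instead of filtering the full matrix.
# `lookup` is an assumed callable standing in for a per-dataset SQLite query;
# it returns the AAI (%) for a pair, or nil if that pair is not stored.
def medoid_submatrix(medoids, lookup)
  matrix = Hash.new { |h, k| h[k] = {} }
  medoids.each { |m| matrix[m][m] = 100.0 } # a dataset is identical to itself

  medoids.combination(2) do |a, b|
    # The value may be stored under either orientation of the pair
    aai = lookup.call(a, b) || lookup.call(b, a)
    next if aai.nil? # pair not available; leave the cell empty

    matrix[a][b] = aai
    matrix[b][a] = aai
  end
  matrix
end

# Toy stand-in for the individual databases (made-up values)
toy_aai = {
  %w[g1 g2] => 80.0,
  %w[g1 g3] => 60.0,
  %w[g3 g2] => 50.0,
  %w[g1 g4] => 90.0 # g4 is not a medoid, so it is never queried
}
lookup = ->(a, b) { toy_aai[[a, b]] }

sub = medoid_submatrix(%w[g1 g2 g3], lookup)
sub.each { |name, row| puts "#{name}: #{row.sort.to_h}" }
```

With k medoids and N reference datasets, this makes at most k(k-1) lookups regardless of N, which is where the gain would come from when k is much smaller than N. The same function could plausibly back a CLI utility that extracts arbitrary (sub-)matrices by passing any list of dataset names.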