The construction of medoid trees when externally querying a project takes too long because the entire AAI matrix is processed and filtered for medoids. A better approach (when the number of medoids << the number of reference datasets) would be to use the individual SQLite databases.
The relevant code is located at `utils/distance/pipeline.rb` (`#build_medoids_tree`). It would be great to build a matrix processor directly in `lib/miga` instead, which would also facilitate a CLI utility to extract (sub-)matrices.
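As a rough illustration of the idea (not the existing implementation), the sketch below builds a sub-matrix for a small set of datasets by looking up only their own stores instead of scanning the full matrix. Everything here is an assumption made for the example: the per-dataset SQLite databases are stood in for by plain Hashes, and `extract_submatrix`, the dataset names, and the AAI values are hypothetical.

```ruby
# Hypothetical sketch: each per-dataset store is modeled as a Hash
# (dataset name => { other dataset name => AAI value }). In practice
# these lookups would be queries against the individual SQLite databases.

# Build a sub-matrix restricted to `names` by reading only the stores of
# those datasets, so the cost grows with the number of requested datasets
# rather than with the size of the full matrix.
def extract_submatrix(stores, names)
  names.each_with_object({}) do |a, matrix|
    matrix[a] = names.each_with_object({}) do |b, row|
      next if a == b

      # A value may be stored in only one direction, so check both
      row[b] = stores.dig(a, b) || stores.dig(b, a)
    end
  end
end

# Made-up example data: two medoids, two other references, one query
stores = {
  'med_1' => { 'med_2' => 62.1, 'ref_1' => 95.3, 'ref_2' => 60.4 },
  'med_2' => { 'ref_1' => 61.0, 'ref_2' => 97.8 },
  'ref_1' => { 'ref_2' => 59.9 },
  'query' => { 'med_1' => 70.2, 'med_2' => 88.5 }
}

# Only the medoids and the query are requested; ref_* rows are never visited
sub = extract_submatrix(stores, %w[med_1 med_2 query])
```

The same function shape could back a CLI utility for extracting (sub-)matrices, since the list of requested datasets is the only input that changes.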