Refactor aggregate taxon model

State

Currently, the cerebro.taxa section of the Cerebro data model is a HashMap<taxid, Taxon>, where type taxid = String. This is a result of the aggregation function, which uses sequential HashMaps to group taxa by their taxid.
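
As a rough illustration of the current shape, the grouping step might look like the sketch below. The Taxon struct here is a hypothetical, simplified stand-in (the real type has more fields), and the aggregate function and its merge rule of summing read counts are assumptions made for the example, not the actual Cerebro implementation.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-in for the real Taxon type.
#[derive(Debug, Clone, PartialEq)]
pub struct Taxon {
    pub taxid: String,
    pub name: String,
    pub reads: u64,
}

// Group records by taxid. Records sharing a taxid are merged by summing
// their read counts (an assumed merge rule, for illustration only).
pub fn aggregate(records: Vec<Taxon>) -> HashMap<String, Taxon> {
    let mut taxa: HashMap<String, Taxon> = HashMap::new();
    for record in records {
        let reads = record.reads;
        taxa.entry(record.taxid.clone())
            .and_modify(|existing| existing.reads += reads)
            .or_insert(record);
    }
    taxa
}
```

Keying by taxid makes the merge step a constant-time lookup, which is presumably why the map ended up as the stored representation.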

Problem

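A HashMap cannot be queried efficiently with MongoDB aggregation pipelines: once serialized, the map becomes an embedded document keyed by taxid rather than an array, so array-oriented stages cannot be applied to it directly. Downstream applications, particularly the API endpoints, eventually use a Vec<Taxon> anyway.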

Refactoring to a Vec<Taxon> is necessary, but at this stage it may affect a number of dependent subsystems.
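
One low-impact way to approach the refactor could be to keep the HashMap inside the aggregation function and convert to a Vec<Taxon> only at the boundary, before the model is stored. The sketch below assumes a hypothetical, simplified Taxon struct and a hypothetical helper named taxa_to_vec; neither is taken from the Cerebro codebase.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-in for the real Taxon type.
#[derive(Debug, Clone, PartialEq)]
pub struct Taxon {
    pub taxid: String,
    pub name: String,
}

// Flatten the aggregated map into a Vec<Taxon>. HashMap iteration order is
// unspecified, so the result is sorted by taxid to make the output stable.
// Note the sort is lexical on the String taxid, not numeric.
pub fn taxa_to_vec(taxa: HashMap<String, Taxon>) -> Vec<Taxon> {
    let mut taxa: Vec<Taxon> = taxa.into_values().collect();
    taxa.sort_by(|a, b| a.taxid.cmp(&b.taxid));
    taxa
}
```

Because each Taxon already carries its own taxid in this sketch, dropping the map keys loses no information, and subsystems that still need keyed lookup could rebuild a map from the Vec on demand.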

esteinig / cerebro

Refactor aggregate taxon model #5