State
Currently, the `cerebro.taxa` section of the `Cerebro` data model is a `HashMap<taxid, Taxon>`, where `type taxid = String`. This is a result of the aggregation function, which uses sequential `HashMap`s to group taxa by their `taxid`.
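A minimal sketch of how such a map might come out of the aggregation step. The `Taxon` fields and the `aggregate` function below are hypothetical stand-ins, not the actual Cerebro definitions; only the `HashMap<taxid, Taxon>` shape and the `taxid` alias are taken from the description above.

```rust
use std::collections::HashMap;

// Alias as given in the description; the lowercase name mirrors the source.
#[allow(non_camel_case_types)]
type taxid = String;

// Hypothetical stand-in for the real `Taxon` struct: only the fields
// needed for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Taxon {
    taxid: taxid,
    name: String,
    evidence: Vec<String>,
}

/// Group flat `(taxid, name, evidence)` records under one `Taxon` per taxid.
/// This is an assumed simplification of the aggregation function.
fn aggregate(records: &[(&str, &str, &str)]) -> HashMap<taxid, Taxon> {
    let mut taxa: HashMap<taxid, Taxon> = HashMap::new();
    for (id, name, evidence) in records {
        taxa.entry(id.to_string())
            // Create the taxon the first time its taxid is seen.
            .or_insert_with(|| Taxon {
                taxid: id.to_string(),
                name: name.to_string(),
                evidence: Vec::new(),
            })
            // Append this record's evidence to the grouped taxon.
            .evidence
            .push(evidence.to_string());
    }
    taxa
}
```

Grouping through `entry` is convenient while aggregating, which is likely why the map shape carried over into the stored model.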
Problem
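A `HashMap` cannot be queried efficiently with MongoDB aggregation pipelines: a map stored as an embedded document has one field per taxid, so stages such as `$unwind` and `$match` cannot address the taxa without first converting the document with `$objectToArray`. Downstream applications, particularly the API endpoints, eventually use a `Vec<Taxon>` anyway.
Refactoring to `Vec<Taxon>` is necessary, but at this stage it may affect a number of dependent subsystems.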
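As a rough illustration of the refactor, the map could be flattened into a `Vec<Taxon>` at the boundary where the model is stored or served. The `Taxon` fields and the `taxa_to_vec` helper are hypothetical; the sketch assumes each `Taxon` has (or can be given) a `taxid` field so the key is not lost when the map is dropped.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the real `Taxon` struct.
#[derive(Debug, Clone, PartialEq)]
struct Taxon {
    taxid: String,
    name: String,
}

/// Flatten the taxid-keyed map into a vector.
/// The map key is copied into each taxon so no information is lost, and the
/// result is sorted by taxid because `HashMap` iteration order is unspecified.
fn taxa_to_vec(taxa: HashMap<String, Taxon>) -> Vec<Taxon> {
    let mut list: Vec<Taxon> = taxa
        .into_iter()
        .map(|(id, mut taxon)| {
            taxon.taxid = id;
            taxon
        })
        .collect();
    list.sort_by(|a, b| a.taxid.cmp(&b.taxid));
    list
}
```

Converting at a single boundary like this would let the aggregation code keep its internal `HashMap` while dependent subsystems migrate to the vector form one at a time.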