bokulich-lab / RESCRIPt

REference Sequence annotation and CuRatIon Pipeline
BSD 3-Clause "New" or "Revised" License

ENH: refactor + parallelize the `evaluate-fit-classifier` and cross-validate pipelines #131

Open nbokulich opened 2 years ago

nbokulich commented 2 years ago

Some re-organization is in order. In particular, fitting could be parallelized across folds in the CV action, so that the folds are fit simultaneously instead of serially. Fitting is the time-consuming step and cannot be parallelized within a fold, so running the folds one after another is quite inefficient.
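As a rough illustration of the idea (not RESCRIPt's actual code), the per-fold work could be dispatched to a worker pool. Everything named here is hypothetical: `make_folds`, `fit_and_evaluate_fold`, `cross_validate`, and the `n_jobs` and `executor_cls` parameters are stand-ins, and the per-fold function only reports partition sizes where a real pipeline would fit a classifier and classify the held-out sequences.

```python
from concurrent.futures import ProcessPoolExecutor


def make_folds(ids, k):
    """Split ids into k (train, test) partitions (hypothetical helper)."""
    ids = list(ids)
    folds = []
    for i in range(k):
        test = ids[i::k]  # every k-th id, starting at offset i
        held_out = set(test)
        train = [x for x in ids if x not in held_out]
        folds.append((train, test))
    return folds


def fit_and_evaluate_fold(fold):
    """Placeholder for the expensive per-fold step.

    A real implementation would fit a classifier on `train` and classify
    `test`; this stand-in only returns the partition sizes.
    """
    train, test = fold
    return {"n_train": len(train), "n_test": len(test)}


def cross_validate(ids, k=3, n_jobs=1, executor_cls=ProcessPoolExecutor):
    """Run the per-fold step serially (n_jobs=1) or across a worker pool."""
    folds = make_folds(ids, k)
    if n_jobs == 1:
        return [fit_and_evaluate_fold(f) for f in folds]
    # Folds are independent, so each one can go to its own worker.
    # pool.map returns results in the same order as `folds`.
    with executor_cls(max_workers=min(n_jobs, k)) as pool:
        return list(pool.map(fit_and_evaluate_fold, folds))
```

Since a single fold cannot be split further, the useful number of workers is capped at `k`. With the default process pool, the call should sit under an `if __name__ == "__main__":` guard on platforms that spawn worker processes.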

An experimental vsearch leave-one-out (LOO) classifier could also be considered for re-addition in this context. It was removed in this pull request: https://github.com/bokulich-lab/RESCRIPt/pull/130