claczny / busybee_web

Repository for BusyBee Web - Web-based deconvolution of metagenomic data by bootstrapped supervised binning

Run "bootstrapped supervised binning"-model on unseen data #1

Open claczny opened 7 years ago

claczny commented 7 years ago

Bootstrapped supervised binning internally builds a classification model to accelerate the binning process. More specifically, only the "cluster points", i.e., a subset of the input sequences, are used to train this model. The trained model is then used to predict a bin assignment for the remaining data.
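To make the train-on-cluster-points, predict-the-rest flow concrete, here is a minimal sketch. It is not BusyBee Web's actual code: a plain nearest-centroid classifier stands in for whatever model is built internally, and the 2D coordinates, the bin labels, and the names `train_centroids` and `predict_bin` are all made up for illustration.

```python
from math import dist
from statistics import fmean


def train_centroids(points, labels):
    """Fit a nearest-centroid model: one mean vector per bin label."""
    by_bin = {}
    for point, label in zip(points, labels):
        by_bin.setdefault(label, []).append(point)
    # Average each coordinate axis over the members of a bin.
    return {
        label: tuple(fmean(axis) for axis in zip(*members))
        for label, members in by_bin.items()
    }


def predict_bin(model, point):
    """Return the label of the centroid closest to the given point."""
    return min(model, key=lambda label: dist(model[label], point))


# Hypothetical "cluster points" with their bin labels (assumed data).
cluster_points = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
cluster_bins = ["bin_1", "bin_1", "bin_2", "bin_2"]

# Hypothetical remaining points (e.g., border points) from the same input.
remaining_points = [(0.4, 0.3), (4.6, 5.2)]

# Train only on the cluster points, then predict bins for everything else.
model = train_centroids(cluster_points, cluster_bins)
assignments = [predict_bin(model, p) for p in remaining_points]
print(assignments)
```

The point of the sketch is only the split: the model sees the cluster points during training and nothing else, and every other point gets its bin purely by prediction.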

Currently, all of this is performed on the same input, i.e., all the data must be available during the entire computation. However, it would be nice to be able to apply the trained model to "unseen" data, i.e., data that was not provided as input at all (not even in the form of "non-cluster" points, e.g., border points or remaining points).
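One possible shape for this, as a rough sketch only: persist the trained model once, then load it in a later run and predict bins for data that was never part of the original input. Everything here is an assumption for illustration, including the toy centroid model, the JSON file format, the file name `binning_model.json`, and the helper names `save_model`, `load_model`, and `predict_bin`. None of it reflects how BusyBee Web stores or applies its model.

```python
import json
import os
import tempfile
from math import dist


def save_model(model, path):
    """Write the model (bin label -> centroid coordinates) to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(model, handle)


def load_model(path):
    """Read a model back; JSON lists are converted to tuples again."""
    with open(path, encoding="utf-8") as handle:
        return {label: tuple(coords) for label, coords in json.load(handle).items()}


def predict_bin(model, point):
    """Return the label of the centroid closest to the given point."""
    return min(model, key=lambda label: dist(model[label], point))


# Hypothetical model, as it might look after training on the cluster points.
trained_model = {"bin_1": (0.1, 0.05), "bin_2": (5.05, 4.95)}

with tempfile.TemporaryDirectory() as workdir:
    model_path = os.path.join(workdir, "binning_model.json")
    save_model(trained_model, model_path)  # end of the original run
    restored = load_model(model_path)      # start of a later, separate run

# Hypothetical unseen points that were never part of the original input.
unseen_points = [(0.3, -0.2), (4.8, 5.3)]
unseen_bins = [predict_bin(restored, p) for p in unseen_points]
print(unseen_bins)
```

A practical caveat: the unseen sequences would have to be turned into the same feature representation the model was trained on, otherwise the predictions are not meaningful.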