Currently the data clustering implementation in the bnpy-based apps is functional, but it can be slow to converge when too much training data is added. It can also be awkward to re-cluster after the model is updated. To support automatic region-of-interest selection in Peregrine, the clustering implementation should be sped up. Some ideas for enhancements:
use a representative sample for training instead of the entire dataset
instead of training and clustering for each part sequentially, create training data based on all parts first and then cluster
create a Bnpy subclass of the base App class to streamline the apps themselves
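
The first two ideas could be combined roughly as follows. This is only a sketch: it assumes each part's training data is available as a 2-D NumPy array of feature rows, and the function name `build_training_sample`, the `max_rows` cap, and the part names are hypothetical, not taken from the existing apps. It also uses plain uniform sampling, which may under-represent rare regions; a stratified scheme might be needed in practice.

```python
import numpy as np


def build_training_sample(part_features, max_rows=5000, seed=0):
    """Pool feature rows from all parts, then draw a uniform random subsample.

    part_features: dict mapping a part name to an (n_rows, n_features) array.
    max_rows: hypothetical cap on the number of rows passed to training.
    """
    rng = np.random.default_rng(seed)
    # Build the training data from all parts first, instead of part by part.
    pooled = np.vstack(list(part_features.values()))
    if pooled.shape[0] <= max_rows:
        return pooled
    # Sample without replacement so no row is duplicated in the training set.
    idx = rng.choice(pooled.shape[0], size=max_rows, replace=False)
    return pooled[idx]


# Example with synthetic data standing in for per-part features.
demo_rng = np.random.default_rng(1)
parts = {
    "part_a": demo_rng.normal(size=(4000, 3)),
    "part_b": demo_rng.normal(size=(6000, 3)),
}
sample = build_training_sample(parts, max_rows=5000)
small = build_training_sample({"part_c": np.zeros((10, 3))}, max_rows=5000)
```

The resulting array would then be handed to the clustering step once, rather than training and clustering sequentially for each part.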
Removing this from the Peregrine demonstration milestone, because of the capability implemented in #25. Updating the bnpy app is still planned, but lower priority for the moment.