ORNL-MDF / Myna

Framework to facilitate simulation workflows and experimental databases for additive manufacturing

BSD 3-Clause "New" or "Revised" License

Refactor bnpy data clustering apps #24

Open gknapp1 opened 2 weeks ago

gknapp1 commented 2 weeks ago

Currently the data clustering implementation in the bnpy-based apps is functional, but it can be slow to converge when too much training data is added. It can also be awkward to re-cluster after the model is updated. To support automatic region-of-interest selection in Peregrine, the clustering implementation should be sped up. Some ideas for enhancements:

- use a representative sample for training instead of the entire dataset
- instead of training and clustering for each part sequentially, create training data based on all parts first and then cluster
- create a Bnpy subclass of the base App class to streamline the apps themselves
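
A minimal sketch of how these three ideas could fit together, under stated assumptions. Everything named here is hypothetical and not taken from the Myna code base: `pool_training_rows`, `BnpyApp`, the `max_rows_per_part` parameter, and the stand-in `App` class are illustrative only. Uniform random sampling is used as one simple way to get a "representative sample"; a stratified scheme may suit the data better. No bnpy calls are shown, so the actual model fitting is left abstract.

```python
import random
from typing import Dict, List, Sequence

Row = Sequence[float]


def pool_training_rows(
    parts: Dict[str, List[Row]], max_rows_per_part: int, seed: int = 0
) -> List[Row]:
    """Pool a uniform random subsample of rows from every part.

    Hypothetical helper: draws at most `max_rows_per_part` rows from each
    part so that training uses one shared, bounded dataset instead of the
    full data of each part in sequence.
    """
    rng = random.Random(seed)  # seeded so the sample is reproducible
    pooled: List[Row] = []
    for name in sorted(parts):  # fixed order keeps results deterministic
        rows = parts[name]
        k = min(max_rows_per_part, len(rows))
        pooled.extend(rng.sample(rows, k))
    return pooled


class App:
    """Stand-in for the base app class (assumption, not Myna's real class)."""

    def __init__(self, name: str) -> None:
        self.name = name


class BnpyApp(App):
    """Hypothetical shared base class for the bnpy-based apps."""

    def __init__(self, name: str, max_rows_per_part: int = 1000, seed: int = 0) -> None:
        super().__init__(name)
        self.max_rows_per_part = max_rows_per_part
        self.seed = seed

    def build_training_data(self, parts: Dict[str, List[Row]]) -> List[Row]:
        # Step 1: build training data from all parts before any clustering.
        return pool_training_rows(parts, self.max_rows_per_part, self.seed)

    def train(self, rows: List[Row]) -> None:
        # Step 2: fit the model once on the pooled sample.
        # Left abstract here; a concrete app would call bnpy at this point.
        raise NotImplementedError
```

With this layout, each concrete app would only implement `train` (and a matching cluster step), while sampling and pooling live in one place.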

gknapp1 commented 1 week ago

Removing this from the Peregrine demonstration milestone because of the capability implemented in #25. Updating the bnpy app is still planned, but it is a lower priority for the moment.