Object classifier training idea: sample "representative" objects from clusters

haesleinhuepf / napari-accelerated-pixel-and-object-classification

GPU-accelerated, OpenCL-based Random Forest Classifiers for pixel and labeled object classification in napari.

BSD 3-Clause "New" or "Revised" License


Object classifier training idea: sample "representative" objects from clusters #10

Closed kevinyamauchi closed 2 years ago

kevinyamauchi commented 2 years ago

Hello @haesleinhuepf ! What do you think about adding a training mode for the object classifier where the user is shown some representative objects from clusters computed from the features that will be used for classification? My thought is that this could help the user choose which objects to label. Something like the following:

1. Cluster objects in the feature space used for classification.
2. For each of the M clusters, select N objects (where N is a user-settable parameter) to be labeled.
3. Show the user the M * N objects and have them label each one.
4. Train the classifier with the labels added in (3).

In terms of implementation, an initial PR could introduce (2) and (3) (i.e., leave the clustering to be done separately). The cluster membership would have to be a column in the feature table (e.g., a column of integers called "cluster" and the value indicates cluster membership).
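To make steps (2) and (3) concrete, here is a minimal sketch of the sampling part only. It assumes the feature table is a plain dict mapping column names to equal-length lists and that cluster membership lives in an integer column named "cluster", as suggested above. The function name `sample_per_cluster`, the `seed` parameter, and the example table are hypothetical and not part of this plugin or of apoc.

```python
import random
from collections import defaultdict


def sample_per_cluster(table, n_per_cluster, cluster_column="cluster", seed=0):
    """Return {cluster_id: row indices} with up to n_per_cluster rows each.

    Hypothetical helper: `table` is assumed to be a dict of column name ->
    list of values, one entry per labeled object.
    """
    rng = random.Random(seed)

    # Group row indices by the cluster id stored in the cluster column.
    rows_by_cluster = defaultdict(list)
    for row_index, cluster_id in enumerate(table[cluster_column]):
        rows_by_cluster[cluster_id].append(row_index)

    # Draw N rows per cluster without replacement; small clusters
    # contribute all of their rows instead of raising an error.
    selected = {}
    for cluster_id, rows in sorted(rows_by_cluster.items()):
        k = min(n_per_cluster, len(rows))
        selected[cluster_id] = sorted(rng.sample(rows, k))
    return selected


# Hypothetical feature table: one row per labeled object.
table = {
    "label": [1, 2, 3, 4, 5, 6, 7],
    "area": [10.0, 12.0, 11.0, 50.0, 55.0, 52.0, 300.0],
    "cluster": [0, 0, 0, 1, 1, 1, 2],
}

to_label = sample_per_cluster(table, n_per_cluster=2)
for cluster_id, rows in to_label.items():
    # These are the objects that would be shown to the user for labeling.
    print(cluster_id, [table["label"][r] for r in rows])
```

Uniform random sampling within each cluster is only one possible reading of "representative"; picking the objects closest to each cluster centre would be another.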

What do you think? I would be happy to make a PR if you think it would be useful.

haesleinhuepf commented 2 years ago

Hi @kevinyamauchi ,

it sounds very interesting and in line with the work we do between this plugin and the napari-clusters-plotter. Before you send a PR with a front-end here, I'm wondering if it may make sense to add some functionality to the back-end first and a notebook demonstrating the functionality you're aiming at. Also this issue appears to be related to your suggestion:

https://github.com/haesleinhuepf/apoc/issues/8

Also just a note, I'm aiming at having at least the prediction here to also work on ImageJ side. It's not a strong-must, but at least something to consider.

haesleinhuepf commented 2 years ago

> leave the clustering to be done separately). The cluster membership would have to be a column in the feature table (e.g., a column of integers called "cluster" and the value indicates cluster membership).

One more thing: Clustering already exists in the napari-clusters-plotter. We just use the column name <ALGORITHM_NAME>_CLUSTERING_ID, and you can have multiple of those. Just in case you're already developing something elsewhere, I'd be happy to see our stuff reused. And we're happy about feedback. ;-)
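As a rough illustration of that naming convention, the sketch below picks out the clustering columns from a list of table column names. The helper `find_cluster_columns` and the example algorithm names are hypothetical placeholders. Only the `_CLUSTERING_ID` suffix comes from the description above.

```python
CLUSTER_SUFFIX = "_CLUSTERING_ID"


def find_cluster_columns(column_names):
    """Map algorithm name -> column name for columns ending in the suffix."""
    found = {}
    for name in column_names:
        # Require a non-empty algorithm prefix before the suffix.
        if name.endswith(CLUSTER_SUFFIX) and len(name) > len(CLUSTER_SUFFIX):
            algorithm = name[: -len(CLUSTER_SUFFIX)]
            found[algorithm] = name
    return found


# Hypothetical column names; the algorithm prefixes are placeholders.
columns = ["label", "area", "KMEANS_CLUSTERING_ID", "HDBSCAN_CLUSTERING_ID"]
cluster_columns = find_cluster_columns(columns)
print(cluster_columns)
```

Because a table can hold several such columns, a sampling front-end would presumably let the user choose which one to draw objects from.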

CC @lazigu @Cryaaa

kevinyamauchi commented 2 years ago

Thanks for the quick reply @haesleinhuepf ! I am a big fan of napari-clusters-plotter and I have definitely been using it (thanks, @lazigu and @cryaaa!).

I will move the discussion of prediction/training from tables over to https://github.com/haesleinhuepf/apoc/issues/8. I understand the desire to maintain cross-compatibility with ImageJ. That may end up being too restrictive for my use case, but we can discuss it on the apoc repo.

haesleinhuepf commented 2 years ago

Done in #12. Thanks for the inspiration and support @kevinyamauchi !