NERC-CEH / plankton_ml

A project for image processing and analysis pipelines for plankton sampling
GNU General Public License v3.0

Apply model explainability tools to the images output by similarity search #6

Closed: metazool closed this 2 months ago

metazool commented 4 months ago

Explore model explainability techniques using the prediction capabilities of the CEFAS model, as a complement to using it as a source of embeddings.

For example, we take the images returned by a similarity search over the embeddings, make predictions on them with the original model, and look at which visual features influenced those predictions.

SHAP and LIME are the ones I'm familiar with, but there's a whole toolbox in the Captum API. Suggestions of approaches that worked well during development of AMI-system would be appreciated, @albags!

metazool commented 4 months ago

A quick note on this, as I may not make time this week to finish the branch (to the extent that it's worth finishing).

Initial output was a lot more inconclusive than I'd hoped for. There could be a range of reasons, including:

  1. the plankton-cefas ResNet model is undercooked (is there any info about how it was trained?)
  2. its classification mode is a poor fit for our data (unsurprisingly)
  3. we're missing a normalisation step for the input and that's throwing things off (are there worked examples?)
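
On the third point, a minimal sketch of what such a normalisation step might look like. It uses the ImageNet channel statistics that torchvision's pretrained ResNets are trained with; whether the plankton-cefas model expects the same values is an assumption that would need checking against how it was trained. The function names `normalise` and `to_chw` are illustrative only.

```python
import numpy as np

# ImageNet per-channel mean and standard deviation (RGB order), as used by
# torchvision's pretrained ResNets. ASSUMPTION: not confirmed for the CEFAS model.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def normalise(image_uint8):
    """Scale an HxWx3 uint8 image to [0, 1], then standardise each channel."""
    scaled = image_uint8.astype(np.float32) / 255.0
    return (scaled - IMAGENET_MEAN) / IMAGENET_STD


def to_chw(image_hwc):
    """Reorder HxWxC to CxHxW, the layout PyTorch models take."""
    return np.transpose(image_hwc, (2, 0, 1))
```

If the attributions change noticeably with and without this step, that would point at input scaling rather than the model itself.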

It's worth running the same attempted interpretations over a CEFAS plankton test set before drawing any conclusions. Beyond that, this doesn't seem worth pursuing much further, because using the scivision model for classification was never the intention; this was only meant to throw light on how and why it seems to work pretty well for feature extraction.

It's also worth going back a step to extract and compare embeddings using different networks: a generic ImageNet-type ResNet50 that's never specifically looked at plankton, and a default network as a sense check.
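
One way to make that comparison concrete, as a rough sketch: embed the same images with each network, then measure how many of each image's nearest neighbours the two embedding spaces agree on. The helper names below are hypothetical, and the inputs are assumed to be `(n_images, n_features)` arrays with no all-zero rows.

```python
import numpy as np


def top_k_neighbours(embeddings, k):
    """Indices of the k most cosine-similar rows for each row, excluding itself."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = unit @ unit.T
    np.fill_diagonal(sims, -np.inf)  # never return an image as its own neighbour
    return np.argsort(-sims, axis=1)[:, :k]


def neighbour_overlap(emb_a, emb_b, k=5):
    """Mean fraction of top-k neighbours shared between two embedding spaces."""
    nn_a = top_k_neighbours(emb_a, k)
    nn_b = top_k_neighbours(emb_b, k)
    shared = [len(set(a) & set(b)) / k for a, b in zip(nn_a, nn_b)]
    return float(np.mean(shared))
```

The two arrays only need the same number of rows, in the same image order; the feature dimensions can differ. A score near 1 would suggest the plankton-specific model retrieves much the same neighbours as the generic one, and a low score that the two networks organise the images differently.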

Short video dataviz of occlusion output. Most of the other methods I tried were even more garbled. We should expect to see much more consistency here.
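
For reference, the idea behind occlusion can be sketched in a few lines: mask one patch of the image at a time and record how much the score for the target class drops. This is a framework-free illustration, not the code from the branch; `predict_fn` is a hypothetical callable that takes an image array and returns per-class scores. For a PyTorch model, Captum's `captum.attr.Occlusion` provides a ready-made version.

```python
import numpy as np


def occlusion_map(image, predict_fn, target, patch=8, stride=8, fill=0.0):
    """Per-pixel drop in the target score when the covering patch is masked."""
    height, width = image.shape[:2]
    baseline = predict_fn(image)[target]
    heat = np.zeros((height, width), dtype=np.float64)
    counts = np.zeros((height, width), dtype=np.float64)
    for top in range(0, height, stride):
        for left in range(0, width, stride):
            occluded = image.copy()
            occluded[top:top + patch, left:left + patch] = fill
            # A large drop means the model relied on this patch.
            drop = baseline - predict_fn(occluded)[target]
            heat[top:top + patch, left:left + patch] += drop
            counts[top:top + patch, left:left + patch] += 1
    # Average overlapping patches; pixels never covered stay at zero.
    return heat / np.maximum(counts, 1)
```

With a well-behaved classifier the high values should cluster on the organism rather than the background, which is the consistency the video is being checked for.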

metazool commented 2 months ago

I was on the point of closing https://github.com/NERC-CEH/plankton_ml/pull/7 because:

  1. The results were unhelpfully inconclusive (image size, model maturity, other?)
  2. Subsequent refactoring would now involve an overhaul of the code for work we don't particularly need

It's a useful line in the sand though. Low-priority but still actionable?

metazool commented 2 months ago

Closed this along with #7; see comments there.