biigle / largo

:m: BIIGLE module to review image annotations in a regular grid
GNU General Public License v3.0

Outlier detection #88

Closed by mzur 8 months ago

mzur commented 3 years ago

Use the mechanism implemented in biigle/core#336 to sort Largo patches by "similarity". Depending on how fast the sorting method turns out to be, this could be done on demand at the click of a button (with the user waiting a few seconds for the result). The sorting cannot be precomputed because the annotations can change at any time.

mzur commented 3 years ago

The mechanism proposed in biigle/core#336 can't be easily transferred to Largo. Instead, we should find an algorithm to quickly detect and highlight outliers for a given label selected in Largo. These outliers can be shown first in the grid so they can be quickly selected and dismissed. There is no need for a general "similarity sorting".

mzur commented 3 years ago

Idea (no clue if it could work): Construct a "prototype" image hash (e.g. the mean bit vector of all hashes) and then compute the distance between the hash of each Largo patch and the prototype hash. The patches with the largest distance are most likely to be outliers. This also heavily depends on the image hashing algorithm that is used.
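
A minimal sketch of this idea, assuming each hash is available as a list of 0/1 bits of equal length. The function names and the choice of an L1 distance to the mean bit vector are illustrative assumptions, not something BIIGLE or Largo provides:

```python
from typing import List, Sequence


def prototype_hash(hashes: Sequence[Sequence[int]]) -> List[float]:
    """Mean bit vector: for each bit position, the fraction of hashes with that bit set."""
    n = len(hashes)
    return [sum(bits) / n for bits in zip(*hashes)]


def distance_to_prototype(bits: Sequence[int], prototype: Sequence[float]) -> float:
    """L1 distance between one hash and the prototype (assumed metric)."""
    return sum(abs(b - p) for b, p in zip(bits, prototype))


def rank_outliers(hashes: Sequence[Sequence[int]]) -> List[int]:
    """Return patch indices sorted so the most likely outliers come first."""
    proto = prototype_hash(hashes)
    distances = [distance_to_prototype(h, proto) for h in hashes]
    return sorted(range(len(hashes)), key=lambda i: distances[i], reverse=True)


# Hypothetical 4-bit hashes; the last patch differs from the other three.
example_hashes = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
ranking = rank_outliers(example_hashes)
```

Here `ranking[0]` is the index of the patch farthest from the prototype. Whether that patch is a real outlier still depends on how well the hashing algorithm captures visual similarity.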

tschoeni commented 3 years ago

If you treat the hash as a feature vector, you could also use one or more MPEG-7 descriptors. See also: https://marine-imaging.com/fair/ifdos/iFDO-content/

mzur commented 3 years ago

Thanks for pointing that out. We've experimented with image hashes that are used for image retrieval (i.e. "find the images most similar to a given image"). Any method should work that allows you to compute a distance between hashes/feature vectors.

mzur commented 11 months ago

The outlier detection will be implemented in the same way as https://github.com/biigle/maia/pull/128 in MAIA, except that the sorting is reversed in the "dismiss" step (i.e. the most dissimilar patches are shown first). In the "relabel" step, the sorting is regular (i.e. the most similar patches are shown first). Later, when https://github.com/biigle/largo/issues/97 is implemented, users will also be able to "reverse" the sorting again.
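
The step-dependent ordering can be sketched as follows. This is only an illustration: it assumes a distance per patch has already been computed by some similarity method, and `order_patches` and the patch IDs are hypothetical names, not part of Largo or MAIA:

```python
from typing import Dict, List


def order_patches(distances: Dict[int, float], step: str) -> List[int]:
    """Order patch IDs by distance, with the direction depending on the Largo step.

    "dismiss": most dissimilar patches first (largest distance first).
    "relabel": most similar patches first (smallest distance first).
    """
    if step not in ("dismiss", "relabel"):
        raise ValueError(f"unknown step: {step!r}")
    return sorted(distances, key=distances.get, reverse=(step == "dismiss"))


# Hypothetical patch IDs mapped to their distances.
example_distances = {10: 0.1, 11: 0.9, 12: 0.4}
dismiss_order = order_patches(example_distances, "dismiss")  # [11, 12, 10]
relabel_order = order_patches(example_distances, "relabel")  # [10, 12, 11]
```

A user-facing "reverse" option would then just flip the `reverse` flag for the current step.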