inaturalist / vision-camera-plugin-inatvision

react-native-vision-camera frame processor plugin
MIT License

Non-human exclusion #5

Open albullington opened 3 years ago

albullington commented 3 years ago

Is your feature request related to a problem? Please describe.
The AR camera sometimes identifies humans as non-humans, which often isn't a positive experience for the end user.

Describe the solution you'd like
Tackle this somewhat differently from how we tackled it on the web, since RN-inat-camera doesn't have a concept of a common ancestor.

Ideally, a solution would also address the concept of uncertainty similar to the common ancestor approach in iNatAPI, where if the distribution of scores among top results is relatively even & human is in the mix, we assume the result might be human and return nothing.
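A minimal sketch of that uncertainty check, assuming the native layer can see a flat list of scored results. The `ScoredTaxon` shape, the function name, and the `topN` and `evenness` defaults are all hypothetical and untuned; none of them come from iNatAPI or the plugin.

```typescript
// Hypothetical shape of a single vision result; not taken from the plugin.
interface ScoredTaxon {
  name: string;
  score: number;
}

// Returns true when human is among the top `topN` results and the scores in
// that group are relatively even (lowest / highest >= `evenness`).
// `topN` and `evenness` are illustrative values, not tuned thresholds.
function isAmbiguousWithHuman(
  results: ScoredTaxon[],
  humanName = "Homo",
  topN = 3,
  evenness = 0.5,
): boolean {
  // Copy before sorting so the caller's array is left untouched.
  const top = [...results].sort((a, b) => b.score - a.score).slice(0, topN);
  // A single result cannot be "evenly distributed" with anything.
  if (top.length < 2 || !top.some((r) => r.name === humanName)) {
    return false;
  }
  const highest = top[0].score;
  const lowest = top[top.length - 1].score;
  return highest > 0 && lowest / highest >= evenness;
}
```

When this returns true, the caller would return nothing rather than a non-human suggestion.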

Describe alternatives you've considered
None

Additional context
The original non-human exclusion for iNatAPI is in this commit. Since React Native only ever receives a single branch, non-human exclusion needs to be tackled in native code.

kueda commented 3 years ago

The online server results should already be doing this (that's what the PR in inatAPI was all about), so in theory you should only have to worry about offline. One potential challenge is that the app will need to either hard-code the ID of genus Homo or look it up in the taxonomy file. The latter might be better b/c it might change from model to model.
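To illustrate the lookup option, here is a rough sketch that finds a taxon ID by name in a CSV taxonomy. The column names `taxon_id` and `name`, the comma-separated layout with a header row, and the function name are assumptions; the real taxonomy file shipped with a model may look different, and the real lookup would live in native code rather than TypeScript.

```typescript
// Assumes a CSV with a header row that has `taxon_id` and `name` columns.
// Uses a naive comma split, so quoted fields containing commas are not handled.
function findTaxonId(csv: string, name: string): number | undefined {
  const lines = csv.trim().split(/\r?\n/);
  const header = lines[0].split(",").map((h) => h.trim());
  const idCol = header.indexOf("taxon_id");
  const nameCol = header.indexOf("name");
  if (idCol < 0 || nameCol < 0) {
    return undefined;
  }
  for (const line of lines.slice(1)) {
    const cells = line.split(",");
    if (cells[nameCol]?.trim() === name) {
      const raw = cells[idCol]?.trim();
      const id = Number(raw);
      // An empty or non-numeric ID cell counts as "not found".
      return raw && Number.isFinite(id) ? id : undefined;
    }
  }
  return undefined;
}
```

Looking the ID up once when a model is loaded, instead of hard-coding it, would let the value follow whatever taxonomy ships with that model.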

albullington commented 3 years ago

Making sure I'm understanding this correctly... does the first checkbox apply to Seek? We're receiving offline predictions in the form of a single taxonomic branch rather than comparing across branches the way we do for online vision, so any result we get back including humans should look something like:

Seek would either show Human (if the score is above the 0.7 threshold), show an ancestor like Animalia, or show unknown. Should we avoid showing any ancestors if the genus Homo is in the predictions but below the 0.7 threshold?
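To make the question concrete, here is a sketch of the display rule as I understand it, with the proposed change behind a flag. The `BranchNode` shape, the function name, and the `suppressUncertainHuman` flag are hypothetical; only the 0.7 threshold comes from the description above.

```typescript
// One node of the single branch returned offline, ordered root to leaf.
// Hypothetical shape; the real prediction objects may carry more fields.
interface BranchNode {
  name: string;
  score: number;
}

// Picks the deepest node whose score meets the threshold, or undefined for
// "unknown". With `suppressUncertainHuman` set, a branch that contains human
// below the threshold yields nothing at all, not even an ancestor.
function pickDisplayTaxon(
  branch: BranchNode[],
  threshold = 0.7,
  humanName = "Homo",
  suppressUncertainHuman = true,
): BranchNode | undefined {
  const human = branch.find((n) => n.name === humanName);
  if (suppressUncertainHuman && human !== undefined && human.score < threshold) {
    return undefined;
  }
  let picked: BranchNode | undefined;
  for (const node of branch) {
    if (node.score >= threshold) {
      picked = node; // later (deeper) nodes overwrite earlier ones
    }
  }
  return picked;
}
```

With the flag off, a branch like Animalia 0.95, Mammalia 0.8, Homo 0.5 shows Mammalia; with it on, the same branch shows nothing.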

kueda commented 3 years ago

For the record, I don't think react-native-inat-camera can mimic what we're doing in the API since the approach to classifying an image is so different (rnic returns a "best" branch by starting at the root of the tree and always choosing the child with the highest score, while inatapi returns top results from anywhere in the tree as well as a "common ancestor" taxon that contains the highest-scoring results).
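A toy sketch of the difference, assuming a scored tree where each node lists its children. `TaxonNode`, `bestBranch`, and `topLeaves` are made-up names, and ranking leaves only is a simplification of what inatapi does; this is not code from either project.

```typescript
// Hypothetical scored taxonomy node.
interface TaxonNode {
  name: string;
  score: number;
  children: TaxonNode[];
}

// rnic-style: start at the root and always follow the highest-scoring child.
function bestBranch(root: TaxonNode): TaxonNode[] {
  const branch: TaxonNode[] = [root];
  let current = root;
  while (current.children.length > 0) {
    current = current.children.reduce((best, c) => (c.score > best.score ? c : best));
    branch.push(current);
  }
  return branch;
}

// API-style (simplified): rank leaves from anywhere in the tree by score.
function topLeaves(root: TaxonNode, n: number): TaxonNode[] {
  const leaves: TaxonNode[] = [];
  const stack: TaxonNode[] = [root];
  while (stack.length > 0) {
    const node = stack.pop()!;
    if (node.children.length === 0) {
      leaves.push(node);
    } else {
      stack.push(...node.children);
    }
  }
  return leaves.sort((a, b) => b.score - a.score).slice(0, n);
}

// Scores are invented for illustration.
const tree: TaxonNode = {
  name: "Animalia",
  score: 1.0,
  children: [
    { name: "Mammalia", score: 0.48, children: [{ name: "Homo", score: 0.45, children: [] }] },
    { name: "Actinopterygii", score: 0.52, children: [{ name: "Salmo", score: 0.5, children: [] }] },
  ],
};
```

Here `bestBranch(tree)` walks Animalia, Actinopterygii, Salmo and never visits Homo, while `topLeaves(tree, 2)` returns Salmo and Homo side by side. That is why a close-second human is invisible to the greedy descent.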

So I guess at this point this is more a request for proposals for how rnic might try to achieve the same goal of returning only human or nothing when the photo might be of a human. When human is the top-scoring result I think it will currently work fine: we'll just return a branch that contains human (maybe the client will want to exclude higher ranks in that case). However, when human is a close second or close third or something, we should probably return nothing. Some ideas for handling that:

  1. If human is among the top X results and its score is Y less than the next-best result, show no results. FWIW, we tried this in the API and it wasn't great, b/c pics where a non-human was the subject but a human was also in the photo often resulted in situations like this, so we'd end up showing no suggestions for common situations like a person holding a fish they just caught or a hand holding a flower.
  2. Calculate the scores for the branch ending in human and compare that with the “best” branch, and if the leaf of the best branch has a score within X of the equivalent rank on the human branch, return nothing. Has similar problems to the first idea, though maybe a bit different.
  3. Return the human branch as well as the best branch and let the client decide what to show. Kind of a punt, but maybe provides more flexibility to the client.
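Ideas 2 and 3 could be combined roughly as follows: return both branches, plus a flag saying whether the best leaf is within X of the human branch at the same rank. This is only a sketch; `RankedNode`, `BranchDecision`, the numeric `rankLevel` field, and the default `x` are assumptions, and it presumes both branches have already been computed natively.

```typescript
// Hypothetical node with a numeric rank so the two branches can be aligned.
interface RankedNode {
  name: string;
  rankLevel: number;
  score: number;
}

interface BranchDecision {
  best: RankedNode[];
  human: RankedNode[];
  // True when the client should probably show nothing (idea 2).
  suppress: boolean;
}

// Branches are ordered root to leaf. `x` is an illustrative, untuned margin.
function compareWithHumanBranch(
  best: RankedNode[],
  human: RankedNode[],
  x = 0.1,
): BranchDecision {
  const leaf = best[best.length - 1];
  const humanAtRank = leaf
    ? human.find((n) => n.rankLevel === leaf.rankLevel)
    : undefined;
  // If the best branch already is the human branch, there is nothing to hide.
  const sameTaxon =
    leaf !== undefined && humanAtRank !== undefined && humanAtRank.name === leaf.name;
  const suppress =
    leaf !== undefined &&
    humanAtRank !== undefined &&
    !sameTaxon &&
    Math.abs(leaf.score - humanAtRank.score) <= x;
  return { best, human, suppress };
}
```

A client that wants the idea 3 behavior can ignore `suppress` and apply its own rule to the two branches.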