responsible-ai-collaborative / aiid

The AI Incident Database seeks to identify, define, and catalog artificial intelligence incidents.
https://incidentdatabase.ai

Label recommendation for GMF annotation #2766

Open npit opened 4 months ago

npit commented 4 months ago

I think a helpful feature for speeding up GMF annotation would be a recommendation utility, simulating something like the workflow proposed in the first visualization on the description page: that is, recommending likely labels for a partially annotated incident, based on historical incidents already annotated with a subset of the labels applied so far.

The approach could use a similarity measure (e.g. Jaccard?) to map labels to sorted recommendations. For example, given applied goal G, show the most likely methods; given applied method M, fetch the most likely failures, etc. To improve latency, these could be precomputed at fixed intervals rather than on the fly. Ideally, the keys of these mappings would be labelsets of existing annotations, rather than individual labels.
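To make the idea concrete, here is a rough sketch of the single-label variant, assuming Jaccard similarity is computed over the sets of incidents in which two labels occur. The `HISTORY` data, the category names, and the functions `build_index` and `recommend` are all hypothetical and not taken from the AIID codebase; the index could be rebuilt periodically instead of on each request.

```python
from collections import defaultdict

# Hypothetical annotated incidents: incident id -> {category: set of labels}.
HISTORY = {
    1: {"goal": {"Autonomous Driving"},
        "method": {"Image Segmentation", "Visual Object Detection"}},
    2: {"goal": {"Autonomous Driving"},
        "method": {"Visual Object Detection"}},
    3: {"goal": {"Chatbot"}, "method": {"Language Modeling"}},
}


def build_index(history):
    """Map each (category, label) pair to the set of incident ids carrying it."""
    index = defaultdict(set)
    for incident_id, annotation in history.items():
        for category, labels in annotation.items():
            for label in labels:
                index[(category, label)].add(incident_id)
    return index


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(index, applied, target_category, k=3, threshold=0.1):
    """Return up to k (label, score) pairs from target_category, best first.

    `applied` is a (category, label) pair already set on the incident.
    Labels scoring below `threshold` are dropped.
    """
    applied_incidents = index.get(applied, set())
    scored = [
        (label, jaccard(applied_incidents, incidents))
        for (category, label), incidents in index.items()
        if category == target_category
    ]
    scored = [(label, score) for label, score in scored if score >= threshold]
    # Sort by descending score, breaking ties alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


index = build_index(HISTORY)
print(recommend(index, ("goal", "Autonomous Driving"), "method"))
```

Keying the index on whole labelsets instead of single labels would follow the same pattern, with a frozenset of applied labels as the key.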

The resulting top-k similarity-thresholded recommendations (e.g. methods given goal G) could then be autofilled, displayed when hovering over the corresponding applied goal label, highlighted and placed as the first k elements in the dropdown under the methods annotation box, etc.
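As an illustration of the dropdown option only, a minimal sketch of the reordering could look like the following. The function `order_dropdown` and its arguments are hypothetical; the actual annotation UI may structure its options differently.

```python
def order_dropdown(options, recommended, k=3):
    """Place the top-k recommended labels first, keeping the rest in order.

    `options` is the full list of labels in the dropdown and `recommended`
    is a list of labels sorted by decreasing score. Returns the labels to
    highlight and the reordered option list.
    """
    # Ignore recommendations that are not valid dropdown options.
    top = [label for label in recommended if label in options][:k]
    rest = [label for label in options if label not in top]
    return top, top + rest


highlighted, ordered = order_dropdown(["A", "B", "C", "D"], ["C", "X", "A"], k=2)
print(highlighted, ordered)
```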

If you think something like this can be integrated into the annotation UI, I can work on the recommendation approach and evaluate it offline prior to subsequent steps.

npit commented 8 hours ago

Giving this a bump to suggest a simplified version. Some GMF goals pretty much determine a 1-1 correspondence with methods & technologies. For example, "Autonomous Driving" goals will involve the same handful of technologies 99% of the time (image segmentation, visual object detection, etc.). Similarly for "Chatbot", "AI Voice Assistant", and a few others.

A baby step towards a recommender like the one outlined in the original post would be to set up a barebones system that works with a predefined mapping.

For example, given a mapping JSON:

{
  "goal1": {
    "methods": {"known": ["method 1", ...], "potential": ["method 2", ...]},
    "failures": {...}
  },
  "goal2": {
    ...
  }
}

Then, when the annotator fills in goal1 for an incident, the "recommender" autofills any matching methods / failures in the annotation form, as specified in the mapping. It would be extra helpful if it also transferred the goal snippet(s) to the autofilled labels, since in some incidents the goal snippet is useful for subsequent categories.
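A barebones version of this could be sketched as below. Several things here are assumptions rather than anything decided: that "known" labels are autofilled while "potential" labels are only surfaced as suggestions, that the form is a dict of category to a list of label/snippet entries, and the names `MAPPING` and `autofill`. The mapping contents are placeholders.

```python
# Hypothetical predefined mapping, mirroring the JSON structure above.
MAPPING = {
    "Autonomous Driving": {
        "methods": {
            "known": ["Visual Object Detection"],
            "potential": ["Image Segmentation"],
        },
        "failures": {"known": [], "potential": []},
    },
}


def autofill(form, goal, goal_snippet, mapping):
    """Autofill 'known' labels for `goal` and return 'potential' suggestions.

    `form` maps a category name to a list of {"label", "snippet"} entries and
    is modified in place. Each autofilled label inherits the goal snippet.
    """
    suggestions = {}
    for category, groups in mapping.get(goal, {}).items():
        entries = form.setdefault(category, [])
        existing = {entry["label"] for entry in entries}
        for label in groups.get("known", []):
            if label not in existing:  # never duplicate annotator input
                entries.append({"label": label, "snippet": goal_snippet})
                existing.add(label)
        suggestions[category] = [
            label for label in groups.get("potential", []) if label not in existing
        ]
    return suggestions


snippet = "the car failed to brake"
form = {"goals": [{"label": "Autonomous Driving", "snippet": snippet}]}
print(autofill(form, "Autonomous Driving", snippet, MAPPING))
print(form["methods"])
```

Labels the annotator has already entered are left untouched, so the autofill only ever adds to the form.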