clamsproject / mmif

MultiMedia Interchange Format
Apache License 2.0

general properties for all `Annotation` subtypes in vocab to represent classification tasks and results #218

Closed keighrim closed 4 months ago

keighrim commented 5 months ago

New Feature Summary

When we first used TimePoint to encode image-level classification (https://github.com/clamsproject/app-swt-detection/issues/41), we had to come up with a somewhat arbitrary stopgap solution. This thread is to discuss formally adopting such a representation into the vocab definitions so it can be applied more widely.

Proposal

We add the following properties to the Region (super-)type:

  1. labelset: defines the label (or tag) set.
    • as a "string": a URI pointing to an externally defined set
    • as a "list of strings": a simple list of all possible label values
    • as a "map from string to string": keys are the possible labels, values are their descriptions
  2. classifications: records the classification results, as a "map from string to real", where keys are labels and values are their scores
  3. label (or tag): the top-1 label from the classification results (see the sketch after this list)
    • as a "string": just the label
    • as a singleton "map from string to real": the top-1 label and its score
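
To make the three properties concrete, here is a minimal sketch of how they might sit together on a single TimePoint annotation; the property names follow the proposal above, the score values are made up, and new_annotation is just the usual mmif-python call for adding a typed annotation to a view.

# hypothetical values for illustration; `v` is a View from mmif.new_view()
tp = v.new_annotation(
    AnnotationTypes.TimePoint,
    timePoint=150000,
    labelset=['slate', 'chyron', 'credits'],                           # 1. possible labels
    classifications={'slate': 0.71, 'chyron': 0.22, 'credits': 0.07},  # 2. full score map
    label='slate',                                                     # 3. top-1 label
)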

Any subtype annotation that "anchors" to some part of the raw source data can be the target of a classification task. SWT does image classification, but an app could just as well run object classification on BoundingBoxes or POS tagging on Tokens, so I think this is a natural expansion from SWT to regional annotations in general.

[!NOTE] At the moment, LAPPS vocab items are not officially part of the CLAMS vocab (https://github.com/clamsproject/mmif/issues/202), so we can't apply this to text spans, which feels a bit unfortunate to me.

Example

At the moment, in the SWT app (v2 and v3), we use the "one-of" specification to iterate over all possible labels for TimeFrame annotations. For example:

https://github.com/clamsproject/app-swt-detection/blob/61aba225c1502af1ebf6053fc1b86d6d45a2db82/metadata.py#L29-L35
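
For readers without the linked file at hand, that style amounts to enumerating one output specification per label. The sketch below is not a verbatim copy of the linked lines; the exact labels and the frameType property name are assumptions for illustration.

# rough sketch of the current per-label ("one-of") style, one output per label
for label in ('slate', 'chyron', 'credits'):
    metadata.add_output(AnnotationTypes.TimeFrame, frameType=label, timeUnit='ms')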

Instead, with the proposed properties, we can do:

# in metadata.py
...
metadata.add_input(DocumentTypes.VideoDocument, required=True)
metadata.add_output(AnnotationTypes.TimeFrame, labelset=['slate', 'chyron', 'credits'], timeUnit='ms')
# we don't specify the `classifications` and `label` properties here because their values are not fixed
...

# then in app.py
...
class SWTApp(ClamsApp):
    ...
    def _annotate(...):
        ...
        v = mmif.new_view()
        # use `contains` metadata to "factor out" the common `labelset` property
        v.new_contain("TimeFrame", labelset='slate chyron credits'.split(), timeUnit='ms', document=a_doc_id)
        ...
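
On the consumer side, a downstream app or reader would then recover labels and scores from the classifications map. A minimal sketch, assuming the property names land as proposed and using existing mmif-python accessors (Mmif, get_views_contain, get_annotations, get_property):

from mmif import Mmif
from mmif.vocabulary import AnnotationTypes

mmif_obj = Mmif(serialized_mmif)  # `serialized_mmif`: a MMIF JSON string produced by the app
for view in mmif_obj.get_views_contain(AnnotationTypes.TimeFrame):
    for ann in view.get_annotations(AnnotationTypes.TimeFrame):
        scores = ann.get_property('classifications')   # proposed: map from label to score
        if scores:
            top_label = max(scores, key=scores.get)     # top-1 label from the score map
        else:
            top_label = ann.get_property('label')       # or read the proposed `label` directly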

Related

Alternatives

No response

Additional context

No response

marcverhagen commented 5 months ago

A few thoughts.

Once we update the stopgap classification solution to use dictionaries, it is not so much of a stopgap anymore, but the issue of it being outside the vocabulary, and therefore somewhat unsatisfactory, remains.

Using

metadata.add_output(AnnotationTypes.TimeFrame, labelset=['slate', 'chyron', 'credits'])

seems much better to me than adding an output for each label, which gets ugly pretty quickly.

Classifications are not just relevant to Regions, they can be relevant to Relations as well, so we should probably have this defined on Annotation.
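
If the properties were hoisted up to Annotation as suggested, a relation-side use might look like the sketch below; the choice of the Alignment type, the label values, and the scores are entirely made up for illustration.

# purely hypothetical: classification scores on a Relation-subtype annotation
v.new_annotation(
    AnnotationTypes.Alignment,
    source='v1:tp1', target='v2:tf1',
    classifications={'exact': 0.8, 'partial': 0.2},
    label='exact',
)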

LAPPS never did define anything for classification scores; I vaguely remember we shoved in confidence scores as needed. Having said that, while I am not wild about the idea of copying LAPPS types into CLAMS, this does seem like a reason to consider it.

We do have an option to create a Classification type as a sibling to Region and Relation.

keighrim commented 4 months ago

Started a PR to implement the proposal. A test deployment of vocab 1.0.2 is available at http://eldrad.cs-i.brandeis.edu:4000/1.0.2/vocabulary/ (VPN required).

keighrim commented 4 months ago

The HEAD of the PR (and the test deployment on eldrad) has been updated based on our discussion yesterday. Some example screenshots: [screenshots of the updated vocabulary pages]

@marcverhagen let me know what you think.