Closed by keighrim 4 months ago
A few thoughts.
Once we update the stopgap classification solution to use dictionaries, it is not much of a stopgap anymore, but it still sits outside the vocabulary, which remains somewhat unsatisfactory.
Using
metadata.add_output(AnnotationTypes.TimeFrame, labelset=['slate', 'chyron', 'credits'])
seems much better to me than adding an output for each label, which gets ugly pretty quickly.
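To make the contrast concrete, here is a minimal sketch of the two ways of declaring outputs. It does not use the real SDK: AnnotationTypes and SketchMetadata below are hypothetical stand-ins that only record what was declared, and the frameType property name used for the per-label variant is an assumption for illustration.

```python
from dataclasses import dataclass, field


class AnnotationTypes:
    # Stand-in for the vocabulary type; the string value is a placeholder.
    TimeFrame = "TimeFrame"


@dataclass
class SketchMetadata:
    """Hypothetical stand-in for app metadata; it only records declared outputs."""
    outputs: list = field(default_factory=list)

    def add_output(self, at_type, **properties):
        self.outputs.append({"@type": at_type, "properties": properties})


labels = ["slate", "chyron", "credits"]

# One output per label: the declaration grows with the label set.
# (frameType is an assumed property name, used here only for illustration.)
per_label = SketchMetadata()
for label in labels:
    per_label.add_output(AnnotationTypes.TimeFrame, frameType=label)

# One output carrying the whole label set.
with_labelset = SketchMetadata()
with_labelset.add_output(AnnotationTypes.TimeFrame, labelset=labels)

print(len(per_label.outputs), len(with_labelset.outputs))
```

With three labels the difference is small, but the per-label form adds one declaration for every new label, while the labelset form stays at a single declaration.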
Classifications are relevant not just to Regions but to Relations as well, so we should probably have this defined on Annotation.
LAPPS never did define anything for classification scores; I vaguely remember we shoved in confidence scores as needed. Having said that, while I am not wild about the idea of copying LAPPS types into CLAMS, this does seem like a reason to consider it.
We do have an option to create a Classification type as a sibling to Region and Relation.
Started a PR to implement the proposal. A test deployment of vocab 1.0.2 is available at http://eldrad.cs-i.brandeis.edu:4000/1.0.2/vocabulary/ (VPN required).
The HEAD of the PR (and the test deployment on eldrad) is updated based on our discussion yesterday.
Some example screenshots: [images not preserved]
@marcverhagen let me know what you think.
New Feature Summary
When we first used TimePoint to encode image-level classification (https://github.com/clamsproject/app-swt-detection/issues/41), we had to come up with a somewhat arbitrary stopgap solution. This thread is to discuss formally adopting that representation into the vocab definitions so that it can be applied at a larger scale.
Proposal
We add the following properties to the Region (super-)type:
- labelset: defines the label or tag set.
- classifications: records the classification results as a "map from string to real", where keys are labels and values are the scores.
- label (or tag): the top-1 label from the classification results.
All the subtype annotations that are "anchors" to some part of the raw source data can be the target of a classification task. SWT does image classification, but an app could run object classification on BoundingBoxes, or POS tag classification on Tokens. So I think it's a natural expansion from SWT to general regional annotations.
Example
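Before turning to SWT, the three proposed properties can be sketched as a plain data structure. This is only an illustration of how labelset, classifications, and label relate to each other. The ClassifiedRegion class is hypothetical, the scores are made up, and checking that every scored label belongs to labelset is an assumption rather than something the proposal requires.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ClassifiedRegion:
    """Hypothetical container mirroring the three proposed properties."""
    labelset: List[str]
    # "map from string to real": label -> score
    classifications: Dict[str, float] = field(default_factory=dict)

    def __post_init__(self):
        # Assumed constraint: scored labels must come from the declared labelset.
        unknown = set(self.classifications) - set(self.labelset)
        if unknown:
            raise ValueError(f"labels not in labelset: {sorted(unknown)}")

    @property
    def label(self) -> Optional[str]:
        # Top-1 label: the key with the highest score, or None if nothing was scored.
        if not self.classifications:
            return None
        return max(self.classifications, key=self.classifications.get)


frame = ClassifiedRegion(
    labelset=["slate", "chyron", "credits"],
    classifications={"slate": 0.82, "chyron": 0.11, "credits": 0.07},
)
print(frame.label)  # slate
```

Deriving label from classifications keeps the two from drifting apart, though an app could just as well store label explicitly when it only reports a top-1 result.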
At the moment, in the SWT app (v2 and v3), we can use a "one-of" specification to iterate over all possible labels for TimeFrame annotations. For example: https://github.com/clamsproject/app-swt-detection/blob/61aba225c1502af1ebf6053fc1b86d6d45a2db82/metadata.py#L29-L35
Instead, with the proposed properties, we can declare a single TimeFrame output that carries the whole labelset.