clamsproject / app-swt-detection

CLAMS app for detecting scenes with text from video input
Apache License 2.0

"raw" labelset from the current SR annotation is overpopulated #102

Closed keighrim closed 4 months ago

keighrim commented 5 months ago

Bug Description

When we run the app, it returns classification results for each timepoint over 22 labels, for example:

              "B": 1.2004544025501218e-08,
              "S": 0.9911466240882874,
              "S:H": 4.395981408750194e-12,
              "S:C": 7.746329619418013e-12,
              "S:D": 3.075464145504969e-12,
              "S:B": 1.7393599581472241e-12,
              "S:G": 8.484618389814624e-12,
              "W": 0.0018126036738976836,
              "L": 8.085626177489758e-06,
              "O": 0.004137764219194651,
              "M": 2.2342723241308704e-05,
              "I": 0.0005785876419395208,
              "N": 7.963440293679014e-05,
              "E": 5.8202545005769935e-06,
              "P": 1.7434373376090662e-06,
              "Y": 3.64386693263441e-07,
              "K": 4.580545919452561e-06,
              "G": 0.0019418805604800582,
              "T": 4.255012754583731e-05,
              "F": 8.692211395100458e-07,
              "C": 3.8110854802653193e-05,
              "R": 0.0001566969440318644,
              "NEG": 2.1870659111300483e-05

But the subtype-suffixed labels (S:H, S:C, S:D, S:B, S:G) are never actually used when reading the annotations for training:

https://github.com/clamsproject/app-swt-detection/blob/5925f029c6f13446a78e7144ae30f146354186a2/modeling/train.py#L136

(Subtype labels are stored as labels['frames'][i]['subtype_label'] in the preprocessed annotation metadata file.)

They are nevertheless included in the softmax output space:

https://github.com/clamsproject/app-swt-detection/blob/5925f029c6f13446a78e7144ae30f146354186a2/modeling/train.py#L76-L80

and

https://github.com/clamsproject/app-swt-detection/blob/e6662c46aa19e4540eccf42c83ecbf07cf9397be/modeling/__init__.py#L5-L6

We need to fix the code so that either the subtypes are not included in the softmax layer, or the subtypes are actually used in the training examples. (In the latter case, we should take out the S label instead.)
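
As a rough illustration of the first option, the sketch below drops the subtype-suffixed labels from a label list before that list is used to size the softmax layer. The label list is copied from the example output above; the helper name prune_subtypes and the use of ":" to detect subtypes are assumptions made for illustration, not the code in modeling/.

```python
# Labels as they appear in the example output above (order preserved).
RAW_LABELS = [
    "B", "S", "S:H", "S:C", "S:D", "S:B", "S:G", "W", "L", "O", "M", "I",
    "N", "E", "P", "Y", "K", "G", "T", "F", "C", "R", "NEG",
]


def prune_subtypes(labels):
    """Return the labels without subtype-suffixed entries (hypothetical helper).

    Assumes a subtype is marked by a ':' in the label, as in 'S:H'.
    """
    return [label for label in labels if ":" not in label]


pruned = prune_subtypes(RAW_LABELS)
removed = [label for label in RAW_LABELS if label not in pruned]

# The softmax dimension would then follow the pruned list.
print("removed:", removed)
print("softmax dimension shrinks by", len(RAW_LABELS) - len(pruned))
```

Any model trained with the larger label list would still have the old output dimension, so a change like this only takes effect once a model is retrained against the pruned list.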

Reproduction steps

Run SWT on a video and confirm that the labelset in the TimePoint annotation objects is a 22-way classification.
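
A minimal sketch of that check, assuming the classification scores of one TimePoint have already been extracted into a plain dict (how to read them from the app output is not shown here). The function name summarize_labelset is hypothetical, and the scores are an abbreviated copy of the example above.

```python
def summarize_labelset(classification):
    """Summarize one score dict (hypothetical helper).

    Returns the number of labels, the subtype-suffixed labels (those
    containing ':'), and the total probability assigned to them.
    """
    subtypes = sorted(key for key in classification if ":" in key)
    subtype_mass = sum(classification[key] for key in subtypes)
    return len(classification), subtypes, subtype_mass


# Abbreviated scores taken from the example in the bug description.
example = {
    "B": 1.2004544025501218e-08,
    "S": 0.9911466240882874,
    "S:H": 4.395981408750194e-12,
    "S:C": 7.746329619418013e-12,
    "NEG": 2.1870659111300483e-05,
}

n_labels, subtypes, subtype_mass = summarize_labelset(example)
print(n_labels, subtypes, subtype_mass)
```

If the subtype labels are present in the dict but their combined probability stays negligible across timepoints, that is consistent with those outputs never having been seen as training targets.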

Additional context

(full label definition is in #1)

keighrim commented 5 months ago

This shouldn't have been closed, since the PR didn't include a new model with a smaller softmax dimension.