clamsproject / app-text-slicer

adding a different running mode to grab text between two time frames #2

Open keighrim opened 4 days ago

keighrim commented 4 days ago

New Feature Summary

Currently, the app slices text based on the (start, end) pairs found in time frame annotations. It would also be useful if the app could segment a video between two consecutive occurrences of a given type of time frame, for example to get the segment between two "title cards" from SWT.

Implementation-wise, it'd be something like

# in addition to doing this
if runmode == REGULAR:
    for tf in get_timeframes():
        slice(full_text, tf.get('start'), tf.get('end'))

# add this
elif runmode == BUMPER_TO_BUMPER:
    tfs = list(get_timeframes())
    for tf1, tf2 in zip(tfs[:-1], tfs[1:]):
        slice(full_text, tf1.get('start'), tf2.get('start'))
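
A minimal runnable sketch of the two modes, under stated assumptions: tokens are modeled as (word, start time) tuples and time frames as plain dicts with 'start' and 'end' keys. The names RunMode, slice_tokens, and slice_text are hypothetical and are not part of the app or of any CLAMS API.

```python
from enum import Enum
from typing import Dict, List, Tuple

Token = Tuple[str, int]  # (word, start time); a stand-in for ASR tokens


class RunMode(Enum):
    REGULAR = "regular"
    BUMPER_TO_BUMPER = "bumper_to_bumper"


def slice_tokens(tokens: List[Token], start: int, end: int) -> str:
    """Join the words whose start time falls in the half-open range [start, end)."""
    return " ".join(word for word, t in tokens if start <= t < end)


def slice_text(tokens: List[Token], timeframes: List[Dict[str, int]],
               mode: RunMode) -> List[str]:
    """Return one text slice per time frame (REGULAR) or per consecutive pair (BUMPER_TO_BUMPER)."""
    tfs = sorted(timeframes, key=lambda tf: tf["start"])
    if mode is RunMode.REGULAR:
        # existing behavior: text inside each time frame
        return [slice_tokens(tokens, tf["start"], tf["end"]) for tf in tfs]
    if mode is RunMode.BUMPER_TO_BUMPER:
        # proposed behavior: text from the start of one frame to the start of the next
        return [slice_tokens(tokens, a["start"], b["start"])
                for a, b in zip(tfs, tfs[1:])]
    raise ValueError(f"unknown run mode: {mode}")
```

With n time frames, the bumper-to-bumper mode yields n - 1 slices, so a single time frame produces no output at all.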

Related

No response

Alternatives

No response

Additional context

No response

keighrim commented 4 days ago

Follow-up question: which TimeFrame should we align the new document to?

bohJiang12 commented 4 days ago

> Follow-up question: which TimeFrame should we align the new document to?

Under the BUMPER_TO_BUMPER mode, can we do

bohJiang12 commented 4 days ago

I experimented with this feature in the following steps:

  1. Find all TimeFrames within the SWT view.
  2. Organize them as a dict like {label: list(TimeFrame)}.
  3. For each label, do the slicing of the suggested feature.
  4. Summarize the results as the number of tokens between two consecutive TimeFrames with the same label, as follows:
LABEL            NUM_OF_TOKENS
---------------  -------------
bars:            [1]
slate:           [1]
other_opening:   [307, 75, 183, 8888, 684, 11204]
chyron:          [2992, 2134, 4259, 1266, 6715, 1372, 126]
other_text:      [2800, 3320, 177, 200, 132, 3068, 4159, 1016, 1153, 1115, 533]
credits:         [0]

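
The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the code that produced the numbers: time frames are plain dicts with 'label' and 'start' keys, tokens are (word, start time) tuples, and the helper names group_by_label and token_counts_per_label are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Token = Tuple[str, int]  # (word, start time); a stand-in for ASR tokens


def group_by_label(timeframes: List[dict]) -> Dict[str, List[dict]]:
    """Step 2: organize time frames as {label: [time frames in temporal order]}."""
    groups = defaultdict(list)
    for tf in sorted(timeframes, key=lambda tf: tf["start"]):
        groups[tf["label"]].append(tf)
    return dict(groups)


def token_counts_per_label(tokens: List[Token],
                           timeframes: List[dict]) -> Dict[str, List[int]]:
    """Steps 3-4: count tokens between consecutive time frames sharing a label."""
    counts = {}
    for label, tfs in group_by_label(timeframes).items():
        counts[label] = [
            sum(1 for _, t in tokens if a["start"] <= t < b["start"])
            for a, b in zip(tfs, tfs[1:])
        ]
    return counts
```

In this sketch a label that occurs only once yields an empty list, which differs from the single-entry rows shown in the table.
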
keighrim commented 2 days ago

Yeah, the numbers, for chyrons for example, look much more suitable for "summarization"-type downstream apps. For bars, slate, and credits, zero tokens are expected, and the single-token documents for bars and slate are probably ASR errors?

bohJiang12 commented 2 days ago

> Yeah, the numbers, for chyrons for example, look much more suitable for "summarization"-type downstream apps. For bars, slate, and credits, zero tokens are expected, and the single-token documents for bars and slate are probably ASR errors?

That single token is simply the empty string "", because each value in the dict is a list of lists of tokens.

bohJiang12 commented 2 days ago

Also, are we going to let users choose the running mode? For example, for now, users could choose between BUMPER_TO_BUMPER and REGULAR slicing.

keighrim commented 2 days ago

Yup, I think the mode picker can be exposed as a runtime parameter.
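
As a rough illustration of a mode picker with a fixed set of choices and a default, here is a sketch using Python's standard argparse as a stand-in. The real app would presumably declare the parameter through its own app metadata instead, and the flag name --runmode and the value strings are assumptions made for this example.

```python
import argparse

# Hypothetical value strings for the two modes discussed in this thread.
RUN_MODES = ("regular", "bumper_to_bumper")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the run mode as a user-selectable option."""
    parser = argparse.ArgumentParser(description="text slicer run mode (sketch)")
    parser.add_argument(
        "--runmode",
        choices=RUN_MODES,
        default="regular",  # keep the existing behavior unless asked otherwise
        help="how to pick slice boundaries from time frames",
    )
    return parser
```

Defaulting to the regular mode keeps existing pipelines unchanged when the parameter is omitted.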

bohJiang12 commented 2 days ago

> Follow-up question: which TimeFrame should we align the new document to?

Now I realize this is a little tricky. Since the slicer would use Tokens from whisper, then along the timeline in the whisper view, any pair of TimeFrames with the same label could enclose several (or many) TimeFrames of other types, e.g. [chyron, slate, bars, bars, credits, ... , chyron].
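
To make the interleaving concrete, here is a small sketch that lists which other time frames sit between each consecutive pair with a given label. It assumes time frames are plain dicts with 'label' and 'start' keys, and the function name labels_between is hypothetical.

```python
from typing import List


def labels_between(timeline: List[dict], label: str) -> List[List[str]]:
    """For each consecutive pair of `label` time frames, list the labels enclosed between them."""
    ordered = sorted(timeline, key=lambda tf: tf["start"])
    # indices of the time frames carrying the label of interest
    positions = [i for i, tf in enumerate(ordered) if tf["label"] == label]
    return [
        [tf["label"] for tf in ordered[i + 1:j]]
        for i, j in zip(positions, positions[1:])
    ]
```

For a timeline ordered as chyron, slate, bars, bars, credits, chyron, the single chyron pair encloses four other time frames, so a slice aligned to that pair also spans all of them.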

One possible solution that wouldn't hurt downstream apps of text-slicer is: