w3c / webvtt

WebVTT Standard
https://w3c.github.io/webvtt/

WebVTT Subtitle Video "Multiple Annotations", Customized language, and Multiple Events #486

Closed njss closed 4 years ago

njss commented 4 years ago

I am deeply interested in using VTT subtitles to annotate videos from experiments. Some of the main functionalities needed are:

I did a small test by creating a VTT player in HTML5 and capturing the events, and it seems to work. However, I would like to be sure of the best way to address this use case. I strongly believe that VTT would make a great contribution here, for example in developing a learning system that can also make use of real-time human behaviour sensing.

Apologies if this is not the best place to ask these questions. Any comment will be highly appreciated!

Thank you,

Nelson

fsoder commented 4 years ago

Naming "the best way(TM)" to do something tends to be a tall order, but here are some suggestions/remarks. (I'm assuming an HTML environment here but I don't think it should be too difficult to translate to something else.)

It sounds like what you may have/want/need is a "metadata" track. The cue payload can essentially be anything, so if you have a "DSL" for defining the events, you can encode it and store it as the cue payload, leaving timing to the cue. That the payload is not a natural language doesn't really matter. How you want to encode actions is really up to you: one or multiple events/actions in one cue, etc.
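As a minimal sketch of the encoding idea, assuming JSON as the "DSL" (the thread does not prescribe any format): the helper below writes one cue per annotation window, with the actions serialized as a single-line payload. The names `AnnotationAction`, `AnnotationWindow`, `formatTimestamp`, `buildMetadataVtt` and `decodePayload` are hypothetical, chosen only for illustration.

```typescript
// Hypothetical shapes for the annotations; adapt them to your own "DSL".
interface AnnotationAction { type: string; value: string; }
interface AnnotationWindow { start: number; end: number; actions: AnnotationAction[]; }

// Left-pad a non-negative integer with zeros.
function zeroPad(n: number, width: number): string {
  let digits = String(n);
  while (digits.length < width) digits = "0" + digits;
  return digits;
}

// Format seconds as a WebVTT timestamp (hh:mm:ss.ttt).
function formatTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return zeroPad(h, 2) + ":" + zeroPad(m, 2) + ":" + zeroPad(s, 2) + "." + zeroPad(ms, 3);
}

// Build the text of a WebVTT file whose cue payloads are JSON (assumed encoding).
function buildMetadataVtt(windows: AnnotationWindow[]): string {
  const blocks = windows.map((w) => {
    // JSON.stringify without indentation yields one line, so no blank line can
    // end the cue early; "-->" is rejected because it is not allowed in a payload.
    const payload = JSON.stringify(w.actions);
    if (payload.indexOf("-->") !== -1) {
      throw new Error('cue payload must not contain "-->"');
    }
    return formatTimestamp(w.start) + " --> " + formatTimestamp(w.end) + "\n" + payload;
  });
  return ["WEBVTT"].concat(blocks).join("\n\n") + "\n";
}

// Decode a cue's text (for example, cue.text from a metadata track) back into actions.
function decodePayload(cueText: string): AnnotationAction[] {
  return JSON.parse(cueText) as AnnotationAction[];
}
```

Putting several actions in one cue, as here, keeps simultaneous events together; one cue per action would work just as well and lets each action have its own timing.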

For the actual display you could dynamically create a track (addTextTrack) and populate it with (decoded/translated) cues on the fly, based on the events (cues beginning/ending) from the metadata track. (I'm not sure I understood whether you wanted real-time capture as well, but I suspect that then you'd just get streams of events from different sources.) Issues that could arise here are things like event delivery latency, which could vary depending on QoI (quality of implementation).
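A rough sketch of that flow, under stated assumptions: the small interfaces below are hypothetical stand-ins that mirror only the parts of `TextTrack` and `VTTCue` used here, so the logic can be exercised outside a browser. The `translate` and `makeCue` callbacks and the function name `mirrorMetadataToDisplay` are likewise illustrative, not part of any API.

```typescript
// Hypothetical stand-ins for the parts of VTTCue / TextTrack this sketch touches.
interface CueLike { startTime: number; endTime: number; text: string; }
interface SourceTrackLike {
  mode: string;
  activeCues: ArrayLike<CueLike> | null;
  addEventListener(type: string, listener: () => void): void;
}
interface DisplayTrackLike {
  mode: string;
  addCue(cue: CueLike): void;
}

// Whenever the metadata track's active cues change, decode each newly active cue
// and add a translated copy, with the same timing, to the display track.
function mirrorMetadataToDisplay(
  source: SourceTrackLike,
  display: DisplayTrackLike,
  translate: (payload: string) => string,
  makeCue: (start: number, end: number, text: string) => CueLike
): void {
  source.mode = "hidden";   // events are delivered, nothing is rendered
  display.mode = "showing"; // the generated cues are what the viewer sees
  const seen: { [key: string]: boolean } = {};
  source.addEventListener("cuechange", () => {
    const active = source.activeCues;
    if (!active) return;
    for (let i = 0; i < active.length; i++) {
      const cue = active[i];
      // Skip cues already mirrored, e.g. after the user seeks backwards.
      const key = cue.startTime + "|" + cue.endTime + "|" + cue.text;
      if (seen[key]) continue;
      seen[key] = true;
      display.addCue(makeCue(cue.startTime, cue.endTime, translate(cue.text)));
    }
  });
}

// In a browser this might be wired up roughly as follows (untested assumption;
// a cast may be needed to satisfy the stand-in interfaces):
//   const display = video.addTextTrack("subtitles", "Annotations", "en");
//   mirrorMetadataToDisplay(metadataTrack, display, decodeToText,
//     (s, e, t) => new VTTCue(s, e, t));
```

Because the display cue is only created once the source cue has become active, it can appear slightly late; that is the event delivery latency mentioned above. Populating the display track ahead of time from the metadata track's full cue list would avoid it, at the cost of losing the on-the-fly behaviour.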

njss commented 4 years ago

@fsoder Thank you so much! This is exactly what I am looking for... Just trying not to reinvent the wheel...on something that is already implemented! These are good new directions.

Best, Nelson

gkatsev commented 4 years ago

Going to echo @fsoder here. A track of kind "metadata" is what you're looking for. You can then set its mode to "hidden" so that you get the events without anything actually being displayed. Though, technically, you can do that with any track as well.
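For illustration, a small sketch of that setup. The interfaces are hypothetical stand-ins for the relevant parts of `TextTrack` and its cues, and `selectByKind` and `listenHidden` are made-up helper names, so treat this as one possible shape rather than the way to do it.

```typescript
// Hypothetical stand-ins for the parts of TextTrack and its cues used below.
interface PayloadCue { text: string; }
interface SilentTrack {
  kind: string;
  mode: string;
  activeCues: ArrayLike<PayloadCue> | null;
  addEventListener(type: string, listener: () => void): void;
}

// Pick the tracks of a given kind, e.g. "metadata".
function selectByKind(tracks: ArrayLike<SilentTrack>, kind: string): SilentTrack[] {
  const matches: SilentTrack[] = [];
  for (let i = 0; i < tracks.length; i++) {
    if (tracks[i].kind === kind) matches.push(tracks[i]);
  }
  return matches;
}

// Hide a track and report the payloads of its active cues on every change.
// This works for any kind of track, not only metadata tracks.
function listenHidden(track: SilentTrack, onActive: (payloads: string[]) => void): void {
  track.mode = "hidden"; // "disabled" would suppress the events as well
  track.addEventListener("cuechange", () => {
    const payloads: string[] = [];
    const active = track.activeCues;
    if (active) {
      for (let i = 0; i < active.length; i++) payloads.push(active[i].text);
    }
    onActive(payloads);
  });
}

// In a browser this might be used roughly as (untested assumption):
//   selectByKind(video.textTracks, "metadata")
//     .forEach((t) => listenHidden(t, (p) => console.log(p)));
```

An empty payload list in the callback means that no cue is active any more, which can serve as the "event ended" signal.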

njss commented 4 years ago

@gkatsev Thank you so much. I will try it now!

njss commented 4 years ago

@gkatsev Yes...it works like a charm! Thank you.

gkatsev commented 4 years ago

Glad to hear you got it working!