classtranscribe / FrontEnd

The React + Redux Frontend for ClassTranscribe
https://classtranscribe.illinois.edu

Captioning mismatch when videos have no dialogue #260

Open harsh183 opened 3 years ago

harsh183 commented 3 years ago

When a video has a long silent period (such as an intro), the captions start playing before the actual dialogue does. For about 30-40 s the caption text is correct but runs ahead of the audio, and at some point it corrects itself. For example, in the STAT 420 lecture uuid=5cf8d03c-88bb-4e32-95fb-075c90265bf4, the speaking starts at around 8 seconds but the captions start playing earlier.


I think I've also seen it in the middle of videos: after a silence, the rest of the captions are offset from then on as well.

I'm not sure if this has been brought up before, but on a quick glance through the issues on FrontEnd and WebAPI I couldn't find one related to this. I've reported synchronization issues to professors, but I'm not sure where that ended up. I've noticed this in several courses over the year (CS 241, 357, 418, STAT 420), and it was one of the main reasons I didn't use ClassTranscribe for the longest time.

angrave commented 3 years ago

This is likely a limitation of the automated captioning service but will need further research to confirm. The new functionality to edit captioning times (not yet in production) will allow course staff to at least manually edit this.

Other affected videos include those that start with a musical intro of a few seconds.
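
As a rough illustration of what a manual timing correction amounts to, here is a minimal sketch that delays every caption from a given point onward by a fixed offset. The `Caption` type and `shift_captions` function are hypothetical names made up for this example; they are not taken from the ClassTranscribe code, and the real editing feature may work differently.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Caption:
    """Hypothetical caption cue; times are in seconds."""
    begin: float
    end: float
    text: str


def shift_captions(captions: List[Caption], offset: float,
                   from_time: float = 0.0) -> List[Caption]:
    """Return a new list where every caption that begins at or after
    `from_time` is moved by `offset` seconds (positive means later).
    Times are clamped so they never go below zero."""
    shifted = []
    for cap in captions:
        if cap.begin >= from_time:
            cap = replace(
                cap,
                begin=max(0.0, cap.begin + offset),
                end=max(0.0, cap.end + offset),
            )
        shifted.append(cap)
    return shifted


# Example: captions that start 7.5 s too early because of a silent intro.
captions = [Caption(0.5, 3.0, "Hello"), Caption(3.0, 6.0, "world")]
fixed = shift_captions(captions, offset=7.5)
```

The `from_time` parameter covers the mid-video case: only captions after the silence are moved, and earlier ones are left alone.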

harsh183 commented 3 years ago

Is the automatic service used open source or proprietary, and if so can it be raised with them as well?

> will allow course staff to at least manually edit this

Can this be automatic though?

angrave commented 3 years ago

Ideally yes, it would be automatic. More research is needed to understand the problem.

angrave commented 3 years ago

We're using Azure. The service is a generic Speech to Text service, not a captioning service. We would need to isolate the problem, create a small example, and understand exactly what is happening and whether it is related to the Speech to Text service, i.e. a deeper understanding of all of this code under these conditions: https://github.com/classtranscribe/WebAPI/tree/staging/CTCommons/MSTranscription (see https://github.com/classtranscribe/WebAPI/blob/221ec2376114d8cc848168a0e0d4646fb4f7ce46/CTCommons/MSTranscription/MSTranscriptionService.cs#L54). Note that the code is complex: it includes the ability to restart captioning if the Azure service timed out halfway through.

Maybe it is this line - https://github.com/classtranscribe/WebAPI/blob/221ec2376114d8cc848168a0e0d4646fb4f7ce46/CTCommons/MSTranscription/MSTranscriptionService.cs#L157
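
One possible way to build the small example mentioned above is a synthetic clip with a known amount of leading silence, so the first caption's start time can be compared against the true speech onset. The sketch below is only an assumption about how such a check could look: `make_test_signal` and `leading_silence_seconds` are hypothetical helpers, the amplitude threshold is arbitrary, and nothing here calls the Azure service or the WebAPI code.

```python
import math
from typing import List, Sequence


def make_test_signal(silence_s: float, tone_s: float, sample_rate: int = 8000,
                     freq: float = 440.0, amplitude: int = 10000) -> List[int]:
    """Build PCM-like integer samples: `silence_s` seconds of zeros
    followed by `tone_s` seconds of a sine tone standing in for speech."""
    silence = [0] * int(silence_s * sample_rate)
    tone = [
        int(amplitude * math.sin(2 * math.pi * freq * n / sample_rate))
        for n in range(int(tone_s * sample_rate))
    ]
    return silence + tone


def leading_silence_seconds(samples: Sequence[int], sample_rate: int,
                            threshold: int = 500) -> float:
    """Return the time of the first sample whose absolute value exceeds
    `threshold`, or the full duration if the signal never gets that loud."""
    for i, sample in enumerate(samples):
        if abs(sample) > threshold:
            return i / sample_rate
    return len(samples) / sample_rate


# Example: 8 s of silence, as in the STAT 420 lecture described above.
signal = make_test_signal(silence_s=8.0, tone_s=1.0)
onset = leading_silence_seconds(signal, sample_rate=8000)

# If the first caption were reported at 0.5 s (a made-up value),
# the captions would lead the audio by roughly onset - 0.5 seconds.
caption_lead = onset - 0.5
```

Running the same clip with different silence lengths, including a silence in the middle, would show whether the reported caption times track the silence or ignore it.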

harsh183 commented 3 years ago

In terms of examples, I think any of the STAT 420 videos will do. For gaps in the middle, I remember some CS 418 lectures had it, so I'll link an ID if I come across one again.