iv-org / invidious

Invidious is an alternative front-end to YouTube
https://invidious.io
GNU Affero General Public License v3.0

[Feature request] Add support for transcripts #2564

Open SamantazFox opened 2 years ago

SamantazFox commented 2 years ago

Add support for transcripts

(screenshot)

Transcripts are accessed via the following InnerTube endpoint:

https://www.youtube.com/youtubei/v1/get_transcript?key=AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8

The video is selected via "params": <protobuf-encoded video ID>
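To make the "params" part more concrete, here is a rough sketch in Python of how a protobuf-encoded video ID could be built by hand and placed in a request body. This is only an illustration under assumptions: the exact message layout isn't spelled out in this issue, so putting the video ID in field 1, using URL-safe base64, and the `context` object with its placeholder `clientVersion` are all guesses, and every function name is made up for the example. It doesn't send anything over the network.

```python
import base64
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string_field(field_number: int, value: str) -> bytes:
    """Encode one length-delimited (wire type 2) protobuf field."""
    data = value.encode("utf-8")
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(data)) + data


def build_transcript_params(video_id: str) -> str:
    # ASSUMPTION: the video ID sits in field 1 of the message and the
    # result is URL-safe base64. The real layout may differ.
    message = encode_string_field(1, video_id)
    return base64.urlsafe_b64encode(message).decode("ascii")


def build_request_body(video_id: str) -> str:
    # ASSUMPTION: the body carries a client context next to "params".
    # The clientVersion below is a placeholder, not a known-good value.
    body = {
        "context": {
            "client": {"clientName": "WEB", "clientVersion": "2.20210101.00.00"}
        },
        "params": build_transcript_params(video_id),
    }
    return json.dumps(body)


if __name__ == "__main__":
    print(build_request_body("0hEvBW2NFQU"))
```

The body would then be POSTed as JSON to the endpoint above. Hand-rolling the encoding keeps the sketch dependency-free; a real implementation would more likely use a protobuf library.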


Notes:

Issue opened on behalf of @syeopite, and thanks to @TiA4f8R for the screenshot :)

MinePlayersPE commented 2 years ago

For replacing captions, note that the transcripts wouldn't carry any styling info (positioning, color, etc.). Also, some videos like 0hEvBW2NFQU will give spammy transcripts with mismatched timestamps, but this also happens with normal VTT captions.

Semi-PoC for transcript => VTT: https://gist.github.com/MinePlayersPE/f645f15d477be694748df721492d8a38
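For anyone who just wants the general idea of transcript => VTT without opening the gist, a minimal sketch in Python might look like the following. It is not the gist's code. The `Segment` shape (start and end in milliseconds plus text) is a hypothetical stand-in for whatever the transcript response actually contains, and as noted above there is no styling or positioning to carry over.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    # Hypothetical shape of one transcript entry, for illustration only.
    start_ms: int
    end_ms: int
    text: str


def format_timestamp(ms: int) -> str:
    """Format milliseconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def segments_to_vtt(segments: Iterable[Segment]) -> str:
    """Render transcript segments as a plain WebVTT document."""
    lines = ["WEBVTT", ""]
    for seg in segments:
        start = format_timestamp(seg.start_ms)
        end = format_timestamp(seg.end_ms)
        lines.append(f"{start} --> {end}")
        # Collapse whitespace so the cue text never contains a blank line,
        # which would end the cue early.
        lines.append(" ".join(seg.text.split()))
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    print(segments_to_vtt([Segment(0, 2500, "Hello"), Segment(2500, 5000, "world")]))
```

This only covers the plain case. It does nothing about the spammy or mismatched timestamps mentioned above, and it doesn't escape characters like `<` or `&` in the cue text.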

syeopite commented 1 year ago

For supporting transcripts there's videojs-transcript, though it doesn't appear to be maintained.

nidhoggr-nil commented 9 months ago

I imagine this would help both for people with disabilities and also for tools which use the transcript as a source for summarization.

syeopite commented 9 months ago

Yep, transcripts are great for that.


Since this issue was opened in 2021, the logic for requesting transcripts has been added to the codebase, where it's used as a workaround for captions on larger instances.

Now all that's left to do is to add a proper API endpoint for video transcriptions, and a UI component.

callum-gander commented 7 months ago

Any movement on this since Jan?

syeopite commented 4 months ago

I've finished implementing a basic transcripts feature into Invidious.

(image: transcript implementation)

You can find the code over at my branch here https://github.com/syeopite/invidious/tree/transcripts-support

The feature is pretty much done, but I'd like to work on a couple more QOL improvements, particularly for people with JS enabled, before sending a PR upstream.