iv-org / invidious

Invidious is an alternative front-end to YouTube
https://invidious.io
GNU Affero General Public License v3.0

[Bug] Color coded captions #4716

Closed boher closed 4 months ago

boher commented 4 months ago

Describe the bug

As of May 2024, only a few public instances (e.g. invidious.incogniweb.net, inv.us.projectsegfau.lt) build WebVTT files that preserve the caption styling:

WEBVTT
Kind: captions
Language: en
Style:
::cue(c.colorCCCCCC) { color: rgb(204,204,204);
 }
::cue(c.yellow) { color: yellow; }
##

00:00:00.540 --> 00:00:02.250
<i>Welcome back to our channel.</i>

00:00:02.250 --> 00:00:04.021
<c.yellow>We're gonna talk about...</c>

00:00:04.021 --> 00:00:06.371
<c.colorCCCCCC><i>(music playing)</i></c>

The majority of public instances do not:

WEBVTT
Kind: captions
Language: en

00:00:00.540 --> 00:00:04.021
Welcome back to our channel. We're gonna talk about...

00:00:04.021 --> 00:00:06.371
(music playing)

Steps to Reproduce

  1. Go to a video with color-coded captions provided by the video creator
  2. Turn on the captions provided by the video creator
  3. In the browser's network tab, open the WebVTT file returned by the captions API call
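To make step 3 less manual, the check can be scripted. The following is a minimal sketch, assuming the WebVTT body has already been copied out of the network tab into a string; `styling_summary` and `has_styling` are hypothetical helper names, not part of Invidious.

```python
import re

# Matches a ::cue(...) style rule such as "::cue(c.yellow) { color: yellow; }".
CUE_RULE = re.compile(r"::cue\(([^)]*)\)")
# Matches an opening class span such as "<c.yellow>" or "<c.colorCCCCCC>".
CLASS_TAG = re.compile(r"<c\.([^>\s]+)>")


def styling_summary(vtt_text: str) -> dict:
    """Report which cue style rules and class spans appear in a WebVTT body."""
    return {
        "style_rules": CUE_RULE.findall(vtt_text),
        "class_tags": sorted(set(CLASS_TAG.findall(vtt_text))),
    }


def has_styling(vtt_text: str) -> bool:
    """True if the body contains at least one ::cue rule or <c.class> span."""
    summary = styling_summary(vtt_text)
    return bool(summary["style_rules"] or summary["class_tags"])
```

Running `has_styling` on the body served by each instance for the same video should show which instances keep the styling and which drop it.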

Logs

No error logs for the styling issue.

Screenshots

For the same video:

Instance that works: image
Instance that does not work (the downloaded WebVTT file has no styling): image

Possible fix

I saw that #4414 was implemented to explicitly escape special characters. Maybe something similar could be done to ensure styling is captured across all instances?

# Check whether the text contains <font> styling markup
if text.includes?("</font>")
  # Convert HTML <font color="#CCCCCC"> to WebVTT <c.colorCCCCCC>.
  # The "color" prefix keeps the class name in line with the ::cue rules.
  text = text.gsub(/<font color="#([0-9A-Fa-f]{6})">/, "<c.color\\1>")
  text = text.gsub(/<\/font>/, "</c>")
end
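A class span only renders in color if the file also carries a matching `::cue` rule in its style header, so the conversion would need to collect the colors it encounters. Below is a rough Python sketch of that idea. It assumes the upstream caption text really does contain `<font color="#RRGGBB">` tags, which this thread does not confirm; `font_to_vtt_class` and `cue_rule` are hypothetical names.

```python
import re

FONT_OPEN = re.compile(r'<font color="#([0-9A-Fa-f]{6})">')


def font_to_vtt_class(text: str) -> tuple[str, set[str]]:
    """Rewrite <font color="#RRGGBB"> spans as <c.colorRRGGBB> and collect the colors used."""
    colors: set[str] = set()

    def replace_open(match: re.Match) -> str:
        hex_color = match.group(1).upper()
        colors.add(hex_color)
        return f"<c.color{hex_color}>"

    converted = FONT_OPEN.sub(replace_open, text)
    converted = converted.replace("</font>", "</c>")
    return converted, colors


def cue_rule(hex_color: str) -> str:
    """Build the ::cue rule that gives the class its color."""
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return f"::cue(c.color{hex_color}) {{ color: rgb({r},{g},{b}); }}"
```

The collected colors would then be turned into one `::cue` rule each and written under the `Style:` header before the first cue.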

unixfox commented 4 months ago

Please give the URLs where it works and where it doesn't.

syeopite commented 4 months ago

This is pretty much impossible to implement on large public instances. Since YouTube's captions endpoint can be rate-limited, large instance owners typically enable a workaround in the config that converts transcripts into captions instead. However, transcripts do not come with any styling information attached.
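
To illustrate why the workaround loses styling, here is a simplified Python sketch of a transcript-to-captions conversion. It is not the Invidious implementation; the `TranscriptLine` shape (millisecond offsets plus plain text) is an assumption. Since the input carries only timing and text, the output has nothing from which to build `::cue` rules or class spans.

```python
from dataclasses import dataclass


@dataclass
class TranscriptLine:
    # Hypothetical shape: start/end offsets in milliseconds plus plain text.
    start_ms: int
    end_ms: int
    text: str


def format_timestamp(ms: int) -> str:
    """Format milliseconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def transcript_to_vtt(lines: list[TranscriptLine], language: str = "en") -> str:
    """Build a WebVTT document from plain transcript lines."""
    out = ["WEBVTT", "Kind: captions", f"Language: {language}", ""]
    for line in lines:
        out.append(f"{format_timestamp(line.start_ms)} --> {format_timestamp(line.end_ms)}")
        out.append(line.text)
        out.append("")  # blank line terminates the cue
    return "\n".join(out)
```

Feeding it the two lines from the unstyled sample in the report reproduces that output: a header with no `Style:` section and cues with bare text.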