[Bug] Color coded captions

boher commented 4 months ago

Describe the bug

As of May 2024, only a few public instances (e.g. invidious.incogniweb.net, inv.us.projectsegfau.lt) were building WebVTT files with the styles captured:

WEBVTT
Kind: captions
Language: en
Style:
::cue(c.colorCCCCCC) { color: rgb(204,204,204);
 }
::cue(c.yellow) { color: yellow; }
##

00:00:00.540 --> 00:00:02.250
<i>Welcome back to our channel.</i>

00:00:02.250 --> 00:00:04.021
<c.yellow>We're gonna talk about...</c>

00:00:04.021 --> 00:00:06.371
<c.colorCCCCCC><i>(music playing)</i></c>

While the majority of public instances do not:

WEBVTT
Kind: captions
Language: en

00:00:00.540 --> 00:00:04.021
Welcome back to our channel. We're gonna talk about...

00:00:04.021 --> 00:00:06.371
(music playing)

Steps to Reproduce

Go to a video with color-coded captions set by the video creator
Click to turn on the captions provided by the video creator
In the network tab, open the WebVTT file downloaded by the captions API call
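
To make the last step less manual, here is a minimal sketch (in Python, not part of Invidious) that checks whether a downloaded WebVTT file carries any styling. The helper name `has_styling` is hypothetical, and it assumes styling shows up either as `::cue(...)` rules or as `<c.classname>` spans, as in the working sample above.

```python
import re

# Matches WebVTT class spans such as <c.yellow> or <c.colorCCCCCC>.
CLASS_SPAN = re.compile(r"<c\.[^>\s]+>")


def has_styling(vtt_text):
    """Return True if the WebVTT text has ::cue rules or <c.class> spans."""
    return "::cue(" in vtt_text or CLASS_SPAN.search(vtt_text) is not None


styled = "WEBVTT\n\n00:00:02.250 --> 00:00:04.021\n<c.yellow>We're gonna talk about...</c>\n"
plain = "WEBVTT\n\n00:00:04.021 --> 00:00:06.371\n(music playing)\n"
print(has_styling(styled), has_styling(plain))
```

Italic tags like `<i>` are deliberately not counted, since the report is about color classes.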

Logs

No error logs for the styling issue.

Screenshots

For the same video: an instance that works, and an instance that does not work (the downloaded WebVTT file has no styling).

Possible fix

I saw that #4414 was implemented to explicitly escape special characters. Maybe something similar could be done to ensure styling is captured across all instances?

# Check if the text contains any <font> styling markup
if text.includes?("</font>")
  # Convert HTML <font color="#CCCCCC"> to WebVTT <c.colorCCCCCC>
  # (the "color" prefix matches the ::cue(c.colorCCCCCC) class naming)
  text = text.gsub(/<font color="#([0-9A-Fa-f]{6})">/, "<c.color\\1>")
  text = text.gsub("</font>", "</c>")
end
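
For illustration only, here is the same conversion idea as a self-contained Python sketch rather than Crystal. It assumes the source text carries colors as `<font color="#RRGGBB">` tags, which is a guess made in this issue and not something confirmed about the upstream data. The function name `convert_font_tags` is hypothetical. Besides rewriting the tags, it also collects the matching `::cue` rules, since the class spans have no effect without them.

```python
import re

# Assumed input markup: <font color="#RRGGBB"> ... </font>
FONT_OPEN = re.compile(r'<font color="#([0-9A-Fa-f]{6})">')


def convert_font_tags(text):
    """Return (converted_text, css_rules) for one caption line."""
    colors = []

    def to_class(match):
        hex_color = match.group(1).upper()
        if hex_color not in colors:
            colors.append(hex_color)
        return f"<c.color{hex_color}>"

    converted = FONT_OPEN.sub(to_class, text).replace("</font>", "</c>")

    # Build one ::cue rule per distinct color, in order of first appearance.
    rules = []
    for hex_color in colors:
        r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
        rules.append(f"::cue(c.color{hex_color}) {{ color: rgb({r},{g},{b}); }}")
    return converted, rules


line, rules = convert_font_tags('<font color="#CCCCCC"><i>(music playing)</i></font>')
print(line)
print(rules[0])
```

The rules would still need to be written into the style section of the file header for players to apply the colors.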

unixfox commented 4 months ago

Please give the url where it works and where it doesn't

syeopite commented 4 months ago

This is pretty much impossible to implement on large public instances. Since YouTube's captions endpoint can be rate-limited, large instance owners typically enable a workaround in the config that converts transcripts into captions instead. However, transcripts do not come with any styling information attached.
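
A rough Python sketch of why the workaround loses styling. This is not Invidious's actual code. It assumes a transcript is just a list of `(start_ms, end_ms, text)` tuples, and the names `format_timestamp` and `transcript_to_webvtt` are hypothetical. With only plain text and timings as input, there is nothing to build `::cue` rules or `<c.class>` spans from.

```python
def format_timestamp(ms):
    """Format integer milliseconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def transcript_to_webvtt(segments, language="en"):
    """Build a WebVTT document from plain (start_ms, end_ms, text) segments."""
    lines = ["WEBVTT", "Kind: captions", f"Language: {language}", ""]
    for start_ms, end_ms, text in segments:
        lines.append(f"{format_timestamp(start_ms)} --> {format_timestamp(end_ms)}")
        lines.append(text)
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


vtt = transcript_to_webvtt([
    (540, 4021, "Welcome back to our channel. We're gonna talk about..."),
    (4021, 6371, "(music playing)"),
])
print(vtt)
```

The output has the same shape as the unstyled sample in the original report.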

iv-org / invidious

[Bug] Color coded captions #4716