Open frisch1 opened 3 years ago
Hi @frisch1, I would definitely say that this is a feature request and not a bug. Sounds interesting, but I don't see myself implementing this anytime soon, as this module is mostly used for data-science purposes and I don't really see the use-case for livestreams. However, if you want to contribute this feature I'd be happy to merge it. Deserializing the response probably isn't a big deal, you just gotta find out how to scrape the URL you'll have to call to actually get that response. Let me know if you have that figured out and are interested in contributing it, so we can have a chat on how to implement this into the current API 😊
Hello. This is a feature request rather than a bug, methinks.
Have you looked at extracting captions from a live stream? If you look at any live stream (e.g. https://www.youtube.com/whitehouse) while it is actually live (that part is key), there are auto-generated subtitles delivered embedded in the videoplayback file that streams in, e.g.
https://r6---sn-8xgp1vo-p5qy.googlevideo.com/videoplayback?expire=1614211486&ei=PpU2YM-ULYm98wTm0L_gDA&ip=71.246.232.10&id=yhxmnlGtJ-g.1&itag=386&source=yt_live_broadcast&requiressl=yes&mh=zc&mm=44,29&mn=sn-8xgp1vo-p5qy,sn-p5qs7nel&ms=lva,rdu&mv=m&mvi=6&pl=18&initcwndbps=1717500&vprv=1&live=1&hang=1&noclen=1&xtags=lang=en:ttkind=asr&mime=text/mp4&ns=aD6U7aY6idhNPyXEqiXu6K0F&gir=yes&mt=1614189620&fvip=6&keepalive=yes&fexp=23983797&beids=9466586&c=WEB&n=lmOMV3MuzrpzRQ&sparams=expire,ei,ip,id,itag,source,requiressl,vprv,live,hang,noclen,xtags,mime,ns,gir&sig=AOq0QJ8wRAIgd0qHHqBF3aRir-pw93UKhFNuFxrlpe6OqyMerxsZ4JsCIHZK74UbKX7ig08-egt6vMDzP6g_7EhOyuOOoUXAkSVW&lsparams=mh,mm,mn,ms,mv,mvi,pl,initcwndbps&lsig=AG3C_xAwRAIgHa9tABbFKMiVQSnLLWa7iO_iu7pcVtrea43G-zdfGBUCIGbqOL15uN0-32Yki8s5vwXD2XDkvCBUgntS54w9xvjc&alr=yes&cpn=LW2TAYe5jfbjzMjx&cver=2.20210223.09.00&sq=664
It has expired, of course, but as an example, the payload here is:
The timedtext is embedded in the file:
It's not TTMLv3, but we can tell from the URL (sq=664) that this text is associated with sequence #664. The t= appears to be a millisecond offset relative to the sequence chunk, and "d" appears to be the duration. But even absent that, the stream of text is there. Note that it doesn't appear by default. It appears you need to insert "xtags" into the "sparams" in the URL to get the live captioning, but if you try to insert it yourself, it messes up the hash/key associated with the URL, so it needs to be triggered on some other way (cc_load_policy=1 in the URL does NOT seem to work).
youtube-dl et al don't recognize this since it's not being delivered as a standalone subtitle file. They act like there are no subtitles on the live stream since it doesn't identify as a subtitles file.
Thoughts?
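For reference, here is a rough Python sketch of how the caption-related parameters mentioned above (sq, xtags, sparams, mime) can be pulled out of a URL like the example. The URL in the snippet is an abbreviated copy of the expired one above, and `describe_caption_url` is just a hypothetical helper name. Treating xtags as colon-separated key=value pairs and sparams as the list of parameters covered by the signature are my assumptions from looking at this one URL, not documented behavior.

```python
from urllib.parse import urlsplit, parse_qs

# Abbreviated copy of the expired example URL; only the parameters
# discussed in this post are kept.
EXAMPLE_URL = (
    "https://r6---sn-8xgp1vo-p5qy.googlevideo.com/videoplayback"
    "?id=yhxmnlGtJ-g.1&itag=386&source=yt_live_broadcast&live=1"
    "&xtags=lang=en:ttkind=asr&mime=text/mp4"
    "&sparams=expire,ei,ip,id,itag,source,requiressl,vprv,live,hang,"
    "noclen,xtags,mime,ns,gir"
    "&sq=664"
)


def describe_caption_url(url):
    """Return the caption-related query parameters of a videoplayback URL.

    Hypothetical helper: the field meanings are guesses based on one URL.
    """
    query = parse_qs(urlsplit(url).query)
    # Assumption: xtags is a colon-separated list of key=value pairs,
    # e.g. "lang=en:ttkind=asr".
    xtags = dict(pair.split("=", 1) for pair in query["xtags"][0].split(":"))
    return {
        "sequence": int(query["sq"][0]),
        "mime": query["mime"][0],
        "xtags": xtags,
        # Assumption: sparams names the parameters covered by the signature.
        "signed_params": query["sparams"][0].split(","),
    }


info = describe_caption_url(EXAMPLE_URL)
print(info)
```

If the sparams guess is right, it would explain why adding xtags by hand breaks the URL: xtags is in the signed list, so the signature no longer matches once it is changed.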