TheFrenchGhosty / TheFrenchGhostys-Ultimate-YouTube-DL-Scripts-Collection

The ultimate collection of scripts for YouTube-DL.
GNU General Public License v3.0

Extreme throttling when pulling livechat data. #99

Closed: ghost closed this issue 1 year ago

ghost commented 1 year ago

When running the channel archive script, the process seems to be throttled to extremes (100KiB/s) when pulling livechat data. The download rate is fine for the video itself. Using aria2c does not make a difference. Using VPNs or different residential IPs does not make a difference either, and I'm experiencing the same limit across different machines.

Is this a known issue? I'm aware Google throttles video to 1MiB/s, but I found nothing about the livechat on yt-dlp's Git. If this does not seem to occur for others, could someone attempt to replicate it with this channel or the video below?

Small excerpt from the process:

[debug] Invoking youtube_live_chat downloader on "https://www.youtube.com/watch?v=E88-k0mNXfk&bpctr=9999999999&has_verified=1"
[youtube_live_chat] Downloading live chat
[youtube_live_chat] Total fragments: unknown (live)
[download] Destination: Kaela Kovalskia Ch. hololive-ID/Kaela Kovalskia Ch. hololive-ID - 20221121 - 【Pokemon Scarlet】#3 the end.【Kaela Kovalskia ⧸ hololiveID】/Kaela Kovalskia Ch. hololive-ID - 20221121 - 【Pokemon Scarlet】#3 the end.【Kaela Kovalskia ⧸ hololiveID】 [E88-k0mNXfk].live_chat.json

[download]  307.74MiB at     67.84B/s (00:00:15) (frag 0)
[download]  307.74MiB at    203.51B/s (00:00:15) (frag 0)
[download]  307.74MiB at    474.78B/s (00:00:15) (frag 0)
[download]  307.75MiB at   1016.56B/s (00:00:15) (frag 0)
[download]  307.77MiB at    2.05KiB/s (00:00:15) (frag 0)
[download]  307.80MiB at    4.16KiB/s (00:00:15) (frag 0)
[download]  307.86MiB at    8.38KiB/s (00:00:15) (frag 0)
[download]  307.98MiB at   16.80KiB/s (00:00:15) (frag 0)
[download]  308.23MiB at   33.43KiB/s (00:00:15) (frag 0)
[download]  308.73MiB at   66.36KiB/s (00:00:15) (frag 0)
[download]  308.89MiB at   76.42KiB/s (00:00:15) (frag 0)
[download]  308.89MiB at   76.42KiB/s (00:00:15) (frag 1)
BromTeque commented 1 year ago

Weird. It seems like the video was still live when you tried to download it. The 30-day buffer built into the channel archive script should prevent this from happening. Perhaps it is bugged with live streams?

Does the same issue appear if you retry now?

EDIT: As to why it's "throttled", it is probably downloading the live chat as fast as it's being sent/streamed/published.

ghost commented 1 year ago

> Weird. It seems like the video was still live when you tried to download it. The 30-day buffer built into the channel archive script should prevent this from happening. Perhaps it is bugged with live streams?

It definitely was not live when I tried to download it. It was streamed on November 20th. Perhaps there is something wrong with how yt-dlp handles live streams now (I know YouTube redid the live UI/tab a couple of weeks back)?

> Does the same issue appear if you retry now?

Still happening, yes.

> EDIT: As to why it's "throttled", it is probably downloading the live chat as fast as it's being sent/streamed/published.

Seems plausible if yt-dlp thinks the stream is live.

BromTeque commented 1 year ago

Seems like yt-dlp is using the wrong live chat downloader/protocol, based on the yt-dlp/extractor/youtube.py code found here. If I'm reading the code correctly, the youtube_live_chat_replay protocol should be used once a live stream is over. Your output log shows the youtube_live_chat protocol being used. I am not familiar with these downloaders/protocols, though, so I might be wrong. Make sure you're on the latest yt-dlp build. It might help.

I believe this to be a yt-dlp issue, not an archive script issue.

Your best bet is probably creating an issue over at yt-dlp.

ghost commented 1 year ago

Seems slow rates are expected and not a yt-dlp problem.

BromTeque commented 1 year ago

> Seems slow rates are expected and not a yt-dlp problem.

Thank you for the update. I tried to find out if this is expected behavior on yt-dlp's part, but that is surprisingly difficult.

> so I might be wrong.

I was indeed wrong. I apologize for the downloader/protocol terminology mix-up. I am not too familiar with yt-dlp's internal structure. It seems some of my earlier assumptions might be wrong as well. UX is difficult. Hehe.

It is still not an archive script issue. If you wish to skip downloading the live chat, modify the script. Switching out --all-subs with --sub-langs all,-live_chat might work. However, I'd recommend reading the yt-dlp documentation yourself. Downloading the live chat separately might also be an idea.
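For anyone landing here later, the two suggestions above could look roughly like the sketch below. This is an illustration, not the archive script itself: the variable names are made up, the URL is the one from the log in this issue, and --write-subs is added on the assumption that --all-subs (which yt-dlp documents as an alias for --sub-langs all --write-subs) was the only flag turning subtitle writing on. Check the script and the yt-dlp documentation before relying on it.

```shell
#!/bin/sh
# Sketch only. Variable names are hypothetical; the URL is the video from the log above.
url="https://www.youtube.com/watch?v=E88-k0mNXfk"

# Option 1: fetch every subtitle track except the live chat.
# These flags would take the place of --all-subs in the script's yt-dlp call.
skip_chat_opts="--write-subs --sub-langs all,-live_chat"

# Option 2: fetch only the live chat, without the video, in a separate run.
chat_only_opts="--skip-download --write-subs --sub-langs live_chat"

# Print the resulting commands for review instead of running them.
echo "yt-dlp $skip_chat_opts $url"
echo "yt-dlp $chat_only_opts $url"
```

The commands are only printed so they can be checked first. Running the second one on its own schedule keeps the slow live chat download from holding up the rest of the archive run.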

ghost commented 1 year ago

> Switching out --all-subs with --sub-langs all,-live_chat might work.

Yeah, this is the option I resorted to. Much more acceptable rates now. Thanks for the help!