Chocobozzz / PeerTube

ActivityPub-federated video streaming platform using P2P directly in your web browser
https://joinpeertube.org/
GNU Affero General Public License v3.0

Failure importing YouTube videos that are still being post-processed #5372

Open emansom opened 2 years ago

emansom commented 2 years ago

Describe the current behavior

Currently, importing a YouTube video longer than seven hours exceeds the maximum output buffer allocated for the yt-dlp call, because the JSON output contains a very large number of DASH and HLS segments.
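
To illustrate the failure mode, here is a simplified model of a capped output buffer. It is only a sketch: `collectWithCap` is a hypothetical name, it counts characters rather than bytes, and it is not the actual implementation used by PeerTube or its process runner.

```typescript
// Hypothetical model of a process runner that caps captured output.
// Chunks stand in for pieces of stdout arriving from yt-dlp.
function collectWithCap(chunks: string[], maxChars: number): string {
  const parts: string[] = [];
  let total = 0;

  for (const chunk of chunks) {
    total += chunk.length;
    // Once the running total passes the cap, the whole call fails,
    // even if the output so far was valid.
    if (total > maxChars) {
      throw new Error("maxBuffer exceeded");
    }
    parts.push(chunk);
  }

  return parts.join("");
}

// Usage: eight characters of output against a cap of six.
try {
  collectWithCap(["aaaa", "bbbb"], 6);
} catch (err) {
  console.log((err as Error).message);
}
```

With a fixed cap, the only ways out are raising the cap or producing less output; this report proposes the latter.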

Steps to reproduce

  1. Set up PeerTube
  2. Import a finished YouTube livestream, longer than seven hours, that YouTube is still post-processing.
  3. Observe the error log

Describe the expected behavior

PeerTube should filter out DASH and HLS segments by passing the --extractor-args "youtube:skip=dash,hls" argument to yt-dlp, which considerably reduces the size of the JSON output.
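
A minimal sketch of how the argument list could be assembled. `buildInfoArgs` and its `skipManifests` parameter are hypothetical names used for illustration, not PeerTube's actual code; only the yt-dlp flags themselves come from this report.

```typescript
// Hypothetical helper: builds the yt-dlp arguments for fetching video metadata.
function buildInfoArgs(url: string, skipManifests: boolean): string[] {
  const args: string[] = ["--dump-json"];

  if (skipManifests) {
    // Ask the YouTube extractor to skip the DASH and HLS manifests,
    // which this report identifies as the bulk of the output.
    // No shell quoting is needed when the value is passed as an argv entry.
    args.push("--extractor-args", "youtube:skip=dash,hls");
  }

  args.push(url);
  return args;
}

const exampleArgs = buildInfoArgs("https://www.youtube.com/watch?v=yZIZzk-Anvs", true);
console.log(exampleArgs.join(" "));
```

Passing the arguments as an array instead of a single command string avoids having to quote the extractor argument for a shell.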

Additional information

error[22-10-2022 07:39:55] Cannot fetch information from import for URL https://www.youtube.com/watch?v=yZIZzk-Anvs channel De PoppenCast, skipping import

{
  "err": {
    "stack": "MaxBufferError: Command failed: /data/bin/yt-dlp --extractor-args youtubetab:approximate-date --dump-json -f bestvideo[vcodec!*=av01][vcodec!*=vp9.2]+bestaudio/best[vcodec!*=av01][vcodec!*=vp9.2]/bestvideo[ext=mp4]+bestaudio[ext=m4a]/best https://www.youtube.com/watch?v=yZIZzk-Anvs\nmaxBuffer exceeded\n{\"id\": \"yZIZzk-Anvs\", \"title\": \"Dutch Matrix Afterparty | Shit uitrijden\", \"formats\": 
    .. 10MB of DASH/HLS formats ...

emansom commented 2 years ago

Will be addressed by upcoming PR.

Chocobozzz commented 2 years ago

Hi,

It seems I can't reproduce this error with the video https://www.youtube.com/watch?v=yZIZzk-Anvs in current develop

emansom commented 2 years ago

> Hi,
>
> It seems I can't reproduce this error with the video https://www.youtube.com/watch?v=yZIZzk-Anvs in current develop

The bug only occurred during the limited time window in which a livestream of 7+ hours had not yet been fully processed/transcoded on YouTube's servers.

I already have a solution that significantly reduces the yt-dlp JSON output to only the necessary data. It will be in one of my PRs once I cherry-pick and squash all the commits into separate logical pieces.