should we deal with compression failure?

rgaudin commented 4 years ago

A crashcourse run failed during the compression of video k7dTDjRnBqU.

ffmpeg failed to recompress a video, and because we expect this step to succeed, the task failed.

Too many packets buffered for output stream 0:1.
[libvorbis @ 0x5632c3166e40] 32 frames left in the queue on closing
Conversion failed!

There is a fairly old ffmpeg bug report that mentions this error. It explains the cause, why it can't be fixed easily, and a workaround that covers most (but not all) cases: setting -max_muxing_queue_size to a number larger than the default (128).

One comment mentions getting around that bug by increasing the memory allocated to the GPU. In our context (Docker), we don't have a GPU, but we do set hard limits on RAM.

Given that I couldn't reproduce it on the same file with the same image, chances are that we're hitting this because of a RAM issue. RAM will be increased on the recipe and the task re-run to check.

Steps to reproduce:

youtube-dl -o "video.%(ext)s" -f "best[ext=webm]/bestvideo[ext=webm]+bestaudio[ext=webm]/best" k7dTDjRnBqU
docker run -v $(pwd):/data:rw openzim/youtube:2.1.3 ffmpeg -y -i file:/data/video.webm -codec:v libvpx -quality best -cpu-used 0 -b:v 300k -qmin 30 -qmax 42 -maxrate 300k -bufsize 1000k -threads 8 -vf scale='480:trunc(ow/a/2)*2' -codec:a libvorbis -b:a 128k file:/data/video.tmp.webm

Question: Besides the potential fix for this specific run (resources), should we consider not crashing the scraper on video conversion failures, or are we happy with this behavior?

I'm in favor of failing, so we get a chance to understand what went wrong and fix the cause, but that depends on how often conversion failures occur (this is the first).
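
To make the two options concrete, here is a minimal shell sketch of both policies. It is not the scraper's actual code: the FAIL_FAST switch, the failed counter, and the try_convert wrapper are hypothetical names used only for illustration.

```shell
# Hypothetical policy switch (an assumption, not an existing scraper option):
# 1 aborts on the first failed conversion, 0 logs the failure and continues.
FAIL_FAST="${FAIL_FAST:-1}"
failed=0

# Run a conversion command and apply the chosen policy to its exit status.
try_convert() {
  status=0
  "$@" || status=$?
  if [ "$status" -eq 0 ]; then
    return 0
  fi
  echo "conversion failed (exit $status): $*" >&2
  if [ "$FAIL_FAST" = "1" ]; then
    # Current behavior: stop the whole run so the cause gets investigated.
    exit "$status"
  fi
  # Alternative behavior: remember the failure and keep going.
  failed=$((failed + 1))
}
```

A caller would wrap each conversion, for example `try_convert ffmpeg ...`, and with FAIL_FAST=0 report `$failed` at the end of the run.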

kelson42 commented 4 years ago

The error seems systematic with this recipe: https://farm.openzim.org/pipeline/5e7503fd41ac227a378955ea

kelson42 commented 4 years ago

@rgaudin Does the problem occur with avconv?

satyamtg commented 4 years ago

@kelson42 @rgaudin I think I'll look into this one next. At first glance, it seems to be a RAM issue. Also, was the limit raised and tested?

kelson42 commented 4 years ago

@satyamtg What makes you believe that 6 GB is not enough memory? You can see how much memory has been configured for the Docker container by looking at the scrape details https://farm.openzim.org/pipeline/5e7503fd41ac227a378955ea (click on the "Document" link).

rgaudin commented 4 years ago

I've just retried using the same Docker resource limits as on the zimfarm:

youtube-dl -o "video.%(ext)s" -f "best[ext=webm]/bestvideo[ext=webm]+bestaudio[ext=webm]/best" k7dTDjRnBqU
docker run -v $(pwd):/data:rw --cpu-shares 3072 --memory-swappiness 0 --memory 6442450944 openzim/youtube:2.1.3 ffmpeg -y -i file:/data/video.webm -codec:v libvpx -quality best -cpu-used 0 -b:v 300k -qmin 30 -qmax 42 -maxrate 300k -bufsize 1000k -threads 8 -vf scale='480:trunc(ow/a/2)*2' -codec:a libvorbis -b:a 128k file:/data/video.tmp.webm

and it worked properly.
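
For reference, the --memory value above is the 6 GB limit expressed in bytes (binary units), which a quick shell calculation can confirm:

```shell
# 6 GiB in bytes; a bare number passed to docker's --memory is read as bytes.
mem_bytes=$((6 * 1024 * 1024 * 1024))
echo "$mem_bytes"   # prints 6442450944
```

Passing `--memory 6g` should be an equivalent way to write the same limit.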

rgaudin commented 4 years ago

So with this, I'm confident (actually tested) that -max_muxing_queue_size fixes the problem. I'll set it to a high number hoping we don't run into this again (we might).
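
As an illustration, the reproduction command with the workaround applied might look like the sketch below. The value 9999 is an assumption (the comment above only says "a high number"), and MUX_QUEUE_SIZE, RUNNER, and convert_video are hypothetical names for this example. By default it only prints the command instead of running it.

```shell
# Assumed value: any number comfortably above ffmpeg's default of 128.
MUX_QUEUE_SIZE="${MUX_QUEUE_SIZE:-9999}"

# Same ffmpeg options as in the reproduction steps, with
# -max_muxing_queue_size added as an output option before the output file.
# RUNNER defaults to "echo" (dry run); set RUNNER= (empty) to really execute.
convert_video() {
  ${RUNNER-echo} docker run -v "$(pwd)":/data:rw openzim/youtube:2.1.3 \
    ffmpeg -y -i file:/data/video.webm \
    -codec:v libvpx -quality best -cpu-used 0 \
    -b:v 300k -qmin 30 -qmax 42 -maxrate 300k -bufsize 1000k \
    -threads 8 -vf scale='480:trunc(ow/a/2)*2' \
    -codec:a libvorbis -b:a 128k \
    -max_muxing_queue_size "$MUX_QUEUE_SIZE" \
    file:/data/video.tmp.webm
}

convert_video
```

Running it with `RUNNER= MUX_QUEUE_SIZE=4096` in the environment would execute the conversion with a different queue size.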

openzim / youtube

should we deal with compression failure? #75