Ffmpeg does not force key frames as expected

m1k1o / go-transcode

On-demand transcoding origin server for live inputs and static files in Go using ffmpeg. Also with NVIDIA GPU hardware acceleration.

Apache License 2.0


Ffmpeg does not force key frames as expected #38

Closed m1k1o closed 1 year ago

m1k1o commented 1 year ago

Given a test video, we want to force keyframes at timestamps 1772 and 1776:

ffmpeg -loglevel warning \
    -ss 1768 \
    -i test.mp4 \
    -force_key_frames 1772,1776 \
    -to 1776 \
    -copyts \
  -c:v libx264 \
    -preset faster \
    -profile:v high \
    -level:v 4.0 \
    -b:v 2800k \
  -c:a aac \
    -b:a 192k \
  -f mpegts - \
| ffprobe -loglevel error -skip_frame nokey -select_streams v:0 -show_entries frame=pkt_pts_time -of csv=print_section=0 -

The output is:

1769.416256
1773.420256
1777.382544

This is not what we expected. We expected to see 1772 and 1776 in the output.

m1k1o commented 1 year ago

From https://man.archlinux.org/man/extra/ffmpeg/ffmpeg-all.1.en#time

If the argument consists of timestamps, ffmpeg will round the specified times to the nearest output timestamp as per the encoder time base and force a keyframe at the first frame having timestamp equal or greater than the computed timestamp
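
A minimal sketch of the rule quoted above, only to illustrate why a forced keyframe can land slightly after the requested time. It is not ffmpeg's implementation: the function name `forcedKeyframePTS`, the sorted `framePTS` slice, and the time base passed as `tbNum/tbDen` are all assumptions made for this example.

```go
package main

import "math"

// forcedKeyframePTS is a hypothetical helper. It rounds requestedSec to the
// nearest tick of the time base tbNum/tbDen and returns the PTS (in ticks) of
// the first frame whose PTS is equal to or greater than that tick.
// framePTS is assumed to be sorted in ascending order.
func forcedKeyframePTS(framePTS []int64, requestedSec float64, tbNum, tbDen int64) (int64, bool) {
	if tbNum <= 0 || tbDen <= 0 {
		return 0, false
	}
	// Requested time expressed in time-base ticks, rounded to the nearest tick.
	target := int64(math.Round(requestedSec * float64(tbDen) / float64(tbNum)))
	for _, pts := range framePTS {
		if pts >= target {
			return pts, true
		}
	}
	// No frame at or after the requested time.
	return 0, false
}
```

With a 1/90000 time base and frames 3003 ticks apart (about 29.97 fps), a request for 2.0 s does not hit a frame exactly, so the next frame at tick 180180 (2.002 s) is the one selected.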

m1k1o commented 1 year ago

The current philosophy is that we split the whole recording into discrete segments, and every time a segment is requested, we transcode it. Segments must be created at keyframes, but ffmpeg rounds the specified times and sometimes splits segments at unexpected positions.
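
To make the segment model concrete, here is a rough sketch of how fixed segment boundaries could be turned into the -force_key_frames list used in the command at the top of this issue. The helper names and the whole-second, 4-second segment grid are assumptions for illustration, not go-transcode's actual code.

```go
package main

import (
	"strconv"
	"strings"
)

// segmentBoundaries returns the start times (in whole seconds) of fixed-length
// segments covering [0, totalSec). Hypothetical helper.
func segmentBoundaries(totalSec, segmentSec int) []int {
	var starts []int
	if segmentSec <= 0 {
		return starts
	}
	for t := 0; t < totalSec; t += segmentSec {
		starts = append(starts, t)
	}
	return starts
}

// forceKeyFramesArg joins every boundary in (from, to] into the comma-separated
// value expected by ffmpeg's -force_key_frames option.
func forceKeyFramesArg(boundaries []int, from, to int) string {
	var parts []string
	for _, b := range boundaries {
		if b > from && b <= to {
			parts = append(parts, strconv.Itoa(b))
		}
	}
	return strings.Join(parts, ",")
}
```

For a 1780-second recording cut into 4-second segments, a job that seeks to 1768 and stops at 1776 would get the value 1772,1776, the same list as in the command above.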

We could fix this by allowing some sort of accepted offset around each ffmpeg segment boundary, so that we account for the rounding.
Another possibility is to give up seeking accuracy and start real-time encoding from the seek position. But this would mean that segments cannot be reused and cached: every client watching VOD would need its own process.

pulsejet commented 1 year ago

I spent a lot of time trying to fix this before giving up and creating go-vod (it doesn't create separate ffmpeg processes; instead it just sends a SIGSTOP to pause transcoding). The problem is this:

Say all segments are 3s in length, and I've a segment starting at t=6s. Then the segment might end at t=9s. When we try to create the next segment starting at t=9s, then ffmpeg starts it at some keyframe say t=9.1s. That means 100ms of video disappeared, which leads to weird playback bugs. It might be even worse -- the keyframe at which it ends may be before 9s. And there's nothing we can do about the approximations.

m1k1o commented 1 year ago

@pulsejet but that is optimized only for a single user, right? Correct me if I am wrong, but the segments are not predictable; depending on where you started the playback, their timing might be different. Therefore they cannot be cached and reused when two clients start watching at different times (example: the second one is 5 min ahead of the first one).

So you are sending SIGSTOP to avoid transcoding too many segments ahead, right? In that case, maybe adding -re without SIGSTOP would yield the same result, but the CPU load would be spread out instead of coming in peaks (only when transcoding).

pulsejet commented 1 year ago

> @pulsejet but that is optimized only for a single user, right? Correct me if I am wrong, but the segments are not predictable; depending on where you started the playback, their timing might be different. Therefore they cannot be cached and reused when two clients start watching at different times (example: the second one is 5 min ahead of the first one).

Correct, there cannot be any caching since the start time must align if the user skips ahead. We can cache for the case where users don't skip, though, which might be the case for certain types of content.

> In that case, you are sending SIGSTOP to stop transcoding too many segments, right? In that case, maybe adding -re without SIGSTOP would yield the same results, but the CPU load would be distributed and not in peaks (only when transcoding).

The browser buffers ahead, so I'm not sure if/how -re would allow that.

m1k1o commented 1 year ago

Ignoring unexpected segments (https://github.com/m1k1o/go-transcode/commit/e3c42330ad6c06d4cc7c6542e409cf445ea05498) from ffmpeg seems to fix this issue. Those segments were typically very short in comparison to the expected segment length (200 ms vs 4 s), and because of them a significant amount of video playback was lost. By ignoring them we might still lose a slice of the media, because the last segment is shorter than expected. On the other hand, the same rounding that was applied to the last segment could be applied to the first segment of the next transcoding job and therefore compensate for the loss.