zoriya / Kyoo

A portable and vast media library solution.
https://kyoo.zoriya.dev
GNU General Public License v3.0

Switch to -f hls instead of -f segment #592

Open zoriya opened 3 months ago

zoriya commented 3 months ago

Another step needed for #542

zoriya commented 3 months ago

The -hls_time doc says "Segment will be cut on the next key frame after this time has passed", but that's a lie. It tries to keep an "average" segment length (source). For -hls_time 5, I got:

0-segment-0.ts - 5985
0-segment-1.ts - 3983
0-segment-2.ts - 5985
0-segment-3.ts - 3983
0-segment-4.ts - 5985
0-segment-5.ts - 3983

Not sure if this is deterministic, so I don't know if switching to -f hls is possible... There are no other ways of specifying when to split segments (looking at https://github.com/FFmpeg/FFmpeg/blob/master/libavformat/hlsenc.c#L3147).

zoriya commented 3 months ago

From https://x.com/JeWe37/status/1822774329152508398?s=19

it splits at the next keyframe after hls_time * [the number of segments written so far] +/- an initial time that i dont know of

-hls_time seems like a no go
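
To make the quoted rule concrete, here is a minimal sketch in POSIX shell that simulates it. It assumes the n-th split happens at the first keyframe at or after n * hls_time, ignores the unknown initial offset, and uses an idealized keyframe every 2000 ms. The function name split_durations and the timestamps are hypothetical; nothing here is taken from ffmpeg's code or measured from the file in this issue.

```sh
# split_durations HLS_TIME_MS KEYFRAME_MS...
# Simulates the assumed rule: the n-th split happens at the first keyframe
# whose timestamp is >= n * HLS_TIME_MS. Prints each closed segment's
# duration in milliseconds, one per line.
split_durations() {
    hls_time=$1
    shift
    written=0           # segments written so far
    seg_start=${1:-0}   # the first keyframe starts the first segment
    for kf in "$@"; do
        target=$(( hls_time * (written + 1) ))
        if [ "$kf" -ge "$target" ]; then
            echo $(( kf - seg_start ))
            seg_start=$kf
            written=$(( written + 1 ))
        fi
    done
}

# Idealized input: one keyframe every 2000 ms, hls_time of 5000 ms.
split_durations 5000 0 2000 4000 6000 8000 10000 12000 14000 16000 18000 20000
# prints 6000, 4000, 6000, 4000 (one per line)
```

Under those assumptions the output alternates between a long and a short segment, similar in shape to the 5985/3983 pattern above, though not the same numbers.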

JeWe37 commented 3 months ago

Possible solution I came across while reading up on something related: MP4Box should probably be able to do the required muxing to HLS, with the -bound option and forcing keyframes as you already are. The idea is then that you can make the duration so small that MP4Box will split at every keyframe, because it's desperately trying to fragment more finely than it actually can.

Hint from my own experience that took me a good while to figure out: Piping from ffmpeg to MP4Box is not entirely trivial. The only way I've found to do it is to use a named pipe, pipe the output from ffmpeg into that and read from it in MP4Box via pipe://PATH. Example:

mkfifo test.mp4
ffmpeg -to 3:00 -i yourfile.mp4 -force_key_frames "expr:gte(t,n_forced*2)" -sc_threshold 0 -s 640x480 -c:v libx264 -b:v 1536k -an -f mpegts - >> ../test.mp4
MP4Box -dash 1 -segment-name 'seg_$Number$' pipe://../test.mp4 -out live.m3u8:dual:cmfc

The playlist only gets written at the end, but since you aren't using it and only need correct segments anyway that shouldn't be a concern.

Not entirely sure about the latency implications of this approach.

JeWe37 commented 3 months ago

Scratch MP4Box, this works with ffmpeg too?

ffmpeg -to 3:00 -i yourfile.mp4 -force_key_frames "expr:gte(t,n_forced*2)" -sc_threshold 0 -s 640x480 -c:v libx264 -b:v 1536k -an -hls_time 0.1 -hls_playlist_type vod -hls_segment_type fmp4 -hls_segment_filename "fileSequence%d.mp4" prog_index.m3u8

Cleanly makes 90 segments. I guess the only problem is that the transmuxed stream may have too many keyframes? I suppose one could use the normal hls_time for it, then figure out the keyframes on which you actually split, and only force those keyframes, using the trick with the small hls_time for the other streams?
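
The "only force those keyframes" idea could be sketched like this, assuming the split timestamps of the transmuxed stream are already known. -force_key_frames accepts a comma-separated list of times, so the helper below just joins them. The helper name force_key_frames_arg, the file names, and the timestamps 6, 10, 16 and 20 are hypothetical placeholders.

```sh
# force_key_frames_arg TIME_S...
# Joins split timestamps (in seconds) into the comma-separated list
# that ffmpeg's -force_key_frames option accepts.
force_key_frames_arg() {
    list=""
    for t in "$@"; do
        # Prepend a comma only when the list is already non-empty.
        list="${list:+$list,}$t"
    done
    printf '%s\n' "$list"
}

force_key_frames_arg 6 10 16 20
# prints 6,10,16,20

# Possible use for a re-encoded stream (untested; file names are placeholders,
# and the encoder may still add its own keyframes unless told not to):
# ffmpeg -i input.mkv -c:v libx264 -sc_threshold 0 \
#   -force_key_frames "$(force_key_frames_arg 6 10 16 20)" \
#   -hls_time 0.1 -hls_playlist_type vod \
#   -hls_segment_filename "seg%d.ts" out.m3u8
```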

EDIT: Given you said on twitter:

-f segments is unreliable with splits containing more than 1 keyframe. sometimes it just cuts there without caring about the flags

Is the issue being able to not split on some keyframes? If so, couldn't you just encode with encoder flags that ensure you get only manual keyframes? For me -x264-params keyint=1000:no-scenecut=1 worked (keyint must be larger than the segment duration in frames). Might not be amazing for quality though.
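
As a rough check of the "keyint must be larger than the segment duration in frames" condition, the segment duration can be converted to a frame count. This is a sketch under assumptions: the helper name segment_frames is hypothetical, the frame rate is passed as a numerator/denominator pair, and the result is rounded up.

```sh
# segment_frames SEGMENT_MS FPS_NUM FPS_DEN
# Prints the number of frames in a segment of SEGMENT_MS milliseconds at
# FPS_NUM/FPS_DEN frames per second, rounded up to a whole frame.
segment_frames() {
    seg_ms=$1
    num=$2
    den=$3
    echo $(( (seg_ms * num + 1000 * den - 1) / (1000 * den) ))
}

segment_frames 2000 30 1
# prints 60, so keyint=1000 is comfortably larger for 2 s segments at 30 fps

segment_frames 6000 24000 1001
# prints 144
```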

zoriya commented 3 months ago

Splitting when we re-encode is not a problem since we control both keyframes & segments. The real issue is when we transmux (aka -c:v copy). We cannot specify keyframes manually and must rely on the existing ones.

I saw MP4Box could be useful (especially w/ #542 in mind) but using another tool means having to decode (maybe even encode) each segment twice. This also means having to maintain hwaccel handling for two tools.

JeWe37 commented 3 months ago

How can that be a problem then? So long as the muxed (pristine) case matches the transcoded cases all is well? Therefore so long as one can determine in advance how the muxed case will be split, one just needs to make sure the transcoded cases, where we do have full control, are split the same? What is the troublesome constraint?

As for MP4Box, yea, I was thinking fmp4 already, but I don't think it really presents much additional value compared to ffmpeg for the application here. I don't see how it would cause any double encoding/decoding, it can just handle the muxing alone, see that example I gave of how to pipe between it and ffmpeg.

zoriya commented 3 months ago

I thought about it again and it could work AS LONG AS we can reliably seek/start transcoding at a specific keyframe. Right now, we seek before the keyframes we are interested in and rely on -f segment cutting it where we want.

-ss seek is reliable for transcode but in -c:v copy mode it can only seek to keyframes, and when we specify exactly a keyframe it sometimes picks the previous one (I think due to rounding). I think we could add a value to offset that but it was not the most reliable last time I checked.

I'll try again later! Thanks for your help on this, I had given up!