tkozybski opened 3 years ago
Wrong label...
This is very similar to what I've been pondering these days. I was thinking about composing a StaxRip-embedded PowerShell script that generates an I-frame index list. I found an obvious downside in using `ffprobe` for this purpose (too long a processing time), so I turned to the DGIndexNV index file (`dgi`) instead to get the desired result. See here.
But using the `scene` option in `ffmpeg` as proposed by this post seems fit for more general use cases. Extracting the PTS times is not difficult. You just need to run the following command line with `INPUT` to get `OUTPUT.txt` containing the result:

```
ffmpeg -hide_banner -i INPUT -vf "select='gt(scene,0.4)',metadata=print:file='OUTPUT.txt'" -f null NUL
```

`OUTPUT.txt` looks like this:
```
frame:0 pts:9510 pts_time:9.51
lavfi.scene_score=0.557776
frame:1 pts:15016 pts_time:15.016
lavfi.scene_score=0.691152
frame:2 pts:20021 pts_time:20.021
lavfi.scene_score=0.690279
frame:3 pts:21522 pts_time:21.522
lavfi.scene_score=0.532986
frame:4 pts:23524 pts_time:23.524
lavfi.scene_score=0.537670
frame:5 pts:28529 pts_time:28.529
lavfi.scene_score=0.619934
...
```
What is difficult, though, is how we can put it to work in StaxRip using the generated `pts_time` info. We need to strip the unnecessary parts and convert `pts_time` to a workable format like `HH:MM:SS.nnn`. But since there's already a tool that does this, PySceneDetect, maybe we'd better find a way to make use of it. As you may know, Av1an also utilizes this tool to get the cut info.
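The stripping/conversion step itself is simple enough. A minimal sketch (the parsing is an assumption based on the sample output above; function names are made up for illustration):

```python
import re

def parse_scene_times(metadata_text):
    """Extract pts_time values from ffmpeg's metadata=print output."""
    return [float(m) for m in re.findall(r"pts_time:([\d.]+)", metadata_text)]

def to_timecode(seconds):
    """Convert seconds to HH:MM:SS.nnn."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"

sample = "frame:0 pts:9510 pts_time:9.51\nlavfi.scene_score=0.557776"
print([to_timecode(t) for t in parse_scene_times(sample)])  # ['00:00:09.510']
```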
That said, another big hurdle is in place. Currently StaxRip uses frame-number info (the total frame count evenly divided) and puts it directly into each encoder's parameters for chunk encoding. But in order to adopt this new tool, an overhaul of the code is inevitable, since every chunk would have to be cut via `mkvextract` or `ffmpeg` to match the cut timecodes, not frame numbers. I think this is really a big matter and will take a lot of time. Big food for thought.
Last but not least, there's a critical problem with this `ffmpeg` `scene`-option approach: it fails on some sources. For example, this Dolby Vision trailer (Chameleon.m2ts on Dolby Trailers) does not work well with this method, even after the `m2ts` file is remuxed to `mkv`. It yields this error message, and the `OUTPUT.txt` file is simply empty:

```
[hevc @ 00000153cd5aca00] Invalid NAL unit 36, skipping.
```

I don't know if PySceneDetect is free of this kind of issue, but if not, then it's not reliable for general use. That's a big hurdle.
I wonder if the index file created by ffms2 and L-SMASH-Works contains info about I-frames (I guess so) and if the format of the index file is easy to understand. It could be useful not only for chunk encoding, but also for cutting without re-encoding.
@stax76, that's right. I'm wondering if the authors are willing to change the format. Hmm...
Probably not. VapourSynth is modern and powerful and generally has rich metadata support, so a source filter could provide this info so that it can be accessed through the VapourSynth API. Maybe it's already supported, or it could be requested from ffms2, L-SMASH and DGDecNV. But reading it from the index file would be significantly faster, since it would not require requesting all frames; maybe the index format isn't so complex.
From my experience: don't try to split and merge open-GOP HEVC streams, it will produce bad results.
Yeah, especially in stream copy. In that respect, an I-frame list or scene-detected frame (timecode) list alone may raise an issue for stream-copy cutting with open-GOP stream structures like HEVC.
Since chunk encoding also involves stream-copy cutting (either by the encoder itself at frame indexes, or via `mkvextract`/`ffmpeg` for timecode-based cutting), it may raise an issue in the same vein.
So at this point, another issue comes up: can we extract only the IDR frames that also have good `scene` values? To do that, maybe we need to include another criterion that identifies whether a given frame is an IDR frame or not. Food for thought.
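One possible way to combine the two criteria, as a sketch: given a list of IDR frame numbers (however obtained) and the scene-change frames from the detection pass, snap each scene cut back to the nearest IDR frame at or before it, so a stream-copy cut never lands inside an open GOP. All frame numbers below are hypothetical, for illustration only:

```python
import bisect

def snap_to_idr(scene_frames, idr_frames):
    """For each scene-change frame, pick the nearest IDR frame at or before it.
    Both lists are sorted frame numbers; duplicates are collapsed."""
    cuts = []
    for f in scene_frames:
        i = bisect.bisect_right(idr_frames, f) - 1
        if i >= 0:
            cut = idr_frames[i]
            if not cuts or cut != cuts[-1]:  # avoid two cuts at the same IDR
                cuts.append(cut)
    return cuts

# hypothetical data: scene changes at 230/410/415/900, IDRs every 240 frames
print(snap_to_idr([230, 410, 415, 900], [0, 240, 480, 720, 960]))  # [0, 240, 720]
```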
On second thought, frame-index cutting by the encoder may not be a problem. Since the encoder receives decoded frames served by the frameserver via an `avs`/`vpy` script, it's not stream-copy cutting. OTOH, cutting by `mkvextract` or `ffmpeg` does not involve any prior decoding, so it basically is stream-copy cutting. Therefore, it seems that timecode-based cutting for chunk encoding raises another issue in this regard. Hmm...
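For the `ffmpeg` side of that timecode-based cutting, the command assembly could look like the sketch below. This only builds the command strings, it does not run them; the exact `-ss`/`-to` placement semantics vary by ffmpeg version (here both are assumed to be input options, as in recent ffmpeg), and with `-c copy` the cut points still need to fall on IDR frames to be clean:

```python
def ffmpeg_cut_commands(input_file, timecodes):
    """Build one ffmpeg stream-copy command per chunk from a sorted list of
    interior cut timecodes (HH:MM:SS.nnn); file start and end are implied."""
    bounds = [None] + list(timecodes) + [None]
    cmds = []
    for n, (start, end) in enumerate(zip(bounds, bounds[1:])):
        cmd = ["ffmpeg", "-hide_banner"]
        if start:
            cmd += ["-ss", start]  # input-side seek to the chunk start
        if end:
            cmd += ["-to", end]    # stop reading at the next cut point
        cmd += ["-i", input_file, "-c", "copy", f"chunk{n:03d}.mkv"]
        cmds.append(" ".join(cmd))
    return cmds

for c in ffmpeg_cut_commands("INPUT.mkv", ["00:00:09.510", "00:00:15.016"]):
    print(c)
```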
Any news on this?
Presently I use a roundabout way of chunking at scene changes.
I do wonder if this could be processed automatically?
By splitting the frames evenly between chunks, they will start/end in the middle of a scene, lowering compression efficiency and/or quality. I propose adding functionality to detect scene changes and split the chunks based on that. See here or here on how to do this.
For aomenc, the first-pass stats file could be parsed to get the keyframes with 100% accuracy, thus in theory improving quality and parallelism at the same time (by not using multithreading options and encoding in chunks instead). Av1an does that.
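To make the contrast concrete, here is a small sketch of the two splitting strategies: the current even division by frame count versus boundaries taken from detected scene changes. The frame numbers are hypothetical:

```python
def even_chunks(total_frames, n_chunks):
    """Evenly divided frame ranges (half-open [start, end)), ignoring scenes."""
    size = total_frames // n_chunks
    bounds = [i * size for i in range(n_chunks)] + [total_frames]
    return list(zip(bounds, bounds[1:]))

def scene_chunks(total_frames, scene_frames):
    """Frame ranges whose boundaries sit on detected scene changes instead."""
    bounds = [0] + [f for f in scene_frames if 0 < f < total_frames] + [total_frames]
    return list(zip(bounds, bounds[1:]))

print(even_chunks(1000, 4))                 # boundaries can land mid-scene
print(scene_chunks(1000, [238, 517, 761]))  # boundaries follow the content
```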