[WIP][RFC] Ordered-chapters walkthrough/guide

OrangeChannel commented 4 years ago

This is still unfinished, but figured might as well post this here for more visibility for people not in the Discord.

TODO:

[ ] completely re-write the whole thing with the libbluray advice from thebombzen :disappointed:
[ ] fix up the TODOs in the md
[ ] explain editions
[ ] use editions to swap between OP/ED and NCOP/NCED
[ ] explain how to create a playall file

TODO in actual script:

[x] Allow handling of both audio and video while trimming/splicing
[x] Write chapter timestamps to valid XML
[ ] Find a better way to support VFR input clips (probably should require timecodes file with fallback on acsuite's implementation)
[ ] Re-write tests to cover all new functionality
[ ] Check that multiple OP/ED work as expected (should but just incase because this isn't in the current tests)

Blockers:

[ ] mpv # 7963 about playall files not loading attachments properly
[ ] vapoursynth audio support branch for new frames method for VFR clips and possibly some audio stuff

OrangeChannel commented 4 years ago

Did an initial review that took me much more time than I hoped to spend on it.

I generally miss a short paragraph or at least a TL;DR on why one would even want to use Ordered Chapters. It is very important to provide context to decisions and not just explain how to apply them.

@FichteFoll

Ordered-chapters are a special feature of mkv files that allows the player to reference other mkv files in the same folder to create a virtual combined timeline.

The example use cases we will be looking at starts with the ability to cut out the OP/ED from each episode in a series (assuming the OP/ED stays consistent throughout the show). This comes with at least four benefits that are unachievable using normal chapters:

We avoid having to encode the same OP/ED sequence for each episode saving us encoding time.
We can use different audio/video settings for the OP/EDs than for the main episodes.
- Using FLAC for the music and a more compressible lossy codec for the main episode audio tracks will save space and satisfy audiophiles that want lossless music.
- Using a much lower CRF value for the OP/ED will still likely save space as we are only encoding each once.
We can create a 'playall' file that creates a virtual timeline of all of the main episodes, skipping previews and only playing the OP/ED once over the entire timeline.
Using editions, we can allow the viewer to switch between the no-credit (NC) versions of the OP/ED and the normal ones.

You think this is too long?

OrangeChannel commented 4 years ago

@FichteFoll Regarding noteboxes, I can't find a plugin so I made a new class with some basic CSS (the dashed border is because I can't decide on colors for each theme right now). The last (white theme) pic shows 100% width forced. In theory, it would be slightly different color than the background and there'd only be the border on the left. It uses the font-awesome font that we seem to already be loading on the website for other things.

EDIT: here's a rough mobile design

OrangeChannel commented 4 years ago

Just for reference:

.infobox {
    display: inline-flex;
    border-left: .4em solid;
    border-top: .1em dashed;
    border-bottom: .1em dashed;
    border-bottom-left-radius: .2em;
    border-top-left-radius: .2em;
    border-bottom-right-radius: .2em;
    border-top-right-radius: .2em;
    border-color: inherit;
    opacity: 80%;
    border-right: .1em dashed;
    padding-left: 0;
    padding-top: .5em;
    padding-bottom: .5em;
    padding-right: .5em;
    margin-top: -.25em;
    margin-bottom: -.25em;
    background: 0 0;
    width: 100%
}

.infobox p {
    padding-top: .5em;
    font-size: .85em;
    font-style: italic;
    margin-bottom: 0;
    padding-bottom: .5em
}

.infobox::before {
    content: "";
    display: inline-flex;
    font: 14px/1 FontAwesome;
    font-size: 1.75em;
    margin: 0 .5em 0 .8em;
    align-self: center
}

<div class="infobox"><p>
If you are following along with the next few steps,
you don't need to extract the audio just yet.
The audio-cutting script will apply this FFmpeg command to the files
automatically.
</p></div>

FichteFoll commented 4 years ago

You think this is too long?

No, I think that's great.

Info box also looks good, except for entire text being in italics and noticeably smaller font size. I would definitely prefer using the normal font size or at least larger than that and I'm not a fan of using italics like this because it makes the strokes thin and harder to read with no added benefit. We already visually mark the entire box, so we wouldn't need to do it for the text as well.

For the sake of reviweing the PR and general workflow, please don't force-push already reviewed commits, so I can focus on the changes after than and don't have to rely on Github providing direct comparisons with the lost head.

I'll review the changes later.

OrangeChannel commented 4 years ago

You might've missed the conversation I had in Discord with thebombzen but pretty much this is getting re-re-written from scratch (it was in the checklist at the top of the PR before you reviewed) so a lot of the stuff you reviewed is probably going to become irrelevant as I continue working on this. But if it makes it cleaer I'll just commit on top of the previous for now on.

Edit: I'm going to send the notebox changes in another CSS-focused PR once I fix it up. Will mention its formatting in the CONTRIBUTING doc.

FichteFoll commented 4 years ago

All right, I'll hold off until you give me a clear sign it's ready for re-review.

averms commented 4 years ago

Please don't tell people how to use ordered chapters :(

stuxcrystal commented 4 years ago

What's the status here?

OrangeChannel commented 4 years ago

Have to clean up the code of the ordered-chapters automation script a bit before I get to this, but I should probably be able to get around to this next week. Kind of forgot about this after I became too lazy to re-write with the BDMV info, and then got distracted by a million other things :laughing: .

OrangeChannel commented 3 years ago

So, the script (https://github.com/OrangeChannel/ocsuite) can now handle

Trimming + splicing the audio for all episodes and repeated chapters
Trimming + splicing the source clip for all episodes and repeated chapters
Generating SUIDs for the repeated chapters
Writing the chapters to valid Mastroka XMLs including adjusted timestamps and references to external segments using the generated SUIDs

so I can at least start on the re-write of Fluff's blog post now.

However, https://github.com/mpv-player/mpv/issues/7963 is a thing that petzku brought up yesterday and I'll be waiting to see if this is a bug or not or if something should be done about it before updating the Caveats, Assorted Traps and Various Interesting Things to Note section since some of this stuff is outdated or doesn't apply to mpv.

EDIT: All of this stuff is limited to constant-frame-rate sources though sadly, the logic behind converting frame numbers to timestamps isn't difficult for VFR, but doing so while cutting out sections (repeated/ordered chapters) is not feasible really.

Lypheo commented 3 years ago

EDIT: All of this stuff is limited to constant-frame-rate sources though sadly, the logic behind converting frame numbers to timestamps isn't difficult for VFR, but doing so while cutting out sections (repeated/ordered chapters) is not feasible really.

Why not, exactly? Couldn’t you just extract the timestamps and look them up by frame number instead of calculating them?

OrangeChannel commented 3 years ago

Actually, since the script now returns the trimmed/spliced clips, vfr chapter timestamps are possible with vspipe's timecodes files, but is there a way to do this without creating a temporary script for each clip and then reading from a timecode file? I can technically do this in a really messy way with .get_frame(n)['_DurationNum'] without breaking out of the script but this would probably be pretty complicated/ugly. @Lypheo is there a better way to get timestamps from a vfr vs clip?

Lypheo commented 3 years ago

Not sure exactly how ocsuite operates but maybe sth like this?

src = core.ffms2.Source(r"anime_tiddies.mkv", timecodes="tc.txt")
tc = [int(float(x)) for x in open("tc.txt", "r").read().splitlines()[1:]]

def fu(n,f):
    fout = f.copy()
    fout.props["TimeStamp"] = tc[n]
    return fout

o = core.std.ModifyFrame(src, src, fu)

OrangeChannel commented 3 years ago

I've implemented a pure Python solution in acsuite for now

def f2ts(f: int, /, *, precision: int = 3, src_clip: vs.VideoNode) -> str:
    """Converts frame number to a timestamp based on framerate.
    Can handle variable-frame-rate clips as well, using similar methods to that of vspipe --timecodes.
    Meant to be called as a functools.partial with 'src_clip' specified before-hand.
    """
    if precision not in [0, 3, 6, 9]:
        raise ValueError(f"f2ts: the precision {precision} must be a multiple of 3 (including 0)")
    if src_clip.fps != fractions.Fraction(0, 1):
        t = round(10 ** 9 * f * src_clip.fps ** -1)
        s = t / 10 ** 9
    else:
        s = clip_to_timecodes(src_clip)[f]

    m = s // 60
    s %= 60
    h = m // 60
    m %= 60

    if precision == 0:
        return f'{h:02.0f}:{m:02.0f}:{round(s):02}'
    elif precision == 3:
        return f'{h:02.0f}:{m:02.0f}:{s:06.3f}'
    elif precision == 6:
        return f'{h:02.0f}:{m:02.0f}:{s:09.6f}'
    elif precision == 9:
        return f'{h:02.0f}:{m:02.0f}:{s:012.9f}'

@functools.lru_cache
def clip_to_timecodes(src_clip: vs.VideoNode) -> collections.deque:
    """Cached function to return a list of timecodes for vfr clips."""
    timecodes = collections.deque([0.0], maxlen=src_clip.num_frames + 1)
    curr_time = fractions.Fraction()
    for frame in src_clip.frames():
        curr_time += fractions.Fraction(frame.props['_DurationNum'], frame.props['_DurationDen'])
        timecodes.append(float(curr_time))
    return timecodes

as this should be used only on source filter inputs, the speed is really limited by the current implementation of VideoNode.frames() which according to stux, will be much faster once the pending frames PR gets merged.

The speed is not really great (~50 seconds to get 100 timestamps from a 1 million long frame vfr clip created by core.std.BlankClip) and deque only seems to improve performance by around 2% at most, but this will hopefully be a lot faster when the new frames generator is implemented.

Lypheo commented 3 years ago

But why though 🤔 . With ffms2's timecodes, you incur no speed penalty at all, and afaict it should be identical in funtionality.

OrangeChannel commented 3 years ago

But why though :thinking: . With ffms2's timecodes, you incur no speed penalty at all, and afaict it should be identical in funtionality.

For the audio cutting script this is actually part of: Hmmm, this does require the vfr clip to be directly read from a file though, so wouldn't work for clips that somehow become vfr via vapoursynth filtering, but thinking about it... this shouldn't really ever be a case since this is for cutting audio which will likely be synced with the source anyways.

Also, the way to implement this would require the user to supply a timecodes file that they would need to generate (assuming they have ffms2 installed and loaded), as the current functionality takes in a VideoNode as input, not a filename I could open with ffms2.Source(). I could make it so a provided timecodes file will take precedence over the clip_to_timecodes function, only using it as a fallback if only a clip is given. For the ordered chapters script, I could also just make the audio and write_to_xml methods take a timecodes filename as a argument, but I don't know if I want to force the user to re-generate an index + a timecodes file if they've used a different source filter to get the vfr clip in the first place.

For the chapter timestamp part of the ordered chapters creation script: Now, on to the real reason using existing frame props here in acsuite is actually a decent solution for the ordered chapters script: Once I have the vfr clips split along chapter boundaries, I stitch them back together using std.Splice, allowing me to just grab frame numbers and timestamps from the already adjusted clips, without having to do the math to adjust frame timestamps to begin at zero and be continuous across cut-out repeated chapters (although technically this wouldn't be that difficult, I think).

OrangeChannel commented 3 years ago

Just did some testing with an actual mkv clip (34k frames loaded via ffms2), and it is unusably slow :disappointed: , also noticed that ffms2 timecodes round to nearest ms, which is not ideal for chapter timings but I'm not sure the precision really matters at all at that level. Assuming the frames PR even gives this a massive speed-up, it would have to be like 2 orders of magnitude faster than it is now for this to be usable on very long clips (i.e. merged clips from a BDMV). Also found out Splice complains when trying to add a BlankClip with a loaded clip with different frame rates (using the + operator) but not for two clips with differing framerates created by std.BlankClip :thinking: . Looks like using a timecodes file is going to be necessary for the ordered chapters script, so I'm going to start refactoring stuff now so the timestamp calculation doesn't just use clip.fps and the functools.reduce operation won't error out for differing framerate segments.

Lypheo commented 3 years ago

Also found out Splice complains when trying to add a BlankClip with a loaded clip with different frame rates (using the + operator) but not for two clips with differing framerates created by std.BlankClip 🤔

The loaded clip probably differed in format or dimensions. There's a bug in Splice that I never bothered to report which causes it to state fps mismatch as the error cause even though the format/res was the problem.

OrangeChannel commented 3 years ago

The loaded clip probably differed in format or dimensions. There's a bug in Splice that I never bothered to report which causes it to state fps mismatch as the error cause even though the format/res was the problem.

Ahh, that makes a lot more sense actually, still giving Splice a list of segments is actually cleaner looking than a functools.reduce(operator.add, segments) anyways so I went ahead and changed it.

I still can't think of a good way to use timecodes files to calculate the new frame times once the repeated chapters are cut out besides the method I'm using now, which turns out to be extremely slow. VFR clips are now technically supported in both the audio and ordered chapters scripts, though it will be very slow to write the chapters to xml if the clip is vfr.

Irrational-Encoding-Wizardry / guide.encode.moe

[WIP][RFC] Ordered-chapters walkthrough/guide #54

Using editions, we can allow the viewer to switch between the no-credit (NC) versions of the OP/ED and the normal ones.