gyroflow / gyroflow

Video stabilization using gyroscope data
https://gyroflow.xyz
GNU General Public License v3.0
6.17k stars 261 forks

Better scale quality? #780

Open temp-64GTX opened 4 months ago

temp-64GTX commented 4 months ago

Is there an existing feature request for this?

Description

Good evening. Me again.

Is it possible to implement a better scale quality option? Typical case: I have a 4K video, open it in Gyroflow, apply stabilization, and export to a 720p ProRes proxy for fast editing. And this 720p video looks very "aliased", like in games, you know, when you turn AA completely off. For example, this picture: a 4K video. The upper one was exported from Gyroflow at 720p. The lower one was exported from Gyroflow at native 4K and downscaled in the video editor with its standard scale processing ("bilinear" or something). So we can clearly see the difference. The post (the tall, thin white metal thing) in the center: the upper one has these typical "stairs-edges". The same "stairs-edges" can be seen on the wires and on the vertical lines of the beige building on the right. Meanwhile, on the lower picture all these lines are softer and smoother.

scale

And in dynamic video it is much more visible than in a static picture. So I think a better scale quality option would be great, at least for exporting videos. I guess bilinear would work fine. Also, I use "Lanczos" in XnView for scaling pictures, and it gives good quality.

AdrianEddy commented 4 months ago

The export scaling in Gyroflow is Lanczos4 already. The preview in Gyroflow is bilinear. Bicubic is also implemented, but there's no option to select it from the UI. I can add the selector, but I feel like there's something other than Gyroflow's scaling going on in your example.

temp-64GTX commented 4 months ago

Here is a video comparison. YouTube eats video quality, but the difference is still visible.

https://www.youtube.com/watch?v=myNK5hb4Wek

AdrianEddy commented 4 months ago

Added the selector in 7b48ed3, in Export settings -> Advanced

temp-64GTX commented 4 months ago

The selector is great, but nothing has changed :( All 3 methods give almost the same results, with "stairs-edges" :(

AdrianEddy commented 4 months ago

Yeah, that's why I think there's something else going on. Please do more testing.

temp-64GTX commented 4 months ago

Short Results:

All the different export codecs (PNG, ProRes, H.265, etc.) at 720p: stairs-edges.
4K & FOV 4: stairs-edges.
4K & FOV 0.25: smooth edges.
DaVinci Resolve: crashed three times during export. Then I just removed it to hell, where it came from. "Use DaVinci", they said. "It's fast and simple", they said.

Screenshot

temp-64GTX commented 4 months ago

OK, I tried Fusion Studio. It's more or less working, but I can't understand how to tell Gyroflow that it must do the downscale, because right now Fusion is doing it.

Screenshot

AdrianEddy commented 4 months ago

In the Gyroflow app, set the export size to 720p and save that in the project. Then in Fusion, go to the Gyroflow plugin settings and check "Use plugin RoD for output size".

temp-64GTX commented 4 months ago

Yep. It also gives the stairs-edges.

AdrianEddy commented 4 months ago

Can you send me the sample file?

temp-64GTX commented 4 months ago

I hope the link works. In the video, this effect is very noticeable at the beginning: the bridge lines and the fence lines. https://drive.google.com/file/d/1QzovqTtn9jC0AiDLUIeQSxdAH_ByFXEs/view?usp=sharing

temp-64GTX commented 3 months ago

Meanwhile, I've tested the mobile version of Gyroflow, to rule out any influence of my particular PC hardware on the result. (The result is the same.) I've put the original GoPro file and all 3 output files at the link above.

hawku commented 3 weeks ago

I have the same problem with GoPro videos. It looks like no filters are used for scaling. Scaling from 4K to 8K looks good, but from 4K to 720p it is completely unusable. I have to export 8K and scale that down with ffmpeg to get decent output from Gyroflow.

8k export and scaled to 720p: gyroflow8k720p

720p export: gyroflow720p

AdrianEddy commented 3 weeks ago

I still can't reproduce this issue; the 720p exports look perfectly fine for me. And of course there are filters used when exporting; the high-quality Lanczos4 is the default.

Are you using a high enough FOV? Examples from the footage provided by @temp-64GTX, with FOV set to 3:

8k export to 720p: 8k720p

720p export: gyroflow720p

AdrianEddy commented 3 weeks ago

Ok, I see it. Since our scaling code is ported from OpenCV, we've inherited a scaling algorithm that is not entirely correct.

We should look into changing our scaling code to be based on Pillow's: https://github.dev/zurutech/pillow-resize/blob/main/src/PillowResize/PillowResize.cc They are pretty similar at first glance, so it looks like we need to dig deeper and identify exactly what's different. Not trivial, but also not terrible.
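
For reference, the key behavioural difference that turns out to matter later in this thread is that Pillow stretches the filter's support by the scale factor when downscaling, so every source pixel contributes, instead of sampling a fixed-size neighbourhood and skipping pixels. A minimal, illustrative Rust sketch of that 1-D weight computation (names and structure are simplified for the sketch, not code from either library):

// Lanczos3 kernel: sinc windowed by sinc, support of 3 source pixels.
fn lanczos3(x: f32) -> f32 {
    if x == 0.0 {
        return 1.0;
    }
    if x.abs() >= 3.0 {
        return 0.0;
    }
    let sinc = |t: f32| t.sin() / t;
    let px = std::f32::consts::PI * x;
    sinc(px) * sinc(px / 3.0)
}

// Pillow-style 1-D weights for one sample centered at `center`
// (in source-pixel coordinates). `scale` = source_size / dest_size.
// When downscaling (scale > 1) the kernel is stretched so it averages
// over `scale` source pixels instead of skipping them (which aliases).
fn pillow_weights(center: f32, scale: f32, src_len: usize) -> (usize, Vec<f32>) {
    let filter_scale = scale.max(1.0); // never shrink the kernel when upscaling
    let support = 3.0 * filter_scale;  // Lanczos3 support, widened for downscaling
    let lo = (center - support).floor().max(0.0) as usize;
    let hi = ((center + support).ceil().max(0.0) as usize).min(src_len);
    let mut w: Vec<f32> = (lo..hi)
        .map(|i| lanczos3((i as f32 + 0.5 - center) / filter_scale))
        .collect();
    let sum: f32 = w.iter().sum();
    if sum != 0.0 {
        for v in w.iter_mut() {
            *v /= sum; // normalize so the weights sum to 1
        }
    }
    (lo, w)
}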

AdrianEddy commented 3 weeks ago

/bounty $300

algora-pbc[bot] commented 3 weeks ago

💎 $300 bounty • Gyroflow

Steps to solve:

  1. Start working: Comment /attempt #780 with your implementation plan
  2. Submit work: Create a pull request including /claim #780 in the PR body to claim the bounty
  3. Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts


Thank you for contributing to gyroflow/gyroflow!


AdrianEddy commented 2 weeks ago

I've made some progress: I re-implemented Pillow's algorithm in Rust as a minimal example, comparing OpenCV's implementation (the code currently in Gyroflow) and Pillow's implementation.

Pillow's one resizes images better. However, the way it's structured, it calculates coefficients up front for resizing from input to output. This is a problem, because in Gyroflow we need to feed input coordinates to the resampling function, because they will be rotated (stabilized), so it's different from simple resizing (where coordinates map in a simple linear way from input to output).

Example project: resizing.zip

There's pillow::sample_at_output, which uses precomputed coefficients, but we can't use it, as our source image coordinates can't be precomputed (because they are calculated in the GPU kernel per pixel).

I've implemented pillow::sample_input_at, but this one doesn't use any precomputed coefficients and instead calls the resampling function right there for every sampled pixel, which will be slow.

I'm reducing the bounty to $150 since most of the work is done; we just need to figure out a way to precompute the coefficients for that use case. The tricky thing is that it depends on the scale (which ideally should be input_size/output_size * fov, but it might be enough to use input_size/output_size, which would then be static for the whole shader).
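
To spell out those two options, a hypothetical Rust sketch (the function and parameter names are illustrative only, not Gyroflow's actual API):

fn resampling_scale(input_width: u32, output_width: u32, fov: f32) -> (f32, f32) {
    // Option 1: static for the whole shader; can be computed once per export.
    let scale_static = input_width as f32 / output_width as f32;
    // Option 2: also accounts for the per-frame fov (dynamic zoom), which is
    // what makes precomputing the kernel coefficients per frame expensive.
    let scale_with_fov = scale_static * fov;
    (scale_static, scale_with_fov)
}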

ElvinC commented 2 weeks ago

Not sure if this would work, but what if a few pixels (say four in a square) at the center of the output image are sampled to get the input pixel coordinates (for determining the scaling factor), which is then used to precompute the coefficients for the whole image? Or is that also too slow? There are some edge cases though, particularly with varying scale when the frame is very tilted, resulting in different pixel scales across the image.
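
A rough sketch of that idea, assuming a hypothetical warp_to_input closure that maps an output pixel to its stabilized source coordinate (not Gyroflow's actual API):

// Estimate one scale factor for the whole frame by sampling the warp at the
// output-image center and at its immediate neighbours, then measuring how far
// apart the corresponding input coordinates land.
fn estimate_center_scale(
    warp_to_input: impl Fn(f32, f32) -> (f32, f32),
    out_w: f32,
    out_h: f32,
) -> f32 {
    let (cx, cy) = (out_w * 0.5, out_h * 0.5);
    let (u0, v0) = warp_to_input(cx, cy);
    let (u1, v1) = warp_to_input(cx + 1.0, cy); // one output pixel to the right
    let (u2, v2) = warp_to_input(cx, cy + 1.0); // one output pixel down
    let sx = ((u1 - u0).powi(2) + (v1 - v0).powi(2)).sqrt();
    let sy = ((u2 - u0).powi(2) + (v2 - v0).powi(2)).sqrt();
    // Input pixels covered per output pixel; > 1 means downscaling at the center.
    0.5 * (sx + sy)
}

As noted, a single value like this ignores how the scale varies across a tilted frame, so it can only serve as a whole-frame approximation.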

AdrianEddy commented 2 weeks ago

I implemented it without precomputed coefficients in order to test, and the rendered video is now much nicer; check out the files here: https://drive.google.com/drive/folders/1brliOo0b4RLHOKbhraUBvIyRtsMIS-uj?usp=sharing

However, this only improves things when downscaling (a massive improvement); for regular videos (where there is mostly slight upscaling - see __GH011230_stabilized.mp4) I don't see any difference, and I've been pixel-peeping pretty hard.

This kinda makes sense, because the main difference between these implementations is that the sampling area is scaled up (when resizing down), but it's never scaled down (when resizing up), so the case where we upscale the video should be pretty much equivalent between the current implementation and this one.
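
As a concrete example (using the Lanczos3 support of 3 from the shader below): going from a 3840 px wide source to a 1280 px wide export is a scale factor of 3, so a kernel whose support is widened from ±3 to ±9 source pixels averages roughly an 18×18 neighbourhood per output pixel instead of 6×6. When upscaling, the scale is clamped to 1, so both implementations sample the same 6×6 neighbourhood, which is why they look identical there.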

This implementation is 2-3x slower to render

> Not sure if this would work, but what if a few pixels (say four in a square) at the center of the output image are sampled to get the input pixel coordinates (for determining the scaling factor)

This is a good idea, but I think this new implementation only makes sense if we use the fov (which changes per frame because of dynamic zoom), and precomputing coeffs for every frame might be too much memory (plus transferring them to the GPU). It would have to be benchmarked, though.

temp-64GTX commented 2 weeks ago

Well, the stairs are gone. However, the new one is kind of blurry. Screenshot (left: After Effects downscale, right: the new file downloaded from your Google Drive)

AdrianEddy commented 2 weeks ago

Indeed, hmm. Maybe this really needs to be a 2-stage process: first stabilize, then resize. Doing it in one step sounds good on paper, but apparently has its limitations.

AdrianEddy commented 2 weeks ago

For reference, here's the wgsl implementation if anyone wants to play with it

fn bilinear_filter(x_: f32) -> f32 { let x = abs(x_); if x < 1.0 { return 1.0 - x; } else { return 0.0; } }
fn hamming_filter(x_: f32) -> f32 { var x = abs(x_); if x == 0.0 { return 1.0; } else if x >= 1.0 { return 0.0; } else { x = x * 3.14159265359; return (sin(x) / x) * (0.54 + 0.46 * cos(x)); } }
fn bicubic_filter(x_: f32) -> f32 { let x = abs(x_); let A: f32 = -0.5; if x < 1.0 { return ((A + 2.0) * x - (A + 3.0)) * x * x + 1.0; } else if x < 2.0 { return (((x - 5.0) * x + 8.0) * x - 4.0) * A; } else { return 0.0; } }
fn sinc_filter(x: f32) -> f32 { if x == 0.0 { return 1.0; } else { let xx = x * 3.14159265359; return sin(xx) / xx; } }
fn lanczos_filter(x: f32) -> f32 { if x >= -3.0 && x < 3.0 { return sinc_filter(x) * sinc_filter(x / 3.0); } else { return 0.0; } }

fn sample_input_at2(uv_param: vec2<f32>) -> vec4<f32> {
    let filter_support = 3.0;
    let scale = min(params.fov, 10.0);
    let filter_scale = max(scale, 1.0);
    let support = filter_support * filter_scale;
    let ss = 1.0 / filter_scale;
    var kx = array<f32, 64>();
    var ky = array<f32, 64>();

    let fix_range = bool(flags & 1);

    let bg = params.background * params.max_pixel_value;
    var sum = vec4<f32>(0.0);

    var uv = uv_param;
    if (params.input_rotation != 0.0) {
        uv = rotate_point(uv, params.input_rotation * (3.14159265359 / 180.0), vec2<f32>(f32(params.width) / 2.0, f32(params.height) / 2.0));
    }

    if (bool(flags & 32)) { // Uses source rect
        uv = vec2<f32>(
            map_coord(uv.x, 0.0, f32(params.width),  f32(params.source_rect.x), f32(params.source_rect.x + params.source_rect.z)),
            map_coord(uv.y, 0.0, f32(params.height), f32(params.source_rect.y), f32(params.source_rect.y + params.source_rect.w))
        );
    }

    ////////////////////////////////
        let xcenter = uv.x + 0.5 * scale;
        let xmin = i32(floor(max(xcenter - support, 0.0)));
        let xmax = max(i32(ceil(min(xcenter + support, f32(params.width)))) - xmin, 0);
        var xw = 0.0;
        for (var x: i32 = 0; x < xmax; x = x + 1) {
            let f: f32 = (f32(x) + f32(xmin) - xcenter + 0.5) * ss;
            kx[x] = lanczos_filter(f);
            xw += kx[x];
        }
        if (xw != 0.0) { for (var x: i32 = 0; x < xmax; x = x + 1) { kx[x] /= xw; } }
    ////////////////////////////////
        let ycenter = uv.y + 0.5 * scale;
        let ymin = i32(floor(max(ycenter - support, 0.0)));
        let ymax = max(i32(ceil(min(ycenter + support, f32(params.height)))) - ymin, 0);
        var yw = 0.0;
        for (var y: i32 = 0; y < ymax; y = y + 1) {
            let f: f32 = (f32(y) + f32(ymin) - ycenter + 0.5) * ss;
            ky[y] = lanczos_filter(f);
            yw += ky[y];
        }
        if (yw != 0.0) { for (var y: i32 = 0; y < ymax; y = y + 1) { ky[y] /= yw; } }
    ////////////////////////////////

    let sx = xmin;
    let sy = ymin;

    for (var yp: i32 = 0; yp < ymax; yp = yp + 1) {
        if (sy + yp >= params.source_rect.y && sy + yp < params.source_rect.y + params.source_rect.w) {
            var xsum = vec4<f32>(0.0, 0.0, 0.0, 0.0);
            for (var xp: i32 = 0; xp < xmax; xp = xp + 1) {
                var pixel: vec4<f32>;
                if (sx + xp >= params.source_rect.x && sx + xp < params.source_rect.x + params.source_rect.z) {
                    pixel = read_input_at(vec2<i32>(sx + xp, sy + yp));
                    pixel = draw_pixel(pixel, u32(sx + xp), u32(sy + yp), true);
                    if (fix_range) {
                        pixel = remap_colorrange(pixel, bytes_per_pixel == 1);
                    }
                } else {
                    pixel = bg;
                }
                xsum = xsum + (pixel * kx[xp]);
            }
            sum = sum + xsum * ky[yp];
        } else {
            sum = sum + bg * ky[yp];
        }
    }
    return vec4<f32>(
        min(sum.x, params.pixel_value_limit),
        min(sum.y, params.pixel_value_limit),
        min(sum.z, params.pixel_value_limit),
        min(sum.w, params.pixel_value_limit)
    );
}

Replaces sample_input_at

VladimirP1 commented 1 week ago

I tried implementing the same algorithm as is used in ImageMagick's distort operator: elliptical weighted average (EWA) with cubic BC filtering. This is just a test to evaluate performance; it does not do any real distortion, it only applies a given affine transformation: https://github.com/VladimirP1/gpu-warp. To use this in Gyroflow we'd have to calculate affine approximations of the transformation at each pixel of the undistorted image, which is not much harder than computing the transformation itself.

A test transformation (4000x3000 -> downscaled by 2.2 onto a 1920x1080 canvas, plus 0.1 rad of rotation): https://drive.google.com/drive/folders/1jHVp6L73TESmmYO1VOXEmn1dAzxfqL-H?usp=sharing This exact transformation takes 54 ms on a UHD 630, 20 ms on a GTX 1070, and 128 ms on an i9-9900K with CPU PoCL.

VladimirP1 commented 1 week ago

And it seems that there are some bugs left in my implementation

VladimirP1 commented 1 week ago

I seem to have fixed most of the bugs and tried some real warping with my code.

This shows that both upscaling (in the left part of the image) and downscaling (in the right) work: out

This is just some fisheye-like distortion: out

temp-64GTX commented 1 week ago

> tried some real warping

Hmm, looks interesting. Would it be possible to implement some custom warping? Like, you know, lens distortion: https://github.com/gyroflow/gyroflow/issues/355

AdrianEddy commented 1 week ago

> tried some real warping
>
> Hmm, looks interesting. Would it be possible to implement some custom warping? Like, you know, lens distortion: #355

that's not related, it's a different issue

VladimirP1 commented 1 week ago

Progress so far:
EWA, cubic BC filtering: https://youtu.be/egGW8EQafhc
Current filtering in Gyroflow (Lanczos4): https://youtu.be/KNUOr-IasBg

I am using numeric differentiation now (so basically running the distort transformation three times per output pixel instead of once). It is possible to do it in one step, but that would require adding Jacobian calculation to the lens models.
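
For illustration, a minimal Rust sketch of that numeric differentiation, with a hypothetical distort closure standing in for the lens/stabilization transform; the two difference vectors are the columns of the local Jacobian, i.e. the affine approximation that the EWA filter needs:

// Local affine approximation of the warp at output pixel (x, y), obtained by
// evaluating the transform three times (forward differences). Returns the
// input-space position plus the Jacobian columns d(input)/dx and d(input)/dy.
fn local_affine(
    distort: impl Fn(f32, f32) -> (f32, f32),
    x: f32,
    y: f32,
) -> ((f32, f32), (f32, f32), (f32, f32)) {
    let p = distort(x, y);
    let px = distort(x + 1.0, y);
    let py = distort(x, y + 1.0);
    let du_dx = (px.0 - p.0, px.1 - p.1); // input-space shift per output pixel in x
    let du_dy = (py.0 - p.0, py.1 - p.1); // input-space shift per output pixel in y
    (p, du_dx, du_dy)
}

Doing it "in one step" would mean each lens model exposing this Jacobian analytically instead of the transform being evaluated three times per output pixel.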