Artoriuz / glsl-chroma-from-luma-prediction

CfL as a GLSL shader
MIT License

Cleaner gather/texture code #11

Closed by deus0ww 11 months ago

deus0ww commented 12 months ago

While converting another shader to gather, I rewrote the gather/texture code for CfL. It's much more concise, I think.

#ifdef HOOKED_gather
    vec2 pos = fp * HOOKED_pt;
    const ivec2 gatherOffsets[4] = {{ 0, 0}, { 2, 0}, { 0, 2}, { 2, 2}};
    vec4 chroma_quads[2][4];
    vec4 luma_quads[4];
    for (int i = 0; i < 4; i++) {
        chroma_quads[0][i] = HOOKED_mul * textureGatherOffset(HOOKED_raw, pos, gatherOffsets[i], 0);
        chroma_quads[1][i] = HOOKED_mul * textureGatherOffset(HOOKED_raw, pos, gatherOffsets[i], 1);
        luma_quads[i] = LUMA_LOWRES_gather(vec2((fp + gatherOffsets[i]) * HOOKED_pt), 0);
    }
    float luma_pixels[12] = {
        luma_quads[0].z, luma_quads[1].w,
        luma_quads[0].x, luma_quads[0].y,
        luma_quads[1].x, luma_quads[1].y,
        luma_quads[2].w, luma_quads[2].z,
        luma_quads[3].w, luma_quads[3].z,
        luma_quads[2].y, luma_quads[3].x};
    vec2 chroma_pixels[12] = {
        {chroma_quads[0][0].z, chroma_quads[1][0].z}, {chroma_quads[0][1].w, chroma_quads[1][1].w},
        {chroma_quads[0][0].x, chroma_quads[1][0].x}, {chroma_quads[0][0].y, chroma_quads[1][0].y},
        {chroma_quads[0][1].x, chroma_quads[1][1].x}, {chroma_quads[0][1].y, chroma_quads[1][1].y},
        {chroma_quads[0][2].w, chroma_quads[1][2].w}, {chroma_quads[0][2].z, chroma_quads[1][2].z},
        {chroma_quads[0][3].w, chroma_quads[1][3].w}, {chroma_quads[0][3].z, chroma_quads[1][3].z},
        {chroma_quads[0][2].y, chroma_quads[1][2].y}, {chroma_quads[0][3].x, chroma_quads[1][3].x}};
#else
    const vec2 texOffsets[12] = {
        { 0.5,-0.5}, { 1.5,-0.5}, {-0.5, 0.5}, { 0.5, 0.5}, { 1.5, 0.5}, { 2.5, 0.5},
        {-0.5, 1.5}, { 0.5, 1.5}, { 1.5, 1.5}, { 2.5, 1.5}, { 0.5, 2.5}, { 1.5, 2.5}};
    vec2 chroma_pixels[12];
    float luma_pixels[12];
    for (int i = 0; i < 12; i++) {
        chroma_pixels[i] = HOOKED_tex(vec2((fp + texOffsets[i]) * HOOKED_pt)).xy;
        luma_pixels[i] = LUMA_LOWRES_tex(vec2((fp + texOffsets[i]) * HOOKED_pt)).x;
    }
#endif
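    // Min/max over the four center taps (indices 3, 4, 7, 8).
    // The max sentinel must be a large negative value (-1e8), not the tiny positive 1e-8.
    vec2 chroma_min = min(min(min(min(vec2( 1e8), chroma_pixels[3]), chroma_pixels[4]), chroma_pixels[7]), chroma_pixels[8]);
    vec2 chroma_max = max(max(max(max(vec2(-1e8), chroma_pixels[3]), chroma_pixels[4]), chroma_pixels[7]), chroma_pixels[8]);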
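
The swizzles in the gather branch follow from the component order of `textureGather`, which returns the 2x2 footprint as w = (0,0), z = (1,0), x = (0,1), y = (1,1), counted in texels from the footprint's lowest-coordinate corner. A minimal sketch of that mapping is below. `quad_texel` is a hypothetical helper for illustration only and is not part of the shader, and the worked examples in the comments assume that `fp` is the integer index of the texel at offset (0.5, 0.5) in the non-gather branch, so that `fp * HOOKED_pt` lands on a texel corner.

```glsl
// Hypothetical helper (not in the shader): pick one texel out of a
// textureGather result by its position inside the 2x2 footprint.
// local = (0,0) is the texel with the lowest coordinates, (1,1) the highest.
float quad_texel(vec4 quad, ivec2 local) {
    // textureGather component order:
    //   w = (0,0)   z = (1,0)
    //   x = (0,1)   y = (1,1)
    if (local.y == 0) {
        return (local.x == 0) ? quad.w : quad.z;
    }
    return (local.x == 0) ? quad.x : quad.y;
}

// Assumption: a gather at fp * HOOKED_pt with offset (0,0) covers texels
// fp + (-1,-1) .. fp + (0,0). Under that assumption:
//   luma_quads[0].z == quad_texel(luma_quads[0], ivec2(1, 0))  -> texel fp + (0,-1), i.e. texOffsets[0] = ( 0.5,-0.5)
//   luma_quads[0].y == quad_texel(luma_quads[0], ivec2(1, 1))  -> texel fp + (0, 0), i.e. texOffsets[3] = ( 0.5, 0.5)
// With offset (2,2) the footprint covers fp + (1,1) .. fp + (2,2):
//   luma_quads[3].w == quad_texel(luma_quads[3], ivec2(0, 0))  -> texel fp + (1, 1), i.e. texOffsets[8] = ( 1.5, 1.5)
```

Walking all twelve entries this way suggests that both branches read the same taps in the same order, which is why the code after `#endif` can index `chroma_pixels` and `luma_pixels` identically in either case.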

deus0ww commented 11 months ago

I have rewritten most of the shader code... so I'll close this.

see here: https://github.com/deus0ww/mpv-conf/blob/master/shaders/bilateral/CfL_Prediction.glsl

The upscaling pass is about 15% faster.

Jules-A commented 11 months ago

Nice!! Thanks, I'll give it a shot :)

EDIT: Just tested it, and it's about 5% slower on my system (R5 3600, RX 6700 XT). Note that your branch is using the old downscaling; I tested with the version in master. For just the 12-tap pass: ~6156 vs 5804 (master).