cshum / imagor

Fast, secure image processing server and Go library, using libvips
Apache License 2.0

[Question] Benefit of `Result Storage` when running `imagor` behind a CDN #387

Closed by steve-marmalade 1 year ago

steve-marmalade commented 1 year ago

If I deploy imagor behind a CDN, is there any benefit of configuring Result Storage? My expectation is that subsequent requests for the same URL will have been cached by the CDN, and so the request won't even make it to the imagor service.

I am new to image serving infrastructure, so apologies for the basic question.

cshum commented 1 year ago

Result Storage is kinda the equivalent of caching, so yes, a CDN would be better suited for caching the processed image.

However, some people wanted to retrieve the processed image files, which is where Result Storage comes in handy beyond caching. So Result Storage is just one of the options, and the benefit depends on the use case.

steve-marmalade commented 1 year ago

Hi @cshum, thank you for the response, that all makes sense. Does writing to Result Storage impact latency for the first requester of an asset, or does it happen in the background?

kiithwarrior commented 1 year ago

Hope you don't mind me hijacking this question, but mine is along similar lines.

I'm considering moving away from thumbor and saw this project. The only question I have is whether you support Redis for the result storage, or whether you plan to add it in the future.

jvahldick commented 1 year ago

Hi @steve-marmalade

I would like to give my opinion regarding your first question. From my perspective, ResultStorage is quite useful, and I'm using imagor behind a CDN. Imagine a scenario where you have multi-layer processing for the images, e.g. an image with 5 watermarks.

So you end up with filters(watermark:...) nested inside filters(watermark:...), and at some point it fails. Unfortunately, I do not remember exactly all the issues I faced. I think one was related to not being able to have multiple parameters, or something like that. The other one was related to S3 limiting it (https://github.com/cshum/imagor/issues/153).

Another benefit I see in ResultStorage is that it looks for the images in the bucket, so we do not necessarily need quite long URLs. So, instead of watermark(https://full-url/image.jpg), you can have /image.jpg.

Regarding performance, I don't think it has any impact:
https://github.com/cshum/imagor/blob/c11b178d4544ed1cd2870fd33fe57a39840dc1a9/imagor.go#L423-L425C4
https://github.com/cshum/imagor/blob/c11b178d4544ed1cd2870fd33fe57a39840dc1a9/imagor.go#L444-L460

Did you try to run it without setting the ResultStorage? Unless it is required, I do not really think it has an impact there.


@kiithwarrior The CDN would do a better job IMHO. If for some reason you cannot use a CDN, I think one option is to proxy imagor; if the request is successful, you can work out the ResultStorage filename. I'm doing something similar.

PHP example:

private const PATH_REGEX =
        '\/*' .
        // meta
        '(meta\/)?' .
        // trim
        '(trim(:(top-left|bottom-right))?(:(\\d+))?\/)?' .
        // crop
        '(((0?\\.)?\\d+)x((0?\\.)?\\d+):(([0-1]?\\.)?\\d+)x(([0-1]?\\.)?\\d+)\/)?' .
        // fit-in
        '(fit-in\/)?' .
        // stretch
        '(stretch\/)?' .
        // dimensions
        '((\\-?)(\\d*)x(\\-?)(\\d*)\/)?' .
        // paddings
        '((\\d+)x(\\d+)(:(\\d+)x(\\d+))?\/)?' .
        // h_align
        '((left|right|center)\/)?' .
        // v_align
        '((top|bottom|middle)\/)?' .
        // smart
        '(smart\/)?' .
        // filters
        '(filters:(.+?\\))\/)?' .
        // image
        // '(.+)?'
        '(?P<imagePath>.+)?'
    ;

// We need to get the Imagor secret and create the suffix which is generated by imagor when creating images
// eg: /path/original.{suffix}.extension
$path = $uri->getPath();

$digest = sha1(ltrim($path, '/'));
$hash = substr($digest, 0, 20);

$endpoint = $this->concatenateUriParts(
    ...,
    ...,
    $this->createSuffixedImagePathHash($path, $hash),
);

function createSuffixedImagePathHash(string $path, string $hash): string
    {
        $imagePath = $this->getImagePathFromPath($path);

        $fileExtension = pathinfo($imagePath, PATHINFO_EXTENSION);
        if ($fileExtension === '') {
            return sprintf('%s.%s', $imagePath, $hash);
        }

        $dotIdx = strrpos($imagePath, sprintf('.%s', $fileExtension));
        $filenameWithoutExtension = substr($imagePath, 0, $dotIdx);

        // In case there is a format filter, we should look over it as the extension of the file changes
        $pattern = "/.*format\((.*?)\)/";
        preg_match($pattern, $path, $output);

        if (!empty($output[1]) && $output[1] !== '') {
            $fileExtension = $output[1];
        }

        return sprintf(
            '%s.%s.%s',
            $filenameWithoutExtension,
            $hash,
            $fileExtension
        );
    }

function getImagePathFromPath(string $path): string
    {
        preg_match('/' . self::PATH_REGEX . '/', $path, $matchedValues);
        if (key_exists('imagePath', $matchedValues) && $matchedValues['imagePath'] !== '') {
            return $matchedValues['imagePath'];
        }

        return $path;
    }
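
For readers who want to try the naming scheme outside of a class, here is a minimal standalone sketch of the same idea. It assumes, as the snippet above does, that the stored filename is the image path with the first 20 hex characters of the SHA-1 of the request path inserted before the extension, and that a format(...) filter overrides the extension. The function name suffixedResultPath and the example paths are hypothetical, and the image path is passed in directly rather than extracted with PATH_REGEX.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: derives the suffixed filename from a request path
// and the image path it refers to. Mirrors the logic of the snippet above;
// it is not taken from imagor itself.
function suffixedResultPath(string $requestPath, string $imagePath): string
{
    // First 20 hex characters of the SHA-1 of the path without leading slashes.
    $hash = substr(sha1(ltrim($requestPath, '/')), 0, 20);

    $extension = pathinfo($imagePath, PATHINFO_EXTENSION);
    if ($extension === '') {
        // No extension: append the hash at the end.
        return sprintf('%s.%s', $imagePath, $hash);
    }

    // Strip ".<extension>" from the end of the image path.
    $base = substr($imagePath, 0, -(strlen($extension) + 1));

    // A format(...) filter changes the extension of the stored file.
    if (preg_match('/format\(([^)]*)\)/', $requestPath, $m) === 1 && $m[1] !== '') {
        $extension = $m[1];
    }

    return sprintf('%s.%s.%s', $base, $hash, $extension);
}

// Prints photos/cat.<20 hex characters>.webp
echo suffixedResultPath(
    '/fit-in/200x200/filters:format(webp)/photos/cat.jpg',
    'photos/cat.jpg'
), PHP_EOL;
```

The hash covers the whole request path, so two different sets of filters applied to the same source image map to two different stored files.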

steve-marmalade commented 1 year ago

Thanks for the thoughtful response @jvahldick. We have been using RESULT_STORAGE and I agree that it is useful even when serving behind a CDN, for many of the reasons you've mentioned. I'll close this issue.