saucecontrol / PhotoSauce

MagicScaler: high-performance, high-quality image processing pipeline for .NET
http://photosauce.net/
MIT License

Can WebRSize work with remote URLs like Azure CDN? #37

Closed (by ghost, 3 years ago)

ghost commented 4 years ago

The configuration documentation says that an image folder is mandatory. Can we keep the disk cache folder local and configure WebRSize to use remote storage, such as Azure Blob Storage or a CDN, for image delivery?

If that is possible with the latest NuGet release, what would the configuration look like? Basically, what I want to achieve is something like this:

<img src="https://cdn.my.com/some/path/my-image?w=123" />

Thanks, Drazen

ghost commented 4 years ago

Additionally, it would be nice if I could use the same blob storage for the cached images.

saucecontrol commented 4 years ago

Using a remote file store is possible if you implement a VirtualPathProvider. There is a CachingAsyncVirtualPathProvider base class defined in WebRSize that takes care of most of the details. I don't have any samples, but the code is not difficult to follow.

You would override the IsPathCaptured method to designate a local path that is served from a remote location. You then provide overrides for FileExistsAsyncInternal and GetFileAsyncInternal so that WebRSize is able to check for the existence of a file and fetch it if necessary for processing.
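To make the division of labor among those three overrides concrete, here is a rough, language-neutral model in Python. It is not the WebRSize C# API: the class `RemoteBackedProvider` and its methods are hypothetical and only mirror the method names mentioned above, the calls are synchronous rather than async for brevity, and a plain dict stands in for the remote store.

```python
from typing import Dict, Optional


class RemoteBackedProvider:
    """Toy model: one local path prefix is served from a remote store."""

    def __init__(self, prefix: str, remote: Dict[str, bytes]) -> None:
        # Hypothetical captured prefix, normalized to end with "/".
        self.prefix = prefix.rstrip("/") + "/"
        # Stand-in for a blob container: maps blob name -> bytes.
        self.remote = remote

    def is_path_captured(self, virtual_path: str) -> bool:
        # Mirrors the role of IsPathCaptured: does this provider own the path?
        return virtual_path.startswith(self.prefix)

    def _key(self, virtual_path: str) -> str:
        # Strip the captured prefix to get the name used in the remote store.
        return virtual_path[len(self.prefix):]

    def file_exists(self, virtual_path: str) -> bool:
        # Mirrors the role of FileExistsAsyncInternal (synchronous here).
        return (self.is_path_captured(virtual_path)
                and self._key(virtual_path) in self.remote)

    def get_file(self, virtual_path: str) -> Optional[bytes]:
        # Mirrors the role of GetFileAsyncInternal: fetch the source image.
        if not self.file_exists(virtual_path):
            return None
        return self.remote[self._key(virtual_path)]
```

In a real provider, the existence check and the fetch would each be a network round trip to the storage service, which is where the latency concerns come from.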

There are a number of challenges with storing your images on remote storage, however. The main issue is speed (both latency and throughput). In order for WebRSize to know what size and format to serve and what processing is required, it needs to know some basic metadata from the header of the image. Checking for this metadata from a local file is very fast (<1 ms typically), whereas checking the metadata from a remote file will take hundreds to thousands of times as long. Similarly, when an image is needed for processing because the processed version is not already cached, that image must be downloaded from remote storage, which will be comparatively very slow. The download time will, in most cases, dwarf the processing time. For those reasons, it is preferable to cache files locally when they are sourced from remote storage so that subsequent processing requests for the same base image will be fast.
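The local-caching idea can be sketched as a read-through cache: the first request for a source image pays the remote download once, and later requests read the local copy. This is a minimal Python illustration under stated assumptions, not code from WebRSize: `LocalReadThroughCache` and the `fetch` callback are hypothetical names, and cache invalidation (for example when the remote image changes) is deliberately left out.

```python
import hashlib
from pathlib import Path
from typing import Callable


class LocalReadThroughCache:
    """Keeps local copies of remote source files so repeat reads stay on disk."""

    def __init__(self, cache_dir: Path, fetch: Callable[[str], bytes]) -> None:
        self.cache_dir = cache_dir
        self.fetch = fetch  # hypothetical callback that downloads from remote storage
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _local_path(self, key: str) -> Path:
        # Hash the key so any remote path maps to a safe local file name.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.cache_dir / digest

    def get(self, key: str) -> bytes:
        path = self._local_path(key)
        if path.exists():
            # Cache hit: a fast local read, no network round trip.
            return path.read_bytes()
        # Cache miss: pay the slow remote download once.
        data = self.fetch(key)
        # Write to a temp file and rename it, so readers never see a partial file.
        tmp = path.with_suffix(".tmp")
        tmp.write_bytes(data)
        tmp.replace(path)
        return data
```

With this shape, both the header metadata check and the full read for processing hit the local disk after the first request for a given source image.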

Storing the cached images on remote storage only exacerbates the problem, because checking to see if a cached image exists may take longer than processing a new one. The simple solution there is to use an edge-caching CDN, so it can serve the requests directly and only forward a request to your server when a new variant is needed.