Various points of feedback

francislavoie commented 1 year ago

So I was just doing some thinking about how the storage is designed here, and I have a few points of feedback.

I've been playing with RedisInsight lately and it has an option for rendering a value as JSON if the value is valid JSON. With the default value_prefix currently of caddy-storage-redis, that breaks that functionality because it just straight up adds text to the front of the JSON, making it invalid unless stripped.

What's the reasoning behind value_prefix? Why would the value need a prefix? Scanning is done by key, so that shouldn't be relevant, I would think.

Unfortunately, simply changing this default would be a breaking change of course because then everyone's data wouldn't be read correctly anymore. But the lib could be changed to continue stripping the prefix if it exists, but start storing values without a prefix.
Currently, SCAN is used to find all keys by pattern. That's quite efficient compared to something like KEYS, but the problem arises when the same Redis DB is used for a lot of other data. That means scanning the entire key-set becomes much more expensive. For example, my DB currently has around 4 million keys in it (99.5% of it is not Caddy TLS data). Scanning the entire thing can take up to a few seconds.

I realize that choosing a different DB number would be a solution for this. But migrating from one DB to another is not viable right now because we don't have any export/import tooling yet to do this by hand.

One thing that would improve throughput is increasing the SCAN count from 100 to something like 1000 or 10000 which could crunch through that many much faster.

Even better though would be to keep a SET, HSET or ZSET of all the keys managed by this plugin, so then you can do an SSCAN, HSCAN or ZSCAN to only look at the relevant entries and avoid looking at any data that isn't managed by the plugin via SCAN.

Using a SET would just be a listing of all the keys managed by the lib, to be scanned with SSCAN instead. Simple solution to faster scanning.

Using a ZSET would be the same idea as a SET, but it would also mean that the score could be used to store the modification time of the key. That could be useful for purging data that's too old for example with ZRANGEBYSCORE or something like that. It's useful for implementing a custom expiry mechanism. Not sure if that's actually beneficial right now though.

Using an HSET would mean that the actual values could be stored in the hash set directly instead of in individual keys. HSCAN would still allow finding keys easily by pattern. This would probably be a decent simplification of the storage as well, because the key prefix would only apply to a single key instead of every key used by the lib, and then the keys inside the HSET would not need to be prefixed.

Migrating to any of these storage patterns would require a bit of code to migrate it on upgrade of course. This could be pretty simple; at startup, check if the chosen set type's key exists (which is a O(1) lookup). If it doesn't, do a SCAN to find all the keys, move them into the set (and if HSET is chosen, delete all the existing storage keys).

gamalan commented 1 year ago

@francislavoie Thanks for the feedback. I know there are some bad design, because at first I just do it because I need some storage which available for my team to use.

francislavoie commented 1 year ago

Oh no worries at all.

I'm bringing this up because I've been more heavily working with Redis with an app of mine and I decided to review how this plugin does this with that fresh perspective.

I'm thinking of making the changes I suggest in a PR if you're open to it, wanted to see if you have any opinions on those points before I get started on it.

gamalan commented 1 year ago

Cool, I am open to any suggestions that could help anyone using this plugin.

On Wed, Jan 18, 2023, 7:59 AM Francis Lavoie @.***> wrote:

Oh no worries at all.

I'm bringing this up because I've been more heavily working with Redis with an app of mine and I decided to review how this plugin does this with that fresh perspective.

I'm thinking of making the changes I suggest in a PR if you're open to it, wanted to see if you have any opinions on those points before I get started on it.

— Reply to this email directly, view it on GitHub https://github.com/gamalan/caddy-tlsredis/issues/42#issuecomment-1386308246, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAIYFE7ZCM4F2VDQ43MX6YTWS46AXANCNFSM6AAAAAAT3PR35Y . You are receiving this because you commented.Message ID: @.***>

gamalan / caddy-tlsredis

Various points of feedback #42