whitfin / cachex

A powerful caching library for Elixir with support for transactions, fallbacks and expirations
https://hexdocs.pm/cachex/
MIT License
1.54k stars 97 forks source link

qlc error - max limit cache eviction fails for no cwd write permissions #329

Closed aerosol closed 3 weeks ago

aerosol commented 6 months ago

qlc uses the filesystem to store cursor information.

The result of qlc calls is discarded by Cachex, therefore it's possible to observe the following error logged - in case the current working directory is not writeable upon cursor/batch erase:

{:error, :qlc, {:file_error, ~c"appname-01_1_2222737.1", :eacces}}

Resulting with a silent failure of cache eviction.

I think the preferable approach would be to allow overriding qlc's {tmpdir, TempDirectory} configuration as per https://www.erlang.org/doc/man/qlc#common-options.

whitfin commented 6 months ago

@aerosol sure thing, want to PR it? Happy to include and get a patch out.

I do want to ask though, because I don't think I fully understand how it was silently failing. The assumption is that this is the block failing: https://github.com/whitfin/cachex/blob/main/lib/cachex/policy/lrw.ex#L141-L145

This is then simply passed through to QLC again: https://github.com/whitfin/cachex/blob/main/lib/cachex/policy/lrw.ex#L181

So is QLC accepting an error as an input and not logging anything? Or is the comprehension masking the error here? Just want to clarify how it's possible this is failing silently, because at a glance it looks reasonable?

Edit: note to self and future readers that this affects the cache max limit eviction, not general eviction.

aerosol commented 6 months ago

want to PR it?

Possibly, given enough capacity, will keep you posted.

So is QLC accepting an error as an input and not logging anything? Or is the comprehension masking the error here? Just want to clarify how it's possible this is failing silently, because at a glance it looks reasonable?

Ah yes, apologies for imprecise wording - it's not completely silent, in the sense a a one-off mismatch error happens in Cachex.Policy.LRW.erase_batch/3. Not much else besides that - the cache keeps happily growing over the limit.

image

Edit: note to self and future readers that this affects the cache max limit eviction, not general eviction.

Thanks, I'll update the ticket title.

whitfin commented 6 months ago

Awesome, thank you for the clarification - no worries if you're too busy to PR it, I can probably carve out some time tonight for this.

I've been trying to get a reproduction case but it's a little awkward, I guess I have to pre-package an app with Cachex to test it. I haven't yet been able to make the tmpdir option do anything (or at least visibly do anything).

Edit: As a quick patch it can be changed to just defer to System.tmp_dir!/0 and assume that something is writable (the worst case here is just the same as now, except with a better error). If we really need to make it configurable, that can be visited in future - I'd rather get a small patch out with a better default, and then add configuration later as needed.