phuslu / lru

High performance LRU cache
MIT License
182 stars 6 forks source link

cxlrubytes - fast zero alloc lru for []bytes only. i'll be honored to have your feedback on this lru i created for your comparison, remarks and comments #16

Open kolinfluence opened 2 months ago

kolinfluence commented 2 months ago

i'll be honored to have your feedback on this lru i created for your comparison, remarks and comments. hope to have your comments / input https://github.com/cloudxaas/gocache/tree/main/lru/bytes

you can try on the hit / misses etc which i have no idea how you would do the benchmark comparison. thx in advance.

i created it coz most lru need to predefined a capacity but i would prefer to be limited by memory size set.

i've been following your lru for some time and your analysis is the most comprehensive.

phuslu commented 2 months ago

yes, I also think about it for a long time, I called it "BytesCache" before, please allow me share some thoughts:

  1. where key&value is []byte, It's no doubt that the size should be memory usage rather than items count
  2. the indexed map type, map[string]int or map[hash]int? map[string]int pros: faster due to cpu-cache friendly cons: cost more memory map[hash]int pros: minimalized memory usage and gc overhead, that currently phuslu/lru using. cons: a bit slow than map[string]int, because it need fetch and compare key in another place. So which should be chosen? I'd like say it depends you :) But after swiss table is merged to go, see https://github.com/golang/go/issues/54766#issuecomment-1964278472 I think we should definably use map[string]int always. because it is obviously fast than rhh hashtable in large scale.
  3. potential minor improvements, such as BCE with unsafe(like phuslu/lru), struct alignment, etc.
kolinfluence commented 2 months ago

@phuslu hope to hv your contribution to the repo too. in fact hope u to have you as one of the maintainers.

phuslu commented 2 months ago

I still have interests the roadmap of https://github.com/phuslu/lru/issues/15#issuecomment-2044060169 although the last step of it (make it shareable) is too hard to complete, I still have motivation to try step 1&2 in my spare time.

Probably the cxlrubytes is simpler and faster choice in your case. But the rest parts(a mmaped and minimalized memory&gc bytes cache) of golang caches is still need someone(me) to make it done.

I regard it as a technical challenge on a larger scale, worth work it out but maybe no one will ever use it. 😄

kolinfluence commented 2 months ago

@phuslu i will use it. i found out my lru has caveats. it is slower if there are many puts and deletes. which will create alloc. not sure how to resolve. maybe you have better ideas.

step 3 is difficult to find an elegant solution. if you can help solve the alloc first. step 3... well, yeah. but i will definitely use it if you can get it done

phuslu commented 2 months ago

which will create alloc.

maybe there're is a implicit copy action in map[string]int -- just a guess.

kolinfluence commented 2 months ago

@phuslu i thought there was issue, it was with my test code. the code is fine. still need some test. just got it working a few hours ago.

it's usable and 100% zero alloc.

p.s. : my other test code was the one generating the alloc.

kolinfluence commented 2 months ago

@phuslu i updated batch eviction to speed it up else 1 by 1 too slow.

phuslu commented 2 months ago

Congrats, I guess you currently fingure out that a most suitable solution for your case.

BTW: considering https://github.com/cockroachdb/swiss impressive throughputs in large scale map, I think/guess the max performance solution of "lru as map" is forking the cockroachdb/swiss then add "prev/next" field to its entry to make it behavior like lru also. This is similar with what is elastic/go-freelru done before.

In particular, swiss no longer requires the bucket size to grow by a power of 2, which will avoid the “overcommit” issue of freelru.

phuslu commented 2 months ago

The overhead of generic function calls(seems that it cannot be inlined) prevents deeper performance optimization, which is why I plan to make one "BytesCache" or "MmapCache".

see https://github.com/golang/go/issues/54766#issuecomment-1969995760 and https://www.reddit.com/r/golang/comments/ts9neg/generics_can_make_your_go_code_slower/

kolinfluence commented 2 months ago

@phuslu thx for sharing insights, will wait for ur mmap version with eager anticipation. i've created what i need to solve what i need now though. hope of course mmapcache can be memory optimized as current high performance lru sort of waste lots of mem. somehow i'm confident you'll get it done.

kolinfluence commented 2 months ago

@phuslu updated and fixed cxlrubytes on some performance issues etc, it's now a bit faster.

  1. swiss map could be interesting. but i dont think i will add it in for quite some time because i've tried and hit a road bump
  2. do see if u can use anything from my code for the mmap cache u have in mind.
phuslu commented 2 months ago
  1. I have similar feelings about swiss, but my plan is to first remove its generics and turn it into a map[uint32]uint32 instance(I think this is relatively easier) then integrate to phuslu/lru. Although the performance on the CPU cache is degraded, it at least improves the performance of phuslu/lru on large numbers and retain the lowest memory usage.
  2. thanks, I will give a try soon.
kolinfluence commented 2 months ago

@phuslu i've tried map[uint32]uint32 here, it's faster but it's limited to 4 bil keys, there can be some use case etc. but for my use case, i prefer the map[string]int as it's more "secure".

anyway, for your reference: https://github.com/cloudxaas/gocache/tree/main/lrux/bytes

i'm looking forward to the mmap version actually for now, but that's beyond me coz i cant figure out a reasonably fast global lock for it.