isaacs / node-lru-cache

A fast cache that automatically deletes the least recently used items
http://isaacs.github.io/node-lru-cache/
ISC License

Add method #259

Closed · falsandtru closed this 1 year ago

falsandtru commented 1 year ago

If get returned no value and you then set a value for the same key, the set method doesn't have to check the key registration again, because get already did. Such a set method is effectively an add method. So the following logic becomes possible:

cache.get(k) ?? cache.add(k, v);
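For context, the proposed semantics can be sketched roughly as follows, assuming a hypothetical `add` that inserts unconditionally and returns the stored value. None of this is lru-cache's actual API; `SketchCache` and its `add` are illustrations only.

```typescript
// Hypothetical sketch of the proposed semantics. `SketchCache` and its
// `add` method are illustrations, not part of lru-cache's API.
class SketchCache<K, V> {
  private map = new Map<K, V>()

  get (key: K): V | undefined {
    return this.map.get(key)
  }

  // Proposed: insert without checking whether the key is already present,
  // and return the stored value so it composes with `??`.
  add (key: K, value: V): V {
    this.map.set(key, value)
    return value
  }
}

const cache = new SketchCache<string, number>()
const first = cache.get('k') ?? cache.add('k', 1)  // miss: stores and yields 1
const second = cache.get('k') ?? cache.add('k', 2) // hit: add() never runs
console.log(first, second) // 1 1
```

Note that the `??` idiom only works when `undefined` is never stored as a value, a caveat that applies to all the variants discussed below.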
isaacs commented 1 year ago

So you're looking for a sort of "set if not already set" method?

falsandtru commented 1 year ago

"set if not already set"

No. I mean setting an entry without checking the key registration, because the preceding get already checked it.

isaacs commented 1 year ago

I'm not sure what you mean by "checking the key registration". That's just a single Map lookup, and it's there to get the internal index; there's no way that you "already checked it", because it's an internal implementation detail.

Can you please elaborate on what exactly you need, and what benefit it would provide?

falsandtru commented 1 year ago

The following slow lookup in c.get(k) ?? c.set(k, v) could be omitted by changing it to c.get(k) ?? c.add(k, v):

https://github.com/isaacs/node-lru-cache/blob/a63ce28eacff77dfd30f4f6d62adcd361d8228ab/index.js#L601

isaacs commented 1 year ago

So you're saving the cost of one Map.get(), and adding the hazard of potentially breaking the linked list by having an orphaned index that isn't referenced by any key? If you get that wrong, the data will be corrupted. That is a terrible idea.

falsandtru commented 1 year ago

This is definitely a reasonable tradeoff, but you are free to decide to keep a wasteful API.

isaacs commented 1 year ago

If you really want to unsafely avoid the check in set(), you can use the internal methods and properties that set and get use. They are unsupported and may change subtly and incompatibly between versions, so it would be best to pin the version and be very careful about upgrading, but they are exposed on the class.

isaacs commented 1 year ago

It would be possible though, to add a method that sets if unset, or returns the currently set value, with a single check and no exposed hazard. That was what I was saying in my first comment.

The logic would be something like:

getSet(key, valIfUnset) {
  if (this.has(key)) {
    return this.get(key)
  } else {
    this.set(key, valIfUnset)
    return valIfUnset
  }
}

But obviously without repeating the index lookup, so the methods would be unrolled.
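As a rough illustration of the contract, here is the sketch fleshed out against a minimal Map-backed stand-in. This is not lru-cache itself (no eviction, TTL, or size limits), and `getSet` was never added to the library, so the names here are assumptions.

```typescript
// Minimal Map-backed stand-in, just enough surface to exercise the
// getSet sketch above. Not lru-cache: no eviction, TTL, or size limits.
class MiniCache<K, V> {
  private map = new Map<K, V>()

  has (key: K): boolean { return this.map.has(key) }
  get (key: K): V | undefined { return this.map.get(key) }
  set (key: K, value: V): this { this.map.set(key, value); return this }

  // The method under discussion: return the cached value if present,
  // otherwise store valIfUnset and return it.
  getSet (key: K, valIfUnset: V): V {
    if (this.has(key)) {
      return this.get(key) as V
    }
    this.set(key, valIfUnset)
    return valIfUnset
  }
}

const mini = new MiniCache<string, string>()
console.log(mini.getSet('greeting', 'hello')) // miss: stores and returns 'hello'
console.log(mini.getSet('greeting', 'howdy')) // hit: still 'hello'
```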

falsandtru commented 1 year ago

If you really want to unsafely avoid the check in set(), you can use the internal methods and properties that set and get are using. They are unsupported, and may change subtly and incompatible between versions, so it would be best to pin the version and be very careful about upgrading, but they are exposed on the class.

The proposed add method is the stable version of that.

It would be possible though, to add a method that sets if unset, or returns the currently set value, with a single check and no exposed hazard. That was what I was saying in my first comment.

getSet(key, valIfUnset) {

It is strange that you already have the value you want to get.

isaacs commented 1 year ago

cache.get(k) ?? cache.add(k, v);

It is strange that you already have the value you want to get.

Yes, I agree. But isn't that implicit in the code you shared? Ie, instead of cache.get(k) ?? cache.add(k, v), you'd do cache.getSet(key, v), and both would resolve to either the value already in cache, or the new value v that you add if it's missing.

Unless perhaps I'm misunderstanding the logic you intend by cache.get(k) ?? cache.add(k, v)?

Add method is its stable version.

A method that sets without ensuring that the key is not already present is completely unsafe, though. I would regret adding and having to support such a hazardous method. So no, that's not happening, sorry.

But a getSet(k, v) method would be possible, and even a pretty naive implementation does speed things up a hair from the c.get(k) ?? c.set(k, v) or c.has(k) ? c.get(k) : (c.set(k, v), v) approaches, so that's maybe useful? I just don't know how much value there is in speeding up an operation by a mere 13% when it's an operation that's already on the order of 100ns. Probably not worth the effort, tbh. If that's the slow bit in your program, you'd be better off rewriting in Rust or C++.

https://gist.github.com/isaacs/4d700a8b80a3f898bc25c1f34aa84dda
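The gist's numbers aren't reproduced here, but the two caller-side idioms being compared can be sketched against a plain Map, which shares the has/get/set shape needed for this purpose. This is illustrative only; timings against the real lru-cache will differ.

```typescript
// The two caller-side idioms from the benchmark, sketched against a plain
// Map (which shares the has/get/set shape needed here). Illustrative only.
const m = new Map<string, number>()

// Idiom 1: get, then set on a miss (only safe when undefined is never a value).
const viaNullish = (k: string, v: number): number =>
  m.get(k) ?? (m.set(k, v), v)

// Idiom 2: has, then get or set; the shape a getSet method would unroll
// internally to avoid the repeated index lookup.
const viaHas = (k: string, v: number): number =>
  m.has(k) ? (m.get(k) as number) : (m.set(k, v), v)

console.log(viaNullish('a', 1)) // miss: stores and returns 1
console.log(viaHas('a', 2))     // hit: existing 1 wins
```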

falsandtru commented 1 year ago

Unless perhaps I'm misunderstanding the logic you intend by cache.get(k) ?? cache.add(k, v)?

Of course you are misunderstanding. The difference is that cache.get(k) ?? cache.add(k, v) is splittable but cache.getSet(key, v) is not. Therefore cache.getSet(key, v) cannot implement the following optimization. You are not thoughtful.

function memoizeDict<as extends [unknown, ...unknown[]], z, b = as[0]>(f: (...as: as) => z, identify: (...as: as) => b, memory: Dict<b, z>): typeof f {
  let nullable = false;
  return (...as) => {
    const b = identify(...as);
    let z = memory.get(b);
    if (z !== undefined || nullable && memory.has(b)) return z!;
    z = f(...as);
    nullable ||= z === undefined;
    memory.add?.(b, z) ?? memory.set(b, z);
    return z;
  };
}
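The `Dict` type above isn't defined in the comment (it appears to come from the commenter's own library). The following is a simplified, single-argument sketch with a stand-in interface; the interface shape is an assumption for illustration, not that library's actual API.

```typescript
// Minimal stand-in for the Dict interface used above: a Map-like store
// plus an optional `add` that inserts without checking for an existing
// key. This shape is an assumption for illustration.
interface Dict<K, V> {
  has (key: K): boolean
  get (key: K): V | undefined
  set (key: K, value: V): unknown
  add?(key: K, value: V): V
}

// Single-argument simplification of the memoizeDict pattern above
// (ignores the `nullable` handling for undefined results).
function memoize<A, R> (f: (a: A) => R, memory: Dict<A, R>): (a: A) => R {
  return a => {
    const hit = memory.get(a)
    if (hit !== undefined) return hit
    const r = f(a)
    // The crux: prefer the unchecked add when the backing dict provides
    // one; fall back to a plain checked set otherwise.
    memory.add?.(a, r) ?? memory.set(a, r)
    return r
  }
}

let calls = 0
const square = memoize((n: number) => (++calls, n * n), new Map<number, number>())
console.log(square(4), square(4), calls) // 16 16 1
```

A plain Map works as the `memory` because it structurally satisfies the stand-in interface without `add`; a cache exposing the unchecked `add` would take the faster branch on the same line.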

A method that sets without ensuring that the key is not already present is completely unsafe, though. I would regret adding and having to support such a hazardous method. So no, that's not happening, sorry.

Unsafe high-performance APIs that omit validation are common in low-level libraries, sorry.

speeding up an operation by a mere 13%

In general that is a very good improvement, but you seem to think differently.

I'm glad to see that you made the decision to provide only a slow API. Well, even if you added the add method, my implementation is already 2x faster than your lru-cache, sorry.

isaacs commented 1 year ago

Well, sure, a function that just memoizes to a Map is of course going to be way faster than an LRU cache because it's not doing LRU tracking, cache eviction, or staleness checking.

An interface for memoizable functions, sort of a "synchronous fetch()", would perhaps be useful. But it's still, of course, going to be slower than just caching everything and not evicting based on use recency.

You are not thoughtful.

You are being rude and this conversation is over.