medikoo / memoizee

Complete memoize/cache solution for JavaScript
ISC License
1.73k stars 61 forks source link

How to deal with functions taking dict args? #120

Open dheera opened 3 years ago

dheera commented 3 years ago

Functions that take a dictionary of named arguments is extremely common in JS, for example the entire AWS API does it:

s3.listObjects({Bucket: "some_bucket", Prefix: "some_prefix"})

Memoizing these functions using non-primitive mode results in effectively this comparison after an intended cache hit:

{Bucket: "some_bucket", Prefix: "some_prefix"} === {Bucket: "some_bucket", Prefix: "some_prefix"}

which even though the arguments are exactly the same, returns false because "===" does NOT do deep comparisons on dictionaries and only compares the pointers to the dictionaries, which are different here.

Using primitive mode doesn't work either, since it results in effectively this comparison (based on primitive.js lines 8-18):

Array.prototype.join.call([{Bucket: "some_bucket", Prefix: "some_prefix"}], "|") === Array.prototype.join.call([{Bucket: "some_bucket", Prefix: "some_prefix"}], "|")

which evaluates to

"[object Object]" === "[object Object]"

which although returns true, it is not what we want, because it will evaluate any two dictionaries as equal even if their contents are different.

What I really want is a deep compare, and I think that can be effectively accomplished for serializable arguments by using JSON.stringify instead of Array.prototype.join, which will serialize the dictionaries to strings instead of converting them to the literal string "[object Object]".

Let me know if you agree and I'll be happy to create a pull request, or let me know if you already have plans to resolve this use case another way.

dankolesnikov commented 10 months ago

@dheera I've faced a similar problem today when solving a data inconsistency bug, what I understand is that this library does not honor dict args however I was able to fix it by providing { length: false } argument like this: export const getData = memoizee(getDataRaw, { length: false });, This actually caused a pretty big problem for us because we were getting incorrect data.

@medikoo can you please shine some light here? Is this package actively maintained? If I open a PR to fix this would you able to review/approve and rerelease?

medikoo commented 10 months ago

@dheera @dankolesnikov I'm not sure how I missed this issue (btw, yes this package is maintained). Sorry for responding so late

Yes, currently there's no dedicated option for that, but it can be achieved relatively simply. See: https://github.com/medikoo/memoizee/blob/master/README.md#memoizing-on-dictionary-object-hash-arguments

dankolesnikov commented 10 months ago

thank you so much for a quick response @medikoo ! 1) I am curious why not support this functionality out of the box? is there a design intention around that? 2) Could you please explain why does this solution make things work? memoizee(getDataRaw, { length: false }); The function signature is of getRawData:

const getDataRaw = (effectiveDate: Date, filter: Record<string, any> | Knex.QueryCallback = {}) => {...}

Additionally, @dheera @medikoo I've realized that the cause for my bug stemmed from the fact that filter argument had a default value rather than being an object/dict, if I remove that everything works.

@medikoo The problem I face is that I am dealing with auto-generated code and I don't have good control of the arguments (and their ordering) of the functions I pass into memoizee, hence the solution you linked might not work well normalizer: function(args) { // args is arguments object as accessible in memoized function return JSON.stringify(deepSortedEntries(args[0])); }, here args[0] is problematic for me as I don't know the position of the arguments that's dict. Refactoring the entire code base to remove the default value is also not realistic but could you please explain why { length: false } makes everything work? That would be the easiest solution for me but it's critical I understand the implications of this parameter. Thank you again!

medikoo commented 10 months ago

I am curious why not support this functionality out of the box? is there a design intention around that?

I think it might be nice to support it out of the box, but when we speak of well designed generic solution, it may appear it's not trival to implement, e.g. we may want to go beyond JSON, to e.g. support well types as BigInt, etc. or recognise properly non JSON objects in collection (while after JSON stringification they're just cleared).

You've already noticed that plain simple solution as proposed in docs, is too optimistic for your case

So main problems I see is:

  1. It's time taking to make it right (have a well thought general approach).
  2. Current codebase is pretty old, and I'd prefer to refactor it into v1, before adding new functionalities (see: https://github.com/medikoo/memoizee/issues/73)

Could you please explain why does this solution make things work?

Sorry, I don't understand this question

here args[0] is problematic for me as I don't know the position of the arguments that's dict.

In that case I believe you need to normalize against whole args and not just args[0]

dankolesnikov commented 10 months ago

@medikoo Thank you so much for quick replies! I've turned export const getData = memoizee(getDataRaw); into export const getData = memoizee(getDataRaw, { length: false }); and it fixed all my problems with dict argument not working correctly but I don't fully understand why, documentation is not very clear. Can you please shine some light on what does { length: false } do under the hood?

medikoo commented 10 months ago

I've turned export const getData = memoizee(getDataRaw); into export const getData = memoizee(getDataRaw, { length: false }); and it fixed all my problems with dict argument

I see. length: false, changes the approach on what arguments are taken into account when memoizing.

If function signature is (effectiveDate, filter: {}, then memoization is evaluated against only first argument (as function length is 1), with false, memoizer always takes all passed arguments and evaluates on that. It's documented here: https://github.com/medikoo/memoizee#arguments-length