jkomoros / card-web

The web app behind thecompendium.cards
Apache License 2.0
53 stars 8 forks source link

Missing-concept filters are extremely slow #580

Open jkomoros opened 2 years ago

jkomoros commented 2 years ago

Noticed while investigating #571

Looks like some layer of caching logic went away at some point and it's recomputing something again and again

jkomoros commented 2 years ago

There are a lot of recalcs of collections for the reference blocks even as of 4eb27c93bacde54825db82805171c02fd70f309e while doing a bisect... but those don't appear to affect performance that much?

jkomoros commented 2 years ago

The problem only appears to show up after you've edited a card in that filter to not include it any more?

jkomoros commented 2 years ago

suggestedConceptReferencesForCard must be missing cache hits in that case?

jkomoros commented 2 years ago

What's happening is missing-concept is a really expensive filter; it has to regenerate a bunch of upstream stuff the first time it's hit. And then whenever a single card changes even a little bit, then the entire workflow is thrown out and regenerated.

This is why switching to the concept card tab is expensive in the first place, and likely why editing /adding a concept card is excruciatingly slow (missing-concept shows up right here on concept cards and everything is slow for ~12 seconds while every single card is processed to look for missing concepts on them. This is definitely related to #571

A few ideas:

1) Make it so getConceptCardsFromCards has an extra memoization layer that, if the input has changed but the top-level result is deep-equal to the previous result, return the previous result so you get object equality for everything downstream.

2) Something about createObjectSelector for selectConceptCards or something

jkomoros commented 2 years ago

We'll need to also do similar memoization trick on conceptsFromConceptCards, because for #571 when a new concept card is added, or edited in any way the concepts will be the same (in most cases) but the actual concept cards results will be different.

jkomoros commented 2 years ago

04e00ce made a huge difference, but the representative test case of "load up a concept card, then click its "missing cards" collection, then edit a card in that collection and observe it's very slow" is still quite slow. Is there another layer of caching that's missing?

And also even if you make a trivial change to a concept card it still will kick off an expensive pipeline

jkomoros commented 2 years ago

suggestedConceptReferencesForCard takes a whole conceptCards, but really it just needs that for a concept -> card ID. conceptCards will change whenever any concept card changes even a teeny bit (very relevant for #571) but the actual strings -> conceptCardID will change very rarely. A function that takes in just the string to conceptCardID map would be useful for #571.

jkomoros commented 2 years ago

Test, how often is suggestedConceptReferencesForCard having a memoized cache miss?

jkomoros commented 2 years ago

Shouldn't normalizeNgramMap be a WeakMap?

jkomoros commented 2 years ago

Another reason f2d1ba0d3b6c72f8cd9e6d08de946e1ed96d7f30 led to a speedup, I think, is that selectEditingCardSuggestedConceptReferences used selectConceptCards as an input, which returned a different conceptCard map than conceptCardsFromCards, so the memoization was thrashing (between the suggested concepts for the editing card, and the one for all of the misisng-concepts) and not effective