rtk-query normalization and manual insert into cache

sebastian-dor commented 5 months ago

API has an endpoint that provides a list of items that we need. E.g. we have a map application that shows a lot of structures for a specific scenario id. We then want to use deletion, updating, selecting the items by their respective ID. The way RTK Query builds the cache (key) is by the function name in combination with the (scenario) ID. If one of the items would change, it would invalidate the whole list.

possible solutions:

A normalized cache, but https://redux-toolkit.js.org/rtk-query/usage/cache-behavior#no-normalized-or-de-duplicated-cache
onQueryStarted and using dispatch(api.util.upsertQueryData(...), but dispatch has quite the overhead and using it for more than a handful items slows down UI considerably
using the concept of a top-level-reducer that wraps the api reducer

the makeshift solution for 3.

Creating a so-called top-level reducer in the root reducer that wraps the actual call to the list endpoint and creates fake actions to insert the list items one by one into the cache.

export const rootReducer = {
[...]
  // [structuredataApi.reducerPath]: structuredataApi.reducer,
  [structuredataApi.reducerPath]: structuretlr,
 [...]
};

function structuretlr(state: any, action: any) {
  let newState: any = undefined;
  let intermediateState = structuredataApi.reducer(state, action);
  switch (action.type) {
    //query is done
    case 'structuredataApi/executeQuery/fulfilled': {
      //list endpoint (get all structures by scenario)
      if ('getStructuredataByScenarioId' == action.meta.arg.endpointName) {
        //const startTime = performance.now();

        let previousState: any = undefined;
        action.payload.forEach((element: { id: any }) => {
          const entryToAdd = [element];
          const idToAdd = element.id;
          //this is necessary as otherwise the fulfilled action won't insert
          let fakePendingAction = {
            type: 'structuredataApi/executeQuery/pending',
            payload: undefined,
            meta: {
              ...action.meta,
              requestStatus: 'pending',
              //not sure if this is necessary
              requestId: action.meta.requestId + '_1',
              arg: {
                ...action.meta.arg,
                requestStatus: 'pending',
                endpointName: `getStructuredataById`,
                queryCacheKey: `getStructuredataById(${idToAdd})`,
                originalArgs: idToAdd
              }
            }
          };
          //verbosity for debugging
          let stateToSet = previousState ?? intermediateState;
          previousState = structuredataApi.reducer(stateToSet, fakePendingAction);

          let fakeFullfilledAction = {
            ...action,
            payload: entryToAdd,
            meta: {
              ...action.meta,
              requestId: action.meta.requestId + '_1',
              // requestId: action.meta.requestId,
              arg: {
                ...action.meta.arg,
                endpointName: `getStructuredataById`,
                queryCacheKey: `getStructuredataById(${idToAdd})`,
                originalArgs: idToAdd
              }
            }
          };
          previousState = structuredataApi.reducer(previousState, fakeFullfilledAction);
        });
        //const endTime = performance.now();
        //const duration = endTime - startTime;
        //console.log('inserting ' + action.payload.length + ' took ' + duration + ' ms');
        newState = previousState;
      }
      break;
    }
  }
  if (newState === undefined) {
    newState = intermediateState;
  }
  return newState;
}

small benchmarks to show the difference between dispatching and insertion via root-reducer:

dispatching 381 took 1130.2000000001863 ms dispatching 991 took 6271.9000000003725 ms (note that dispatching here crashes the browser anyway)

inserting 381 took 247.5 ms inserting 991 took 1593.7000000001863 ms

Doing inserts like this for more than 1000 items is not viable that way as it significantly affects user experience. So for now I guess we'll chunk the requests to 1000 at a time. I'm still in the experimentation phase with the solution above, but I do have some ideas to improve the solution here but for now it should work.

Big thanks to @markerikson for help and support!

phryneas commented 5 months ago

As for this point:

onQueryStarted and using dispatch(api.util.upsertQueryData(...), but dispatch has quite the overhead and using it for more than a handful items slows down UI considerably

Did you enable the autoBatchEnhancer?

markerikson commented 5 months ago

fwiw I wouldn't expect that to help all that much, due to the number of distinct separate calls to Immer. (RTK 2.0 and Immer 10 might help some here, though.)

phryneas commented 5 months ago

Good point.

Tbh., your hack here looks interesting, but in the end, it's a hack.

At this point, you might be better off to use RTKQ for the "request" part, and maybe have your own normalized state that listens to RTKQ actions.

RTKQ will help in a lot of situations, but it isn't always a silver bullet if you really need normalized data.

markerikson commented 5 months ago

I actually came up with this hack over in Reactiflux :) and I did say it was a hack at the time, but it's also valid.

The right answer is that we ought to add a multi-item form of update/insertQueryData and do all this in one shot somehow.

sebastian-dor commented 5 months ago

Did you enable the autoBatchEnhancer?

I did indeed not (yet). I'll try and see if and how much it helps. I'm using Redux Toolkit Version 2.0.1. As for

Tbh., your hack here looks interesting, but in the end, it's a hack. RTKQ will help in a lot of situations, but it isn't always a silver bullet if you really need normalized data.

I'm well aware but since we're already using Redux within the application for menu/selection states and so on I'd like to stick with one state management tool. We already thought about implementing our own solution but since we're under quite some pressure time-wise we're looking to make amends with the tools we have available. And since having a normalized cache is not happening fast I'll just duplicate the cache entries and try to keep them in sync as good as possible.

The right answer is that we ought to add a multi-item form of update/insertQueryData and do all this in one shot somehow.

Indeed and since inserting the whole list is already very fast I guess there should be a way to do this.

One question that came up with my colleagues is if it would be possible to defer the cache insertion to a background task or a web worker, so it does not interfere with the UI thread?

markerikson commented 5 months ago

@sebastian-dor :

if it would be possible to defer the cache insertion to a background task or a web worker, so it does not interfere with the UI thread?

No, afraid not. The store exists on the main thread, so all updates have to be done there.

I would love to do some poking into this and see if I can figure out some way to speed it up, but I'm not sure how soon I'll have time to look at it.

phryneas commented 5 months ago

We could change from queryResultPatched to queryResultsPatched and allow an array of patches, which would make use cases like this easier by dispatching a single action.

It is an internal action, so we don't guarantee any api stability on it, we're not nailed to the current implementation.

markerikson commented 5 months ago

@phryneas yeah, that's exactly what I have in mind :)

The other issue here is that this particular example requires separate pending/fulfilled actions for each item. So, we're ending up with 2 * items calls to the API reducer. This is all within the root reducer, so there's still only one subscriber notification, but that does end up as 2N calls to Immer, so it's having to do all of its drafting and finalization work. I suspect that if it were all one single set of updates inside a single produce() call, there would be less overhead.

Sorta related to all this, the current upsertQueryData-type calls are full thunks and only handle one item at a time, and also not well suited for batch updates either.

All stuff I'd like to improve over the course of this year!

StealthySturgeon commented 5 months ago

The right answer is that we ought to add a multi-item form of update/insertQueryData and do all this in one shot somehow.

@markerikson is there an issue I can track for this action? It is something we are looking to adopt at work as soon as it becomes available. We currently upsert items just in time, but it would be nice to integrate that logic into our RTK Query api without dispatching a large number of actions.

markerikson commented 5 months ago

@StealthySturgeon I'm not immediately sure if we have a specific issue for this already. Per https://github.com/reduxjs/redux-toolkit/discussions/4107 , our next general step for RTKQ is to triage all the open issues and figure out what things people have asked for so we can start prioritizing the work. Another user has already done the work to fill out a spreadsheet listing all the issues and what each issue is asking for, but we haven't had time to go through it yet.

So, this issue will do for now, and if we end up shuffling issues I'll say something here :)

markerikson commented 5 months ago

Tagging here for cross-referencing:

someone has an external normalization lib they've tied into RTKQ, per:

4073

I have not had time to look into it yet, but will (eventually) when we dig into RTKQ stuff

markerikson commented 3 days ago

Tagging with the word normalization to aid in future searches by me :)

reduxjs / redux-toolkit