facebook / react

The library for web and native user interfaces.
https://react.dev
MIT License
229.85k stars 47.06k forks source link

[Compiler Bug]: Values used as indexes are not memoized #29172

Open guillaumebrunerie opened 6 months ago

guillaumebrunerie commented 6 months ago

What kind of issue is this?

Link to repro

https://playground.react.dev/#N4Igzg9grgTgxgUxALhAMygOzgFwJYSYAEAwhALYAOhCmOAFMEQG4CGANlAkQL4CURYAB1iROITA4ieTABMEADyIBeIosq0weZggBiWXAUz02nBHxFEi7BFOg5KUKatNcwAbRnyFAXQDclkQA9EH2js7ScooBojC2sMQAPLLaAHzAYU48iUEpzKkxPDEgPEA

Repro steps

In the code below, the call to expensiveFunction(value) does not appear to be memoized (it is preserved as is at the top level in the JS output).

This seems to be due to it being used as an index in let output = values[index], because when setting output to be index itself instead, it gets memoized properly.

Code:

function Component({ value }) {
  const index = expensiveFunction(value)
  let output = values[index];
  //output = index;
  return <div>{output}</div>;
};

Edit: Additionally, if index was manually memoized with useMemo, the compiler will remove the manual memoization, resulting in potentially much slower code than the original code.

How often does this bug happen?

Every time

What version of React are you using?

19

josephsavona commented 6 months ago

Thanks for posting. What’s happening here is that the compiler infers that expensiveFunction() returns a primitive value, which can be cheaply compared for changes. The compiler tries hard to only memorize what is strictly necessary to avoid cascading updates (parent re-renders, child re-renders, etc) and avoid memoization overhead in other cases.

In your case, is the function actually expensive enough to be a problem, or were you just surprised that the call didn’t get memoized?

guillaumebrunerie commented 6 months ago

I see, interesting.

In your case, is the function actually expensive enough to be a problem, or were you just surprised that the call didn’t get memoized?

I'm mostly surprised that it didn't get memoized. In the part of my code where I found this behavior, it isn't actually expensive to compute.

But I still find this behavior quite surprising. Just because a function returns a primitive value doesn't mean it is cheap to run, for instance we could have something like:

const expensiveFunction = (value) => {
  return someLargeArray.findIndex(v => JSON.stringify(v) === JSON.stringify(value));
};

which feels pretty reasonable to write but should definitely be memoized. But maybe it happens rarely in practice.

What I find the most surprising is that an explicit useMemo gets removed. I thought the compiler is designed to try to preserve existing memoization, or did I misunderstand something?

josephsavona commented 6 months ago

The compiler preserves existing memoization except where we can prove that the value being memoized is a primitive.

kaaboaye commented 6 months ago

I'm coming from https://github.com/facebook/react/issues/29580

Whether an object is cheap to compare or not is not a sufficient for deciding to skip memoization.

One of those numbers can be obtained via some very computationally expensive algorithm. If an app using correctly using useMemo to deduplicate those computations moves to React Compiler it may experience significant slowdown.

It is even worse for string template literals because if my understanding is correct each time we call foo() at least a couple allocations will be performed.

const foo() => `/foo/${1}`;
  1. Allocate '/foo/' string
  2. Call String(1) which allocates '1' string.
  3. Concatenate '/foo/' and '1' which is a third allocation.

So in my understanding of how string templates work they will always be more expensive to compute then dereference from cache them and check whether the reference has changed.

In conclusion. Unless you are absolutely sure that given value is a constant requiring no additional allocations and expensive computations it should always be memoized. Especially since react compiler strips explicit memoization.

Additionally I'm not sure whether it makes sense to distinguish between primitives and objects. Javascript compares values only by references. {a: 1} === {a: 1} evaluates to false but 'very long string' === 'very long string' evaluates to true even though the second one can be much more expensive to compare due to potentially requiring many dereferences. But I think it is a topic for another discussion.

njarraud commented 1 month ago

I am also confused by how the memoization of derived values is made by the compiler. Different coding styles yield memoization or not which doesn't really make sense to me. I would believe that it shall be memoized either way. It also means that user need to understand the compiler to achieve the desired results. It kinda defeats the purpose of introducing it in the first place.

Like mentioned in other posts, getting the derived value could be an expensive calculation that we want to avoid recalculating if another prop has changed.

In these examples, passes is an array of objects.

Inline array manipulation -> not memoized

const columnCount = passes.map(({ fields }) => fields.length).reduce((acc, v) => acc + v, 2);

// Compiler code
const columnCount = passes.map(_temp).reduce(_temp2, 2);
function _temp(t0) {
  const {
    fields
  } = t0;
  return fields.length;
}
function _temp2(acc, v) {
  return acc + v;
}

External component function -> memoized

const columnCount = count(passes);

// Compiler code
let t3;
if ($[4] !== passes) {
  t3 = count(passes);
  $[4] = passes;
  $[5] = t3;
} else {
  t3 = $[5];
}
  const columnCount = t3;

For loop -> memoized

let columnCount = 2;
for (const pass of passes) {
  columnCount += pass.fields.length;
}

// Compiler code
let columnCount;
if ($[4] !== passes) {
  columnCount = 2;
  for (const pass of passes) {
    columnCount = columnCount + pass.fields.length;
    columnCount;
  }
  $[4] = passes;
  $[5] = columnCount;
} else {
  columnCount = $[5];
}