bvaughn / react-virtualized

React components for efficiently rendering large lists and tabular data
http://bvaughn.github.io/react-virtualized/
MIT License
26.38k stars 3.06k forks source link

CellMeasurer optimizations #341

Closed kof closed 8 years ago

kof commented 8 years ago

Hey, I have seen you did a great job already. I am now about to drop my own custom implementation of the measurer and port everything to the latest react-virtualized.

A few things I noticed are not good enough for my use case:

Row height caching.

Current row height caching is done using the index. However in my case, it is not possible to cache by index, I need a different caching mechanism based on custom id's.

Row element caching.

Currently you create a row element twice, once for measuring, once for representation. I can introduce a caching in the user code, so element creation happens only once, however you might want to solve this in the library.

bvaughn commented 8 years ago

Hey @kof,

Not sure how I could accommodate cache-by-id, since react-virtualized doesn't know anything about the underlying data (only that there's a collection of N items). Do you have a proposal in mind? Let's chat about it. :)

I understand the appeal of managing row-element caching in the base library, but I'm not sure how it would be possible given the HOC approach of CellMeasurer. Again I would be open to suggestions here, but at least caching in user land is an option for the time being. :)

kof commented 8 years ago

Row height caching in the user land is bit dirty right now, because there is no param for turning off the index based caching. One needs to call reset methods before every getRowHeight call.

kof commented 8 years ago

Element and size caching can be basically done in one caching logic and it is not very hard to do in user land anyways. So probably giving away that responsibility is not bad.

kof commented 8 years ago

One more option is to provide a separate HOC for caching, make it based on indexes by default, make this HOC cache elements and heights.

User can use such a HOC then as an example to implement any custom caching logic and create own caching HOC's.

bvaughn commented 8 years ago

Row height caching in the user land is bit dirty right now, because there is no param for turning off the index based caching. One needs to call reset methods before every getRowHeight call.

This hints at a problem though. You're implying that your underlying data (the ids) are not tied to index- and that the height varies per id. These 2 things would cause problems with react-virtualized since cell size (for a given index) is important.

That being said, react-virtualized is quick to recalculate its layout. The slow part is the rendering of a cell to measure its size. So if you were caching rendered cells (based on id) in user land, and you had a way to map index-to-id, then I'd think you should be able to tell react-virtualized to clear its cache without a significant performance hit (since the next set of measurements would be taken from your user land cache).

bvaughn commented 8 years ago

Another reason I shy away from caching the result of render operations is that it's unclear when I should purge the cache, which can lead to memory bloat. I think that's something that varies per-application, so it's another reason to leave it in user land.

kof commented 8 years ago

Yep, I agree, it may not fit all use cases and may need different caching strategies. So best thing would be to leave cache out of the library completely or make caching a separate optional HOC.

bvaughn commented 8 years ago

I like HOCs (obviously 😄 ) so I like that general idea.

Okay, so... I think we're settling on the fact that rendered-cell caching stays in user land (possibly with a HOC to help simplify it). But let's talk more about this row-height issue.

Can you explain to me why your data (ids) are shifting around within your collection (indexes)?

Also, if you were to be managing rendered-content-caching in user-land, then would that alleviate your performance concerns about CellMeasurer rendering you once for height and Grid (or whatever) rendering "again" for display? (Since the second render would be cached?)

You would still need to inform both that the underlying data (and thus the related cell sizes) may have changed though. (Grid has a method recomputeGridSize for this and CellMeasurer passes similar functions resetMeasurementForColumn and resetMeasurementForRow to its child function.

kof commented 8 years ago

Can you explain to me why your data (ids) are shifting around within your collection (indexes)?

My application is a chat with last-to-first message scrolling (reverse order). When I scroll back, index 0 becomes a different message than it was before.

Also, if you were to be managing rendered-content-caching in user-land, then would that alleviate your performance concerns about CellMeasurer rendering you once for height and Grid (or whatever) rendering "again" for display? (Since the second render would be cached?)

CellMeasurer would still mount the element to the dom and detect height/width on it using DOM properties. This is still dangerous because we never know when they might hit our performance, the only way is to ensure to touch dom as rare as possible.

You would still need to inform both that the underlying data (and thus the related cell sizes) may have changed though. (Grid has a method recomputeGridSize for this and CellMeasurer passes similar functions resetMeasurementForColumn and resetMeasurementForRow to its child function.

yep I am currently resetting both

      this.cellMeasurer.resetMeasurements()
      this.virtualScroll.recomputeRowHeights()
bvaughn commented 8 years ago

Ah, gotcha. Reverse order scrolling. Makes sense. :D

Roger that about the mounting/layout hit. Okay. So let's talk about the interface to this cache. Currently, CellMeasurer just keeps an object that maps column and row indices to cached sizes. I could imagine this being moved into a caching util that could be overridden via a prop. Maybe something with an interface like...

interface CellMeasurerCache {
  clearAllColumnWidths(): void;
  clearAllRowHeights(): void;
  clearColumnWidth(index: Number): void;
  clearRowHeight(index: Number): void;
  getColumnWidth(index: Number): Number;
  getRowHeight(index: Number): Number;
  hasColumnWidth(index: Number): Boolean;
  hasRowHeight(index: Number): Boolean;
  setColumnWidth(index: Number, value: Number): void;
  setRowHeight(index: Number, value: Number): void;
}

...then you could manage your own mapping of index-to-id in user land. Thoughts?

This interface would support a backwards-compatible release which is important to me. :)

I think this change would make this.cellMeasurer.resetMeasurements() and this.virtualScroll.recomputeRowHeights() very fast.

kof commented 8 years ago

Do you think we could have a caching interface for both, elements and height?

kof commented 8 years ago

For elements it would need to be a component, because it will need to do shallowEqual of data. However maybe there is a away to do it as a separate caching class too.

kof commented 8 years ago

In your interface example you have a typo with types.

bvaughn commented 8 years ago

Do you think we could have a caching interface for both, elements and height?

I thought we agreed to leave caching elements in user land (for the time being)? Since cache invalidation / expiration is kind of tricky to implement in the lib.

bvaughn commented 8 years ago

Although I assume your CellMeasurerCache implementation could also cache rendered cells.

kof commented 8 years ago

I thought we agreed to leave caching elements in user land (for the time being)? Since cache invalidation / expiration is kind of tricky to implement in the lib.

I thought we were talking about both, heights caching and elements caching are equally tricky. So when we are able to create a good interface for both and have an optional class handling it - why not.

kof commented 8 years ago

Although I assume your CellMeasurerCache implementation could also cache rendered cells.

Yes it does it already. I am just trying to simplify the whole thing if it is possible, because the logic for caching heights and elements seems to be mostly the same.

kof commented 8 years ago

What if we reduce the caching interface to something minimal and universal for any data? Kinda universal index based caching interface.

bvaughn commented 8 years ago

I thought we agreed to leave caching elements in user land (for the time being)? Since cache invalidation / expiration is kind of tricky to implement in the lib.

I thought we were talking about both, heights caching and elements caching are equally tricky. So when we are able to create a good interface for both and have an optional class handling it - why not.

They aren't equally tricky, actually. Caching numbers is a pretty lightweight thing. Caching rendered React elements would cause memory usage to climb much faster. (Basically the "expiration" problem.)

Although I assume your CellMeasurerCache implementation could also cache rendered cells.

Yes it does it already. I am just trying to simplify the whole thing if it is possible, because the logic for caching heights and elements seems to be mostly the same.

I think this is a reasonable thing to leave in user land. No reason you couldn't cache both with the same object, but I don't think I want to make it an official part of the interface. I don't think it's complex enough to need its own HOC anyway. It would just be a getter function and a map of index (or id, or whatever) to rendered object, right?

What if we reduce the caching interface to something minimal and universal for any data?

I think that would make it a bit harder to work with. For example, if it's a generic caching util, it would have an interface more like this:

interface Cache {
  clearAll(): void;
  clear(id: any): void;
  get(id: any): Number;
  has(id: any): Boolean;
  set(id: any, value: Number): void;
}

But at this point you wouldn't be able to use the same object to cache columns and rows. You'd have to have 2 (and a 3rd to cache rendered cells). Seems like this would make things more complicated on end users.

kof commented 8 years ago

They aren't equally tricky, actually. Caching numbers is a pretty lightweight thing. Caching rendered React elements would cause memory usage to climb much faster. (Basically the "expiration" problem.)

This is right. However if it is all replaceable, shouldn't be a big deal once it is a problem. If your concern is to cache elements by default in the lib - you could make it optional and false by default.

But at this point you wouldn't be able to use the same object to cache columns and rows.

Yep, you would need a cache instance per use case. I am not sure this is less nicer. The interface itself is nicer, because simpler names and less methods. When user doesn't want to have a custom caching implementation, library creates the cache object automatically, user doesn't need to do anything.

In the rare case where user wants to have a custom caching logic like me, he created an own caching class, now he will need to create 2-3 instances of it and pass them to proper component. It looks actually very nice to me, based on imagination). Also user could pass the custom caching class itself to the component and let component instantiate it.

bvaughn commented 8 years ago

I still prefer the simplicity of a single cache (mentioned here) for cell sizes. I think it fits better within the rest of the Api.

I think it's reasonable to leave rendered-cell-contents entirely in user land. I don't see a need for an explicit interface for this. There's only 1 connection point (the cellRenderer method itself, a getter). Given that the default behavior is (and would continue to be) no-caching* then I don't think there would be any additional methods to add. (Put another way, when would Grid ever call clear or clearAll for the rendered cell cache? When would it ever call has or set? It wouldn't.)

  1. Grid does temporarily cache cells when a user is scrolling. This cache is immediately flushed once scrolling stops though. This helps when performance is most needed without introducing any tricky expiration problems. (I purge the cache once the scroll is complete each time, so I don't have to worry about a growing memory footprint.)
kof commented 8 years ago

All right then, maybe my suggestions are over engineered, maybe it is enough to just make the heights caching optional in CellMeasurer.

bvaughn commented 8 years ago

Didn't mean to suggest they were over-engineered. I think you're focused on an application usage and I'm trying to think generically of library usage. We're just looking at it from different angles. :)

I am okay with adding a cache interface for cell sizes. Hopefully that will enable performance wins for your user case. :)

kof commented 8 years ago

Just found the real bottleneck, I was measuring performance of the entire _measureCell call, which is pretty bad in my case. I was thinking it is the element creation step, but it turns out it is the renderSubtreeIntoContainer call.

renderRow: 0.292ms renderSubtree: 17.716ms

bvaughn commented 8 years ago

Good to know. Caching heights will help a lot with this then, even if you didn't cache rendered rows. :)

kof commented 8 years ago

My problem is even worth, it is too slow on initial render, it spends 20-50ms on every message, after rendering 30, the slowness hurts the ux

bvaughn commented 8 years ago

Ah, worse? Hm. Any other ideas?

kof commented 8 years ago

Trying to find a way to not just reuse the element, but to reuse the whole mounted thing, so that there is almost no overhead for cell measuring.

bvaughn commented 8 years ago

Hm... this cache interface could actually provide an escape hatch to break out of the React ecosystem entirely... if your content is just text, you could use a single hidden/offscreen container to render into and measure...

bvaughn commented 8 years ago

Essentially you could just return true every time for hasColumnWidth and hasRowHeight and when asked for them, do a JIT render to this single, re-usable, offscreen container.

This would bypass most of the CellMeasurer really but... maybe it would help?

bvaughn commented 8 years ago

This sounds pretty hacky though. :/ Just tossing out some ideas.

kof commented 8 years ago

Wouldn't help in my case, because the row is a complex dom structure, rendering it without react would be tricky.

bvaughn commented 8 years ago

Ah, gotcha.

kof commented 8 years ago

One way is to get the dom subtree after react did render it and insert that subtree into the new place.

bvaughn commented 8 years ago

Is it the call to unstable_renderSubtreeIntoContainer and the subsequent unmountComponentAtNode that's slow? (Could we improve things by not mounting-and-unmounting between measurements?

kof commented 8 years ago

unstable_renderSubtreeIntoContainer itself is slow.

kof commented 8 years ago

Some bench samples:

renderRow is the cellRenderer function call. renderSubtree is ReactDOM.unstable_renderSubtreeIntoContainer cal.

renderRow: 2.481ms CellMeasurer.js:237 renderSubtree: 43.706ms CellMeasurer.js:246 unmountComponentAtNode: 1.465ms

CellMeasurer.js:224 renderRow: 0.618ms CellMeasurer.js:237 renderSubtree: 16.514ms CellMeasurer.js:246 unmountComponentAtNode: 3.390ms

CellMeasurer.js:224 renderRow: 0.504ms CellMeasurer.js:237 renderSubtree: 21.071ms CellMeasurer.js:246 unmountComponentAtNode: 2.768ms

kof commented 8 years ago

Just tried to remove unmountComponentAtNode just to see whats happens. Bench numbers became just fine, but measurements are messed up, digging into it.

CellMeasurer.js:224 renderRow: 0.231ms CellMeasurer.js:237 renderSubtree: 0.061ms

CellMeasurer.js:224 renderRow: 0.232ms CellMeasurer.js:237 renderSubtree: 0.064ms

CellMeasurer.js:224 renderRow: 0.234ms CellMeasurer.js:237 renderSubtree: 0.059ms

CellMeasurer.js:224 renderRow: 0.196ms CellMeasurer.js:237 renderSubtree: 0.062ms

bvaughn commented 8 years ago

Great! I'm at work at the moment so I'm not able to dig in much. Keep me posted! 👍

kof commented 8 years ago

The slowest part is ReactCompositeComponent.mountComponent. I found one specific component in my case which is very slow.

bvaughn commented 8 years ago

Still keep me posted, b'c I'm interested in any additional perf gains we can get with this component.

But I think the caching interface we described above would still help with your case (although it may not be enough by itself to get perf where you want it to be). But it still seems like an improvement. So for the time being, I'll move forward with that change (once I get the chance that is).

kof commented 8 years ago

Is there any reason for not using shallowEqual in CellMeasurer?

bvaughn commented 8 years ago

Oh! Looks like an oversight. I can fix that while I'm in here too.

bvaughn commented 8 years ago

Added in 0678d20. Going to try to finish #342 and then push both out as version 7.19.0.

bvaughn commented 8 years ago

7.19.0 just released. Enjoy!

kof commented 8 years ago

using the new cache interface now, works as expected.

kof commented 8 years ago

Using a FIFO cache for elements and heights ... not sure if it is the best but works for me so far.

bvaughn commented 8 years ago

Sweet! Great to hear. :)

Reversed list for chat interfaces is a pretty common request. So I'm glad to hear it's viable to do in a performant way. If you feel like sharing back any of your setup (via a recipe or something) let me know and I could add it to the docs to point others at. :)

kof commented 8 years ago

It has a bunch of custom stuff, need to think how to make it available.

bvaughn commented 8 years ago

Certainly. If you find the time though, and are willing, I'd be happy to help clean it up or anything required. I bet what you're doing now is probably superior to the little chat example I have in playground. It's not something I've ever needed to do in an actual app (yet) so I haven't spent much time on it. :)