ncthbrt commented 6 months ago

Summary

This RFC explores the potential of adding additional functionality to XYFlow to allow for more complex graphs while still remaining performant for smaller graphs and the general case.

Motivation

For larger more complex graphs, it is possible to achieve improved performance when zoomed in by only rendering the nodes that are visible within the viewport. Currently this is achieved by performing a naïve axis aligned bounding box check on each node & edge. This still requires that all node and edge data is available to the renderer and does come with a performance overhead, especially for smaller graphs and when zoomed out.

Similarly, the multi-selection rect uses similar logic to find nodes that intersect with the selection rect. The multi-selection rect also does not support direct selection of edges.

For advanced cases with a client/server architecture, it would be desirable to be able to lazily fetch data for a given region of the viewport.

For client architectures, it would be desirable to be able to speed up viewport intersection and selection rect performance, and to potentially offer the ability for direct box selection of edges.

Implementation

There is a class of data-structures known as BVH (Bounding Volume Hierarchies) that allow for fast, coarse grained intersection testing. BVHs are widely used in systems such as mapping software and in game development for rendering (including ray-tracing) and physics. These work by placing an axis aligned bounding box around each entity within the system and constructing a tree that contains these boxes in a manner that allows for efficient lookup. Insertion and removal has a time complexity of approximately $O(log_k(n))$, depending upon the data structure used. For larger graphs, this may represent a significant improvement over a naïve approach.

However this is not a given and would require profiling for specific use cases. Additionally baking in a particular structure and its inherent assumptions would negatively impact bundle size. Therefore the API offers a means for users to provide two functions, potentially asynchronous ones, that fetch the nodes and edges respectively within a particular bounding box. These are an overload of the already existing nodes and edges props. When the viewport is moved, or the selection rect is used, these functions are called. Additionally, the API allows the means of invalidating the cached value of a particular node or group of nodes, or even a region within the viewport. This has precedent in that there is already a useUpdateNodeInternals() hook.

This offers a third way of interacting with the XYFlow libs, which represents a middle-ground between controlled and uncontrolled flows. In this third way, the data store remains the authoritative source of graph information, while removing the expensive data integration step and allowing users to use intersection tests of their choice.

There is an additional option to optimistically update nodes when moved or connected which helps maintain interactivity in the face of slow-downs.

XYFlow provides adapters for particular BVH implementations and has created a pro example of a client/server architecture that uses these new APIs to lazily load data as needed. An additional pro example shows how to implement collision resolution using these APIs that reuses the BVH provided by the BVH package.

May also want to add option to configure padding of viewport, so that panning and zooming don't necessarily always trigger a refetch.

Drawbacks

Adding a third way to interact with XYFlow libs may introduce additional complexity for users
Adding a third way to interact with XYFlow libs may introduce an additional maintenance burden.
Maintaining internal state becomes more complex as cache invalidation needs to be handled.
Optimistic updates may introduce additional complexity in internals.

Alternatives

Intersection Function

This alternative narrows the scope of the feature to simply provide an function that returns the set of node/edge ids that intersect with a given bounding rect. This would be a lot simpler to implement, however it has the downside that it may not provide enough of a performance boost for large graphs to make the implementation lift worth it.

Allow nodes and edges to come and go

Implementors could provide a transient set of nodes and edges based on viewport position. To implement this, it would need to be validated that nodes and edges can arbitrarily be removed or added to the set of nodes and edges without catastrophic performance hitches. Additionally this would not solve the problem of maintaining user interactivity in cases where updates to the store are slow or asynchronous. To solve that in user land, it would require a similar effort to implementing it once in the library.

Bake in BVH support into the library

Has similar issues to the intersection function alternative.

Unresolved Questions

Open Question: Ordering may be tricky with optimistic updates and cache handling and invalidation

Open Question: How to document and communicate these changes effectively to users

Open Question: How to effectively support this across all packages

Open Question: MiniMap support?

Open Question: Selection-rect support for edges. Implement parser for edge renderer?

peterkogo commented 6 months ago

Thanks so much for the RFC! I am just going to dump a couple of thoughts here.

Performance metrics to check out for Spatial Querying implementation: initial build time, single update, bulk update, point query, rect query, memory
I know BVH trees from 3D game engines, where you have a lot of overlapping bounding boxes and the goal is mostly ray intersections. It might be interesting to see how different spatial data structures behave in this regard as flow graphs are usually a little further spaced out. But maybe this becomes irrelevant when edges are also taken into consideration... Do you have a good resource/information/experience on how in our case BVH might be the way to go?
In my head it would work like this:
- maintain an AABB in internal node
- build up a BVH tree in adoptUserNodes or updateNodeInternals
- query the BVH when viewport moves to render visible nodes (make mounting/unmounting of nodes faster to prevent lag)
- (maybe) only update the tree after node dragging has finished
- expose interface for querying the BVH tree
If I look at the performance of javascript BVH implementations I don't think an async function is really needed and it might degrade performance as async does not come for free and complicates computation ordering. Plus some of that work has to be done anyway when updating node dimensions/positions.

Is there a really a use case for exposing anything more, even have a cache?
I think if you'd want to lazy load nodes you'd have to load all nodes and edges (maybe not all edges) in advance with no information attached to them. And then implement a placeholder node that fetches the information when it gets into view/big enough. You could also create a way to batch the calls from these nodes. But I feel like this can be implemented implicitly and does not require more BVH API surface area.
This would also pave the way for canvas based edges

moklick commented 6 months ago

Hey @ncthbrt

thanks for this detailed RFC!

Even if React Flow wasn't built for huge graphs, I really like the idea of having a better support and a better performance in general. In v12 we already improved the performance for dragging nodes in bigger graphs, but the onlyRenderVisibleElements implementation wasn't touched.

It was always important to us, that React Flow is flexible and adjustable. Maybe it's an option here to expose some functionality, so that users can implement their own intersection algorithms (just yesterday someone created an issue #4272 that goes into that direction). For now we are using getNodesInside for the selection rectangle and onlyRenderVisibleElements. Would it be possible for you to use a BVH if you could overwrite that function somehow? It would be nice if we could find a better implementation than the current naive approach and give users the option to overwrite that functionality. What do you think about that? I would like to start with a small change. In my view it would make sense to concentrate on the virtualization topic first.

Thanks @peterkogo, what do you think about this? Would it be possible to expose some functions to be able to implement a more performant virtualization strategy?

ncthbrt commented 6 months ago

@moklick It would be possible to do that I think. I can work on a POC to that end?

Thanks for the comments @peterkogo

moklick commented 6 months ago

That would be great, but let's wait for @peterkogo feedback here. I know he is also interested in the topic and maybe he already started something!

Would you like to make a POC for a better built-in virtualization or expose functionality or both?

To answer you open questions:

Ordering may be tricky with optimistic updates and cache handling and invalidation

Not needed for now when we focus on virtualization first

How to document and communicate these changes effectively to users

We should try not to introduce breaking changes, but only more options. We could create a page under /learn where we explain how to use a quadtree or something like that with the new API.

How to effectively support this across all packages

Let's focus on React first.

MiniMap` support?

I think it would make sense to implement an alternative canvas based minimap at some point but I would like to postpone this task.

Selection-rect support for edges. Implement parser for edge renderer?

In my view a naive approach via nodes is enough (if one connected node is visible, an edge is visible too)

ncthbrt commented 6 months ago

Happy to do that too!

peterkogo commented 6 months ago

@ncthbrt I talked to @moklick about how to move forward on this.

You submitted this RFC at an undoubtably tumultuous time here at xyflow :) Just to give a bit of context, we have changed quite a lot about the inner workings of React Flow for the next release and are in the middle of a rewrite of Svelte Flow. There is one (hopefully) very last thing we would like to get into v12, which would influence (and simplify) this RFC immensely - namely how we handle node origins.

Just as a rough timeframe, I will create a PR with the node origin changes in the next 2 weeks, so we can move forward on this afterwards. I'll keep you posted!

ncthbrt commented 6 months ago

Thanks. That sounds promising!

On Mon, 13 May 2024 at 17:06, peterkogo @.***> wrote:

@ncthbrt https://github.com/ncthbrt I talked to @moklick https://github.com/moklick about how to move forward on this.

You submitted this RFC at an undoubtably tumultuous time here at xyflow :) Just to give a bit of context, we have changed quite a lot about the inner workings of Rect Flow for the next release and are in the middle of a rewrite of Svelte Flow. There is one (hopefully) very last thing we would like to get into v12, which would influence (and simplify) this RFC immensely - namely how we handle node origins.

Just as a rough timeframe, I will create a PR with the node origin changes in the next 2 weeks, so we can move forward on this afterwards. I'll keep you posted!

— Reply to this email directly, view it on GitHub https://github.com/xyflow/xyflow/issues/4239#issuecomment-2107908336, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACCJEWI72DG2LBQWHMAXKC3ZCDJGVAVCNFSM6AAAAABHDN3TYCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMBXHEYDQMZTGY . You are receiving this because you were mentioned.Message ID: @.***>

peterkogo commented 5 months ago

I made a shallow deep dive into acceleration structures for spatial queries so here are my 2 cents.

Due to the declarative nature of React Flow (pass in edges, get out fancy graphs), we pretty much have to expect nodes and edges to change 100% in between updates. Whatever algorithm we chose in the end, fully rebuilding the structure should be fast. To further extend on this point, we can substantially simplify the implementation if we manage to do this in a stateless fashion and without additional heuristics (e.g when do we just update vs fully rebuild).

I did some preliminary testing on implementing Quadtrees, a BVH and using the libraries rbush and flatbush.

Quadtrees

Pretty fast to construct, but you end up with a lot of inefficient trees for various situations
does not work so good if the full extent you'd like to subdivide is not known from the start (which might happen if we don't want to iterate over all nodes before adding)
managing a lot of empty space as well as managing node density, is both influenced by tree depth
deciding on what depth and how many nodes per leaf to choose can vary depending on type of flow graph

BVH

surprisingly complex and a lot of work involved in optimizing the partition algorithm
might take a little longer to construct
you end up with a pretty good tree very easily

rbush & flatbush

bulk insertion is really fast
almost optimal tree
it can deal with a lot of weird spatial distributions quite well
it's a tested library and just weighs around 3kb
flatbush is only really faster for tens of thousands of nodes (when fully rebuilding)

So thats that. I am very inclined to just use rbush - it should be noted however that its currently in a broken state in terms of build dependencies (already opened a pull request) & es module support (forking might be the easiest route, not sure if library will be updated...)

I will release a small benchmark repo for these things soon, however have to deal with js module issues 🥲 first...

If anyone has some libraries I am unaware of or some comments on my preliminary findings feel invited to join the conversation!

fredericrous commented 5 months ago

This RFC initiative looks exciting. I have a use case that I wonder could complicate the algorithm:

How about nodes that grow?

For context, on my project I have a diagram that represents the sitemap of a website. I drag and drop elements onto the nodes, that make the nodes grow and I then trigger a relayout with d3-flextree and d3-hierarchy

peterkogo commented 5 months ago

@fredericrous this RFC would support dynamic changes to node size & position. Should be no problem.

ncthbrt commented 5 months ago

Nice work on the POC @peterkogo.

Still want to experiment with improving performance beyond culling nodes but not in a huge rush to do so right now, so this could be a great intermediate win.

moklick commented 5 months ago

Thanks for the deep dive and the detailed explanation @peterkogo!

We could fork rbush and publish it under @xyflow/rbush. If your PR mourner/rbush#138 gets merged, we can replace the dependency with the original rbush package again.

Do I get it right, that the rough process would be:

build a new tree when nodes come in (in adoptUserNodes for example)
use tree for culling (onlyRenderVisibleElements)
use tree for node selections

Questions:

do we want to export helpers to query it?

peterkogo commented 5 months ago

That sounds about right! We can expose functions for finding intersections for a specific node, finding intersections for an arbitrary rectangle and finding all colliding nodes.

ncthbrt commented 5 months ago

Would edges also be included in this?

moklick commented 5 months ago

Currently we create a box of the connected nodes. We could do the same here and create a tree for the edges too. wdyt @peterkogo

peterkogo commented 5 months ago

@ncthbrt what other ideas do you have?

Still want to experiment with improving performance beyond culling nodes [...]

Some ideas from me: add a inView prop to custom nodes & edges, so it's easier to lazy load things. Maybe have some kind of padding to load things that are slightly out of view as well.

Would edges also be included in this?

I would include edges sooner or later because implementing hover effects on canvas based edges will require this eventually. ~~Though for culling only, having a fast way to determine if either the source or target node is in view would already come a long way.~~ edit: this does not work. Just use a bounding box for edges as well.

Edit: edges will probably have a separate tree for various reasons

ncthbrt commented 5 months ago

The biggest bottleneck that I've measured for manipulating large, zoomed out graphs has been the store update logic (I'm using yrs, a Rust port of yjs) and updating a thousand or more nodes in the store within a transaction can be quite costly. My hypothesis is that a tiered/hierarchical approach to updates would be appropriate for slower stores. In this hierarchy, responsiveness would be maintained by optimistically updating properties in an internal store until an appropriate idle point is reached at which point the updates are written back. This would also reduce the amount of time spent on integrating changes as updates would be batched.

Some ideas from me: add a inView prop to custom nodes & edges, so it's easier to lazy load things. Maybe have some kind of padding to load things that are slightly out of view as well.

That is a good idea as well!

My idea is tangent to this issue however

peterkogo commented 4 months ago

I have some updates from my investigation I'd like to share.

On accelerating spatial queries

First and foremost, RBush and Flatbush turned out to be delightful libraries. Overall, Flatbush performed slightly better than RBush in terms of memory usage & garbage collection delays — even though it required am additional mapping of node IDs to Flatbush indices. Both libraries needed to be extended by a function determining full intersections.

However, measuring the performance impact of building these trees (~1ms with 2500 nodes on M3 Apple Silicon) I realized that we started our optimization journey on the wrong end, really. With our current naive method, determining which of the 2500 nodes is in view takes roughly 0.25ms (I'm re-discovering every time how astonishingly fast computers can iterate over arrays), which is a fraction of the 80ms it takes to re-render all the elements.

The very same bottleneck applies to the lag observed with onlyRenderVisible enabled or while drawing selection boxes: the biggest culprit is mounting/unmounting/updating React components. What a bummer…

No need to fret

Well, what do we make of this?

For one, the virtualization part of this RFC would be much more impactful, than focusing on faster intersection algorithms. onlyRenderVisible for instance, provides quite a boost in performance when interacting with big flows with partially visible nodes & edges. The only downside is, that moving the viewport tends to be laggy as nodes & edges need to be constantly mounted and unmounted while coming into view. Having a way to render placeholder nodes (possibly canvas based) while moving the viewport and only rendering the actual nodes in their full glory when the interaction has stopped, would make it a far more recommendable option. One limitation of this approach, however, is that nodes and handle positions need predetermined dimensions and positions for this to work. As a small consolation, you would receive SSR for free.

Further, having the ability to render the minimap and edges on a canvas, would pose another possibility to bypass reacts lifecycle overhead. Tradeoffs of canvas-based approaches apply.

Conclusion

This was quite valuable research, thanks for providing the impulse. We will most definitely use Flatbush in the future for supporting pointer interactions on canvas elements, and we will also look into possibilities for delaying certain state updates during interactions. This helped us quite a lot in identifying more fruitful optimization paths.

peterkogo commented 4 months ago

My message sounds unintentionally very conclusive. We will continue working on this.

ncthbrt commented 4 months ago

I agree with this. And mirrors my performance profiling

I have some updates from my investigation I'd like to share.

On accelerating spatial queries

First and foremost, RBush and Flatbush turned out to be delightful libraries. Overall, Flatbush performed slightly better than RBush in terms of memory usage & garbage collection delays — even though it required am additional mapping of node IDs to Flatbush indices. Both libraries needed to be extended by a function determining full intersections.

However, measuring the performance impact of building these trees (~1ms with 2500 nodes on M3 Apple Silicon) I realized that we started our optimization journey on the wrong end, really. With our current naive method, determining which of the 2500 nodes is in view takes roughly 0.25ms (I'm re-discovering every time how astonishingly fast computers can iterate over arrays), which is a fraction of the 80ms it takes to re-render all the elements.

The very same bottleneck applies to the lag observed with onlyRenderVisible enabled or while drawing selection boxes: the biggest culprit is mounting/unmounting/updating React components. What a bummer…

No need to fret

Well, what do we make of this?

For one, the virtualization part of this RFC would be much more impactful, than focusing on faster intersection algorithms. onlyRenderVisible for instance, provides quite a boost in performance when interacting with big flows with partially visible nodes & edges. The only downside is, that moving the viewport tends to be laggy as nodes & edges need to be constantly mounted and unmounted while coming into view. Having a way to render placeholder nodes (possibly canvas based) while moving the viewport and only rendering the actual nodes in their full glory when the interaction has stopped, would make it a far more recommendable option. One limitation of this approach, however, is that nodes and handle positions need predetermined dimensions and positions for this to work. As a small consolation, you would receive SSR for free.

Further, having the ability to render the minimap and edges on a canvas, would pose another possibility to bypass reacts lifecycle overhead. Tradeoffs of canvas-based approaches apply.

Conclusion

This was quite valuable research, thanks for providing the impulse. We will most definitely use Flatbush in the future for supporting pointer interactions on canvas elements, and we will also look into possibilities for delaying certain state updates during interactions. This helped us quite a lot in identifying more fruitful optimization paths.

xyflow / xyflow

RFC - Spatial Queries & Virtualisation in XYFlow #4239

Summary

Motivation

Implementation

Drawbacks

Alternatives

Intersection Function

Allow nodes and edges to come and go

Bake in BVH support into the library

Unresolved Questions

Quadtrees

BVH

rbush & flatbush

On accelerating spatial queries

No need to fret

Conclusion

On accelerating spatial queries

No need to fret

Conclusion