Seneral / Node_Editor_Framework

A flexible and modular Node Editor Framework for creating node based displays and editors in Unity
https://nodeeditor.seneral.dev
MIT License
2.01k stars 414 forks

Memory issue in recursive search in Node.cs #155

Closed: Physicalpariah closed this 5 years ago

Physicalpariah commented 6 years ago

Foreach loops allocate a tiny amount of garbage per iteration. When a node graph gets large and a node is connected at or near the end of the graph, the recursive search generates gigabytes of garbage for the GC to deal with.

FYI, we're using canvases with 500+ nodes.

Replacing these with plain for-loops deals with the issue.

For reference, the following methods are the culprits:

Node.isChildOf(), Node.isInLoop(), Node.allowsLoopRecursion()
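
For illustration, the change is along these lines (a minimal sketch with made-up names, not the framework's actual code; on Unity's old Mono compiler, foreach over a List<T> boxes the struct enumerator and allocates a little garbage on every call):

using System.Collections.Generic;

public static class LoopExample
{
    // Before: on old Unity/Mono, this foreach boxes the List<T> enumerator,
    // allocating on every call; that adds up fast when the recursive search
    // runs it thousands of times.
    public static int SumForeach(List<int> values)
    {
        int sum = 0;
        foreach (int v in values)
            sum += v;
        return sum;
    }

    // After: an index-based for-loop reads the same elements with zero
    // allocations per call.
    public static int SumFor(List<int> values)
    {
        int sum = 0;
        for (int i = 0; i < values.Count; i++)
            sum += values[i];
        return sum;
    }
}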

Hope this helps!

Seneral commented 6 years ago

Sorry for the late reply. Will take a look at this. Thanks for reporting!

Seneral commented 6 years ago

Implemented in branch Optimizations. There are still some other minor performance improvements possible though... Anything else you noticed with such a big canvas? Really interested; I haven't tested with one myself yet. If there's anything specific that takes forever, I'd like to know :)

Physicalpariah commented 6 years ago

When I get a chance I'll compile a full list, but here are a couple of noteworthy data points:

Performance is generally slow (under 3-4 FPS on a brand new, specced-out iMac); the profiler reports that the sheer quantity of text labels and connection drawing is bringing performance down.

At a guess, I'd imagine the reflection-based zooming isn't helping either.

Probably number one on the list of ways to work around the performance issues is to add either some kind of “camera culling” or sub-canvases.


Seneral commented 6 years ago

Zooming has a fixed cost (it doesn't scale with the number of nodes on the canvas) and has been optimized to the extent that it is about as costly as a delegate call, so no worries there.

Culling is definitely the way to go though :) Unity IMGUI is definitely not made for tens of thousands of elements; by default, there is only so much space on the screen anyway...
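
The idea is just a cheap rect test before any per-node GUI work; a rough sketch (names like visibleRect and DrawVisibleNodes are illustrative, and the node's rect and draw call are assumed, not the framework's actual API):

using System.Collections.Generic;
using UnityEngine;

public static class CullingExample
{
    // Skip layout, labels, and connection drawing for any node whose rect
    // does not intersect the visible canvas area; per-frame cost then scales
    // with what is on screen instead of with the total node count.
    public static void DrawVisibleNodes(List<Node> nodes, Rect visibleRect)
    {
        for (int i = 0; i < nodes.Count; i++)
        {
            if (!visibleRect.Overlaps(nodes[i].rect))
                continue; // culled: off-screen
            nodes[i].DrawNode();
        }
    }
}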

SubCanvas is actually pretty easy. It needs some default Input/Output nodes you can place in your canvas, plus a separate node that takes a canvas and provides parameters based on the Input/Output nodes inside it. There are some tricks to work out, but I think I can do that :) Previously I experimented with a subcanvas that you could actually EDIT inside a node, but I aborted that since it was useless and technically very hard to do.
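
Roughly the shape that would take, as a hypothetical sketch (none of these types or members are the framework's actual API):

using System.Collections.Generic;

// Hypothetical marker nodes placed inside the wrapped canvas.
public class InputMarkerNode { public float value; }
public class OutputMarkerNode { public float value; }

public class SubCanvasNode
{
    public List<InputMarkerNode> canvasInputs;   // Input markers found in the wrapped canvas
    public List<OutputMarkerNode> canvasOutputs; // Output markers found in the wrapped canvas

    // Feed this node's inputs into the wrapped canvas, run its normal
    // calculation pass, then expose the canvas' outputs as this node's outputs.
    public float[] Calculate(float[] inputs)
    {
        for (int i = 0; i < canvasInputs.Count; i++)
            canvasInputs[i].value = inputs[i];

        CalculateWrappedCanvas();

        float[] outputs = new float[canvasOutputs.Count];
        for (int i = 0; i < canvasOutputs.Count; i++)
            outputs[i] = canvasOutputs[i].value;
        return outputs;
    }

    // Stand-in for delegating to the framework's normal canvas calculator.
    private void CalculateWrappedCanvas() { }
}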

Seneral commented 6 years ago

Implemented culling in d483381; it should improve things a lot for you. Sorry for the super late response, btw. Connections are still calculated (just not drawn), including all bezier points, so it's still not optimal, but even on a semi-big canvas the difference is already clearly noticeable! :)

SlimeQ commented 5 years ago

@Seneral That Optimizations branch is a 404 now; does that mean the relevant changes were merged into develop?

I'm running into this problem after 32 connected nodes. I've got a drastically modified version of the plugin running which uses NGUI, so I'd expect lower performance, but this is insane. On connecting the 33rd node I get an infinite hang and a massive memory leak, which fills up my 32 GB of RAM, then moves on to my page file, and sometimes crashes my display drivers and kills Unity completely. I finally thought to attach a debugger, and I found it's hanging on Node.BeginRecursiveSearchLoop().

Suffice to say I really just need the relevant code changes since my version of the plugin is nearly a rewrite.

EDIT: Just realized that my particularly low limit (32 nodes) is because these particular nodes have 2 inputs/outputs that are chained together. Here's a pic: [image: doublelink]

(Yes, this looks different but the underlying logic is all from this repo)

As it turns out, this situation results in O(nodeCount ^ connectionsBetweenNodes) complexity; with 32 nodes and 2 connections between each pair, that's 32^2 = 1024 calls. With single-linked nodes in a chain I can make 1000+ nodes. Because isChildOf iterates over each input port of a node, it ends up running up the chain twice for every node in the chain. I don't have a fix yet, but I think it just needs to be smarter about which ports to follow.

Furthermore, I'm not totally sure the isChildOf check is even necessary when simply adding another node to the chain, as it's not really possible for both the input and the output to be connected at that point. If the purpose of this function is just to detect loops (which it is, as I understand it), I think we could bypass isChildOf completely if either all inputs or all outputs are empty.
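
That bypass would be a cheap pre-check at the top of isChildOf; an untested sketch using the same port fields as the code further down:

// Untested sketch of the proposed early-out: a node whose input ports have
// no connections at all cannot have the target node upstream, so the full
// recursive walk can be skipped (the same check could mirror the outputs).
bool anyInputConnected = false;
for (int i = 0; i < inputPorts.Count && !anyInputConnected; i++)
    anyInputConnected = inputPorts[i].connections.Count > 0;
if (!anyInputConnected)
    return false; // nothing upstream, no loop possible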

Seneral commented 5 years ago

Hi, I'll have to check it out tomorrow or at the weekend; it's late here. But I can already say it'll need a refactor, that's for sure :) I didn't realize it was that severe; I've never had such a large slowdown before. I wrote this code when I was still relatively early into programming, so I should be able to come up with a much more efficient algorithm now.

Btw, your editor looks really nice!

Seneral commented 5 years ago

OK, just reviewed the related code and did some (relatively) minor optimizations; I can't find a clear reason for the huge hang and memory leak for now. I worked on my tablet PC though, and will try a proper memory analysis later on my main PC. To clarify, the (Begin/End/Stop)RecursiveSearchLoop functions are just helpers that serve to NOT visit the same node twice. IsChildOf does the job it is designed to do and does not visit nodes needlessly: it follows the outputs in a tree-like manner to try to find the other node, but aborts once it reaches an already visited node. So I'd guess it is O(number of 'ancestor' nodes), not accounting for the fact that it needs to search the list to find out whether a node has already been visited, and that it aborts early when it finds a loop.

EDIT: Found the bug; it is indeed the recursive loop, but not in the way I thought. Theoretically it is not bad, but it is when it doesn't actually work: I forgot to make the start node static, which means the abort condition never triggered! So I guess that is the reason. That's what happens when you don't fully test. Sorry about that, will push all changes soon.
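
To illustrate why static matters here, a simplified sketch of the guard pattern (not the framework's exact code): every node in the recursion has to consult the same shared state, and with per-instance fields each node checks its own, always-empty copy, so the abort never fires.

// Simplified sketch inside the Node class, not the exact framework code.
// Shared across ALL nodes in one search, hence static.
[NonSerialized] private static List<Node> recursiveSearchSurpassed;
[NonSerialized] private static Node startRecursiveSearchNode;

protected bool BeginRecursiveSearchLoop()
{
    if (startRecursiveSearchNode == null)
    {   // First call of this search: remember where it started.
        startRecursiveSearchNode = this;
        recursiveSearchSurpassed = new List<Node>();
    }
    if (recursiveSearchSurpassed.Contains(this))
        return true; // Already visited: abort this branch.
    recursiveSearchSurpassed.Add(this);
    return false;
}

protected void EndRecursiveSearchLoop()
{
    if (startRecursiveSearchNode == this)
    {   // Unwound back to the start node: reset for the next search.
        startRecursiveSearchNode = null;
        recursiveSearchSurpassed = null;
    }
}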

SlimeQ commented 5 years ago

@Seneral Thanks for the response. I'm working full time on this app, so I'll be pounding away at it for the foreseeable future; I'll let you know if I come up with anything.

I actually don't think my issue is a memory leak. I thought so before because I was getting massive memory spikes when duplicating nodes, but after upgrading from Unity 2018.3 to 2018.4 that is not an issue at all; I really have to push to get the profiler past 50 MB allocated now. However, the hang still occurs, which makes me think it's an infinite loop/CPU issue.

Here's what I'm looking at on my end: [images: single_linked_graph, single_linked_log]

This is a very basic chain of 5 single-linked nodes and my log output when I connected the 5th node to the chain. As you can see, I get 4 calls to isChildOf, two calls to ClearCalculation, and two calls to ContinueCalculation. None of this is particularly surprising.

Here's the same thing with a double link: [images: double_linked_graph, double_linked_log]

The log output you see here is from when I connected the 2nd link on the 5th node. We get 15 calls to isChildOf, several of which are repeats. You can't see it in the image, but 0 out of 15 of these calls abort at the if (BeginRecursiveSearchLoop()) return false; line.

Now, here's the exact same situation with an algorithm I came up with: [image: my_algorithm_log]

Now we get only 4 calls to isChildOf. Here's the algorithm I'm using:

public bool isChildOf(Node otherNode)
{
    // Check for null/self first so the log below can safely dereference otherNode.
    if (otherNode == null || otherNode == this)
        return false;

    Debug.Log("["+Title+"::"+NodeIndex+"].isChildOf("+otherNode.Title+"::"+otherNode.NodeIndex+")");

    if (BeginRecursiveSearchLoop()) return false;

    Debug.Log("["+Title+"::"+NodeIndex+"].isChildOf("+otherNode.Title+"::"+otherNode.NodeIndex+") --> starting loop");

    // Collect the distinct parent nodes first; the HashSet deduplicates
    // nodes reachable through more than one port/connection.
    HashSet<Node> nodesToCheck = new HashSet<Node>();
    for (int i = 0; i < inputPorts.Count; i++)
    {
        ConnectionPort port = inputPorts[i];
        for (int t = 0; t < port.connections.Count; t++)
        {
            ConnectionPort conPort = port.connections[t];
            nodesToCheck.Add(conPort.body);
        }
    }

    Debug.Log("["+Title+"::"+NodeIndex+"].isChildOf("+otherNode.Title+"::"+otherNode.NodeIndex+") --> nodesToCheck.Count = "+nodesToCheck.Count);

    // Recurse once per distinct parent instead of once per connection.
    foreach (Node node in nodesToCheck)
    {
        if (node != startRecursiveSearchNode && (node == otherNode || node.isChildOf(otherNode)))
        {
            StopRecursiveSearchLoop();
            Debug.LogWarning("["+Title+"::"+NodeIndex+"].isChildOf("+otherNode.Title+"::"+otherNode.NodeIndex+") --> recursion found!");
            return true;
        }
    }

    EndRecursiveSearchLoop();
    return false;
}

The main difference here is that instead of calling isChildOf on every connected node directly, I'm first adding them to a HashSet and looping over the deduplicated set afterwards.

It's possible I'm missing something, but from reading through it I don't really understand how BeginRecursiveSearchLoop and EndRecursiveSearchLoop could possibly eliminate duplicates. I suspect they aren't quite working as intended.

I also suspect that we may get better performance from a non-recursive method, as the call stack gets pretty tall when dealing with a large graph. Although, looking into my logs more, I see that the call stack never gets more than 4 deep in my double-link test case; it just runs 15 different combinations of the 4 layers.
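
For what it's worth, an iterative version of the same search might look something like this (an untested sketch using the same port fields as the method above): an explicit queue plus a visited set replaces the call stack, and each node is expanded at most once no matter how many ports connect it.

public bool isChildOfIterative(Node otherNode)
{
    if (otherNode == null || otherNode == this)
        return false;

    // Breadth-first walk up the inputs with an explicit work queue.
    HashSet<Node> visited = new HashSet<Node> { this };
    Queue<Node> queue = new Queue<Node>();
    queue.Enqueue(this);

    while (queue.Count > 0)
    {
        Node current = queue.Dequeue();
        for (int i = 0; i < current.inputPorts.Count; i++)
        {
            ConnectionPort port = current.inputPorts[i];
            for (int t = 0; t < port.connections.Count; t++)
            {
                Node parent = port.connections[t].body;
                if (parent == otherNode)
                    return true; // found it upstream: connecting would close a loop
                if (visited.Add(parent)) // Add returns false for already-seen nodes
                    queue.Enqueue(parent);
            }
        }
    }
    return false;
}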

Perhaps this is deserving of its own thread...

EDIT: @Seneral I just saw your edit about the static field. I was wondering about that; it makes sense. Are both fields supposed to be static, or just the start node?

        [NonSerialized] private List<Node> recursiveSearchSurpassed;
        [NonSerialized] private Node startRecursiveSearchNode; // Temporary start node for recursive searches

I'm not completely sure why this isn't an issue with the other recursive searches if this logic is broken.

SlimeQ commented 5 years ago

@Seneral Making startRecursiveSearchNode static fixed the problem completely. I now have 256 double-linked nodes and I'm barely slowing down. Thanks a lot!

SlimeQ commented 5 years ago

@Seneral I spoke a little too soon. Now I'm getting calculation traversal issues where only the first linked node gets recalculated on change: I change node 0, node 1 receives the change and has correct outputs, but node 2 does not get recalculated.

Almost there...

EDIT: I'm just starting to look into it, so I'm not 100% sure, but I think it probably has to do with ClearCalculation not working properly after the static field change.

SlimeQ commented 5 years ago

@Seneral Just confirmed both fields need to be static. Now my calculation is working properly.

        [NonSerialized] private static List<Node> recursiveSearchSurpassed;
        [NonSerialized] private static Node startRecursiveSearchNode; // Temporary start node for recursive searches

Seneral commented 5 years ago

Yes, sorry, I'm in a lecture. Just pushing the whole thing with some other optimizations; you can cherry-pick if you want :)

Seneral commented 5 years ago

Sorry, I only got to commit it now. The static keywords were the single most important change, but for some other smaller improvements (especially related to clipping the bezier curves), check out 73bac80. Again, sorry for the inconvenience caused by such a stupid error on my part. Glad it got resolved!