Open rcorre opened 2 years ago
Having a look.
Notes:
- `instances.tres` contains images that were embedded directly inside the file as text. This blows up the size of the resource (14Mb) and makes it slower to load. Maybe try not creating new `ImageTexture`s from the inspector? It also contains mesh data. Or could it be that you tried to "make unique"? I would also advise against that on a mesh.
- I could not find any `NoiseTexture` in the entire project. I guess it is the images you created? Or `FastNoiseLite`?

It's hard to tell at first what is going on, but it seems Godot handles it very badly when many rigidbody transforms are set at once in the editor (they are created for collisions). It tries to update the configuration warnings of ALL the bodies, which in turn emits signals that go to the message queue, and it becomes a callback-fest of deferred calls piling into the message queue, ultimately leading to... nothing, because these nodes are not editable in the scene tree anyway...
There are many bodies because when you remove the noise, the instancer no longer filters density. It therefore spawns a lot more instances with collision in every chunk, and you end up with hundreds of physics bodies in your scene. The physics engine might be able to handle this, but the editor does really inefficient things under the hood that kill performance.
I tried bumping the size of the message queue to 400Mb instead of the default 4Mb to see where it leads. Now removing the noise takes a second with no issues.
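To illustrate why a bigger queue makes the symptom go away without fixing the cause, here is a minimal sketch. It assumes the failure comes from a fixed-size buffer of deferred calls running out of room. `BoundedQueue`, `queue_warnings` and the byte sizes are hypothetical and are not Godot's actual `MessageQueue`.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for a deferred-call queue with a fixed byte budget.
// This is NOT Godot's MessageQueue; it only illustrates the failure mode.
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity_bytes) : capacity_(capacity_bytes) {
        assert(capacity_bytes > 0);
    }

    // Returns false when the message does not fit; that call is lost.
    bool push(std::function<void()> call, std::size_t size_bytes) {
        if (used_ + size_bytes > capacity_) {
            ++dropped_;
            return false;
        }
        used_ += size_bytes;
        calls_.push_back({std::move(call), size_bytes});
        return true;
    }

    // Runs and removes everything queued so far.
    void flush() {
        while (!calls_.empty()) {
            Entry entry = std::move(calls_.front());
            calls_.pop_front();
            used_ -= entry.size;
            entry.call();
        }
    }

    std::size_t dropped() const { return dropped_; }

private:
    struct Entry {
        std::function<void()> call;
        std::size_t size;
    };
    std::deque<Entry> calls_;
    std::size_t capacity_;
    std::size_t used_ = 0;
    std::size_t dropped_ = 0;
};

// Queues one deferred call per body and reports how many did not fit.
inline std::size_t queue_warnings(std::size_t bodies, std::size_t capacity_bytes,
                                  std::size_t bytes_per_call) {
    BoundedQueue queue(capacity_bytes);
    std::size_t executed = 0;
    for (std::size_t i = 0; i < bodies; ++i) {
        queue.push([&executed] { ++executed; }, bytes_per_call);
    }
    queue.flush();
    return queue.dropped();
}
```

With a small budget most of the calls are rejected, while a budget 100 times larger absorbs the same burst. The burst itself is still there, which is why avoiding the signals altogether is the better fix.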
I had a look at `Node::update_configuration_warnings` and it seems like it keeps emitting signals even if the node cannot be seen in the editor. I tried to change it, and it also fixed the issue:
#ifdef TOOLS_ENABLED
inline bool can_node_be_seen_in_editor(Node &node) {
    if (!node.is_inside_tree()) {
        return false;
    }
    const Node *edited_scene_root = node.get_tree()->get_edited_scene_root();
    if (edited_scene_root == nullptr) {
        return false;
    }
    if (edited_scene_root == &node) {
        return true;
    }
    // Not sure why `is_ancestor_of` is required
    return edited_scene_root->is_ancestor_of(&node) && node.get_owner() == edited_scene_root;
}
#endif

void Node::update_configuration_warnings() {
#ifdef TOOLS_ENABLED
    // Only notify the editor when the node can actually be shown in the scene tree.
    if (can_node_be_seen_in_editor(*this)) {
        get_tree()->emit_signal(SceneStringNames::get_singleton()->node_configuration_warning_changed, this);
    }
#endif
}
Using signals like this isn't great though IMO. If these bodies happened to be visible instead (which is possible in other games with large scenes, who knows), the problem would also happen as soon as you do a bulk move. A better solution would be to store a version number on each node, which the scene tree dock can poll on visible nodes (never more than a few dozen). When it differs from the last version shown, the dock updates that row. No faffing around with signals. But that sounds like a larger change to make.
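The version-number idea could be sketched roughly as follows. This is a minimal illustration, not a patch for Godot: `WarningNode`, `SceneTreeDockSketch` and `poll` are hypothetical names, and the real scene tree dock is organised differently.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical node that bumps a counter instead of emitting a signal.
struct WarningNode {
    std::uint64_t warning_version = 0;
    void update_configuration_warnings() { ++warning_version; }
};

// Hypothetical dock that only looks at the nodes it currently displays.
class SceneTreeDockSketch {
public:
    // Returns how many visible rows had to be refreshed.
    int poll(const std::vector<const WarningNode *> &visible_nodes) {
        int refreshed = 0;
        for (const WarningNode *node : visible_nodes) {
            assert(node != nullptr);
            auto it = last_shown_.find(node);
            if (it == last_shown_.end() || it->second != node->warning_version) {
                // Version changed (or row never shown): redraw this row only.
                last_shown_[node] = node->warning_version;
                ++refreshed;
            }
        }
        return refreshed;
    }

private:
    std::unordered_map<const WarningNode *, std::uint64_t> last_shown_;
};
```

Updating thousands of nodes then costs one integer increment each, and the dock does work proportional to the number of visible rows rather than the number of changed nodes.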
> your instances.tres contains images that went embedded directly inside the file as text. This will blow up the size of the resource (14Mb) and makes it slower to load. Maybe try not creating new ImageTextures from the inspector? It also contains mesh data. Or could it be that you tried to "make unique"? I would also advise against that on a mesh. I could not find any NoiseTexture in the entire project. I guess it is the images you created? Or FastNoiseLite?
Sorry for not clarifying. Here's what I did to get to this point:
I never deliberately created an ImageTexture or made anything unique. I'm guessing the mesh data is just how "Update from Scene" works, right? It doesn't seem to reference the original scene. As far as the noise textures go, are you always supposed to create them as separate resources? To me clicking "New FastNoiseLite" on the "noise" slot of the generator seemed like a pretty "normal" way to edit an instance generator.
Thanks for tracking down the godot issue! I'll try bumping the queue size for now.
> remembered I need to save the VoxelInstanceLibrary as a separate resource for now

You don't need to do that. The workaround is to right-click on the resource property and choose `Edit`, which will open it in a full inspector instead of a sub-inspector. The sub-inspector is why the issue occurs. But it's still nice to save this as its own file, because it can be a large resource.
> I'm guessing the mesh data is just how "Update from Scene" works, right?

Not really, it wasn't intended. It could be that your scene originally had this `ImageTexture` inside of it?
I think the issue is that Godot imported your `.glb` as a `.scn` under the hood, but decided to embed everything inside of it. As a result, since the instancer conversion only needed the mesh (so it could give it to `MultiMesh`), Godot found that this mesh had no file of its own (since it came built-in inside a scene), and so it made a copy of it. Same story with materials and the textures inside of them.
This completely sucks but I'm not sure how it could be improved. Having it automatically update if the scene changes would be desirable as well, so perhaps it needs to keep a reference to the scene and convert at runtime...
I'm not sure if this is the same issue or a different one. I bumped the queue size to 409600. When I open my world scene, the CPU is relatively low at idle (< 5%).
When I select the VoxelLodTerrain or the VoxelInstancer in the editor, CPU usage climbs to 100%. If I select another node, the CPU drops to near-zero again. example.zip
It doesn't require me to change or regenerate anything, and the camera is positioned exactly the same. It's just a matter of whether or not the VoxelLodTerrain is the selected node in the editor.
https://user-images.githubusercontent.com/2496231/184543896-61273c2d-3e48-4317-a2e6-730a6958166e.mp4
If it doesn't freeze, I'm not sure what could cause all CPU cores to be used at 100% just by having either node selected... It is possible for the module to use that much CPU if it has a lot of tasks running AND its threading configuration is changed to use the maximum, but by default it uses only half, and there doesn't seem to be any task running in your video (the camera doesn't move and the terrain likely finished loading a while ago). I also haven't noticed this on my computer. The only thing the module does when nodes are selected is update realtime gizmos (which is done on the main thread only and is usually very cheap), and those don't seem to be enabled in your video. I'm curious where the CPU is spending its time here.
I can keep looking into it. It does seem to be a separate issue -- https://github.com/godotengine/godot/pull/64363 fixes the OP for me, but not this new issue.
> all CPU cores to be used at 100%

Just in case you don't know, 100% on Linux (at least in `top`) is 1 core. All cores would be 1200% on my CPU.
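As a small worked example of that convention, assuming `top` in its default mode where each saturated logical core counts as 100%. The function names are hypothetical.

```cpp
#include <cassert>

// Percentage `top` would report if every logical core were saturated.
inline double top_percent_all_cores(unsigned logical_cores) {
    assert(logical_cores > 0);
    return 100.0 * logical_cores;
}

// Share of the whole machine (0..100) represented by a `top` reading.
inline double machine_share_percent(double top_percent, unsigned logical_cores) {
    assert(logical_cores > 0);
    return top_percent / logical_cores;
}
```

On a 12-core machine, a reading of 100% therefore means roughly one twelfth of the total CPU capacity is in use.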
`perf` sample:
20.18% godot.linuxbsd. godot.linuxbsd.opt.tools.64 [.] Node3D::get_transform
19.40% godot.linuxbsd. godot.linuxbsd.opt.tools.64 [.] __dynamic_cast
16.92% godot.linuxbsd. godot.linuxbsd.opt.tools.64 [.] Node::get_child_count
10.12% godot.linuxbsd. godot.linuxbsd.opt.tools.64 [.] __cxxabiv1::__si_class_type_info::__do_dyncast
9.14% godot.linuxbsd. godot.linuxbsd.opt.tools.64 [.] Node3DEditorViewport::_calculate_spatial_bounds
4.15% godot.linuxbsd. libc.so.6 [.] 0x000000000015ff2b
2.67% godot.linuxbsd. libc.so.6 [.] 0x000000000015fef0
2.13% godot.linuxbsd. libc.so.6 [.] 0x000000000015fef4
2.09% godot.linuxbsd. libc.so.6 [.] 0x000000000015ff31
1.87% godot.linuxbsd. libc.so.6 [.] 0x000000000015ff17
1.64% godot.linuxbsd. libc.so.6 [.] 0x000000000015ff21
1.27% godot.linuxbsd. libc.so.6 [.] 0x000000000015ff2f
1.24% godot.linuxbsd. libc.so.6 [.] 0x000000000015ff27
1.13% godot.linuxbsd. libc.so.6 [.] 0x000000000015ff13
1.09% godot.linuxbsd. godot.linuxbsd.opt.tools.64 [.] Node::propagate_notification
0.81% godot.linuxbsd. godot.linuxbsd.opt.tools.64 [.] Node::get_child
0.77% godot.linuxbsd. godot.linuxbsd.opt.tools.64 [.] __cxxabiv1::__vmi_class_type_info::__do_dyncast
0.69% godot.linuxbsd. godot.linuxbsd.opt.tools.64 [.] Object::notification
0.24% godot.linuxbsd. libc.so.6 [.] 0x000000000016023f
> Just in case you don't know, 100% on Linux (at least in top) is 1 core. All cores would be 1200% on my CPU.
In which case this is less concerning, but I'm still not aware of any task in the module requiring a full core to run constantly while a node is selected.
Not sure how to interpret your sampling, but it doesn't indicate a place in the module. A hypothesis could be that Godot is trying to figure out the bounding box of the selected node (every frame, for some reason) based on all its children, and since there is a fuckton of them, it takes a while to do so. All this is probably just to show a bounding box around the selection. I wouldn't expect this to be slow even with hundreds of nodes, but maybe the operation isn't well optimized to begin with.
Sorry, that's not too helpful as is, I was running an optimized build. I might be able to get a call stack if I recompile. Would you like me to open a separate issue? I think the OP can be closed with your fix to Godot.
I'm not sure if that can be considered an issue though. It's just the Godot Editor doing something here which happens to be very slow. I have no control over this. Or I would have to migrate all these nodes to use servers directly, which however prevents some features from working easily such as obtaining the collider in game or displaying debug collision shapes.
> 19.40% godot.linuxbsd. godot.linuxbsd.opt.tools.64 [.] __dynamic_cast
> 16.92% godot.linuxbsd. godot.linuxbsd.opt.tools.64 [.] Node::get_child_count
> 10.12% godot.linuxbsd. godot.linuxbsd.opt.tools.64 [.] __cxxabiv1::__si_class_type_info::__do_dyncast
These make me sad.
`get_child_count()` should just return a field. Why was it sampled so many times? Maybe it's called directly inside for loops and the compiler wasn't able to optimize it?
`dynamic_cast` is hard to avoid because in Godot the scene tree can contain random stuff, not just 3D nodes. But a possible optimization is to have an `is_node3d` shortcut virtual method, which would likely be faster than a full-blown dynamic cast.
As for `get_transform`, it depends on what the node does, but I'd expect it to be only a field return as well. But maybe it's doing more.
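A minimal sketch of the `is_node3d` shortcut idea, next to the `dynamic_cast` version it would replace. `NodeSketch` and `Node3DSketch` are hypothetical stand-ins for the engine classes, and whether the shortcut is actually faster in Godot would need measuring.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical base class standing in for Node.
struct NodeSketch {
    virtual ~NodeSketch() = default;
    // Cheap type check: a single virtual call, no RTTI hierarchy walk.
    virtual bool is_node3d() const { return false; }
    std::vector<std::unique_ptr<NodeSketch>> children;
};

// Hypothetical 3D node standing in for Node3D.
struct Node3DSketch : NodeSketch {
    bool is_node3d() const override { return true; }
};

// Counts 3D children with a full dynamic_cast per child.
inline int count_3d_children_dynamic(const NodeSketch &parent) {
    int count = 0;
    for (const auto &child : parent.children) {
        if (dynamic_cast<const Node3DSketch *>(child.get()) != nullptr) {
            ++count;
        }
    }
    return count;
}

// Counts 3D children with the virtual shortcut instead.
inline int count_3d_children_shortcut(const NodeSketch &parent) {
    int count = 0;
    for (const auto &child : parent.children) {
        assert(child != nullptr);
        if (child->is_node3d()) {
            ++count;
        }
    }
    return count;
}
```

Both functions return the same count. Once `is_node3d()` returns true, a `static_cast` to the 3D type is safe, so the RTTI lookup can be skipped entirely.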
> It's just the Godot Editor doing something here which happens to be very slow. I have no control over this

It probably is, though so far I can only repro it with voxel_tools. I tried adding 1000s of CollisionShape children to a body, and adding 1000s of instances to a MultiMesh, and neither reproduces it.
After playing around some more, it seems this happens whenever I select the `VoxelInstancer` or any node which is a parent of the `VoxelInstancer` (the terrain or the root node). If I select a sibling, or a child of the VoxelInstancer, CPU returns to normal.
Here's an unoptimized profile with call stacks:
That makes sense because VoxelInstancer is the only one which actually creates child nodes for colliders. Everything else uses servers directly. But apart from instancing these nodes, it doesn't do anything else to them, they are static. You tried 1000 but maybe there are more. Also, they are bodies, not just collision shapes.
The call stack indeed confirms the 3D spatial editor is computing a bounding box from scratch every frame.
In the present use case it's also mostly pointless to compute that bounding box, but that cannot be prevented.
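To show why this scales badly, here is a minimal sketch of a from-scratch bounds computation that walks every descendant on each call. It is not the code of `Node3DEditorViewport::_calculate_spatial_bounds`: `Bounds`, `SpatialSketch` and `calculate_bounds` are hypothetical, and transforms are ignored for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Axis-aligned box; transforms are deliberately left out of this sketch.
struct Bounds {
    float min[3];
    float max[3];
};

// Hypothetical spatial node with a local box and child nodes.
struct SpatialSketch {
    Bounds local{};
    std::vector<SpatialSketch> children;
};

// Smallest box containing both inputs.
inline Bounds merge(const Bounds &a, const Bounds &b) {
    Bounds result{};
    for (int axis = 0; axis < 3; ++axis) {
        result.min[axis] = std::min(a.min[axis], b.min[axis]);
        result.max[axis] = std::max(a.max[axis], b.max[axis]);
    }
    return result;
}

// Recomputes the bounds by visiting every descendant; `visited` counts them.
inline Bounds calculate_bounds(const SpatialSketch &node, std::size_t &visited) {
    ++visited;
    Bounds result = node.local;
    for (const SpatialSketch &child : node.children) {
        result = merge(result, calculate_bounds(child, visited));
    }
    return result;
}
```

Every call visits all descendants, so doing this once per frame costs work proportional to the number of child bodies for as long as the parent stays selected.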
I filed https://github.com/godotengine/godot/issues/64398. I could repro it, but it took 9000 CollisionShape children to get to ~30% CPU usage. Is it a goal to eventually have VoxelInstancer use servers instead of nodes?
I'd like to add some UI to display how many instances are present in the scene so we can get a sense of the numbers with VoxelInstancer. It was a choice for me to use nodes here because it makes it easier to get a collider reference. If I change it to use the physics server, returned colliders will always be "VoxelInstancer", without information about which instance exactly was collided with. If there was a way to trace back to the instance it could be an alternative, but I'm not sure if that's possible; PhysicsServer wants a single ObjectID. Another option is to limit colliders to a radius and dynamically spawn/despawn them (in addition to the whole LOD system, which already does that on a larger scale), but that comes with other issues, such as non-player entities no longer being able to collide when far enough away (not to mention multiplayer). It also makes the code more complex, all for an issue that so far seems to only happen in the editor.
With noise, there were 5,600 instances. Without noise, there were 11,168. Note: if instead of LOD index 2 I choose LOD index 1, and reduce density to 0.03 instead of 0.1, it goes down to 2,613, which is significantly fewer for a similar result from the point of view of the player.
Godot: 4.0.alpha.custom_build.8243c7ab5
Voxel: 31a7ddbf838572e50415159a56720275f9523262
When I try to remove the noise texture from the voxel instancer in the attached project, Godot consumes 100% CPU for around a minute, then crashes.
It prints messages like:
I sampled a few stacks throughout, they tend to have HashMap/MessageQueue near the top:
example.zip