godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License

Very slow import when scene has big meshes due to mesh LOD generation #64751

Open PZerua opened 2 years ago

PZerua commented 2 years ago

Godot version

4.0.alpha14

System information

Windows 11, Intel i7-10750H, Nvidia RTX 2060 Laptop (511.65), Vulkan

Issue description

Godot takes a long time to import a scene with big meshes. To better understand the problem, I ran some tests with "Colorful Curtains" (without the Base Scene) from Intel's new Sponza scene. The scene has twelve 4K textures and several meshes that add up to a total of 1.059.862 vertices. Blender imports it in ~7 seconds, while Godot takes ~1 minute and 3 seconds. I spent some time investigating the causes with a profiler and found the following about that import:

So I'd say the main issue is with LOD generation. Two observations:

Some possible changes I can think of to make it faster:

I explained the issue a bit over Rocket.chat a few days ago, but I'd like some discussion on this before I attempt a fix (if I'm capable).

Steps to reproduce

Download "Colorful Curtains" or any scene from the new Intel Sponza. Move the glTF scene into the project folder and observe that it takes a very long time to import.

Minimal reproduction project

No response

fire commented 2 years ago

The largest cost in your numbers is casting a total of 10.739.327 individual rays using Embree. This process takes 33 seconds of the total time. Is there a way to improve this?

The smaller issue is the number of lods and the bigger issue is the normal reconstruction.

Sloppy simplify didn't give edge lengths, so I don't think we can use it. I can check again.
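For a sense of scale, the figures quoted so far in this thread (about 63 seconds of total import time, 33 of them spent casting 10.739.327 rays) can be turned into a per-ray cost and a share of the import. This is only arithmetic on the reported numbers, not a new measurement; the constant and function names below are made up for illustration.

```cpp
// Figures reported in this thread (not re-measured):
// ~1 min 3 s total import, 33 s of it spent casting 10,739,327 rays.
constexpr double kTotalImportSeconds = 63.0;
constexpr double kRaycastSeconds = 33.0;
constexpr double kRayCount = 10739327.0;

// Average wall-clock cost of a single ray, in microseconds.
constexpr double microseconds_per_ray() {
    return kRaycastSeconds / kRayCount * 1e6;
}

// Fraction of the total import time spent casting rays.
constexpr double raycast_share() {
    return kRaycastSeconds / kTotalImportSeconds;
}
```

That comes to roughly 3 microseconds per ray and a little over half of the import, which suggests the cost is driven by the sheer number of rays rather than by any single ray being expensive.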

fire commented 2 years ago

Did some thinking. One cheap thing we can do is start from the last lod rather than from the start. The code to do this is relatively small.

Do you want to make a pr for that?

PZerua commented 2 years ago

Hi, sorry for the delay.

> The largest cost in your numbers is casting a total of 10.739.327 individual rays using Embree. This process takes 33 seconds of the total time. Is there a way to improve this?

I agree that this is the bigger problem, but I haven't researched it enough to come up with a solution or an alternative approach. I did test using rtcIntersect1M once for all the rays instead of rtcIntersect1 for each single ray, both with RTC_INTERSECT_CONTEXT_FLAG_COHERENT and RTC_INTERSECT_CONTEXT_FLAG_INCOHERENT, but I noticed no difference. I have no prior experience with Embree, so it might be worth trying again in case I did something wrong. Maybe @JFonS has some input on this and can propose some alternatives.

> The smaller issue is the number of lods and the bigger issue is the normal reconstruction.

The thing is that the total number of rays is directly related to the number of LODs (and the number of indices in each LOD), so if we agree that 10 or more LODs are too many and aim for a maximum of 6 or 8, that would help with both issues.

> Did some thinking. One cheap thing we can do is start from the last lod rather than from the start. The code to do this is relatively small.
>
> Do you want to make a pr for that?

Yeah, I had the same thought. This will speed up LOD generation when calling meshopt_simplify (although I'm not sure by how much), but it won't help with the total ray count. I can give it a try in a few days.

fire commented 2 years ago

I can't promise anything, but if you're around I can show you where the code for starting from the last LOD would go.

https://github.com/godotengine/godot/blob/master/scene/resources/importer_mesh.cpp#L453

The theory is that, instead of the last merged_indices_ptr, you use the new_indices from the last iteration of the while loop.
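The idea of feeding each LOD step with the previous step's output can be sketched in isolation. The code below is not the code in importer_mesh.cpp: `decimate_stub`, `LodResult` and `build_lods` are hypothetical names, the stub merely truncates the index buffer instead of calling a real simplifier such as meshopt_simplify, and halving the target at every step is an assumption made for the example. It only shows how the amount of data handed to the simplifier shrinks when the LODs are chained.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a real simplifier: keeps the first
// `target_index_count` indices, rounded down to whole triangles.
std::vector<uint32_t> decimate_stub(const std::vector<uint32_t> &indices, size_t target_index_count) {
    size_t count = target_index_count < indices.size() ? target_index_count : indices.size();
    count -= count % 3;
    return std::vector<uint32_t>(indices.begin(), indices.begin() + static_cast<std::ptrdiff_t>(count));
}

struct LodResult {
    std::vector<std::vector<uint32_t>> lods;
    size_t indices_fed_to_simplifier = 0; // rough proxy for simplification work
};

// Halve the index count at each step (an assumption for this sketch) until the
// target drops below `min_index_count`. With `chain` set, each step simplifies
// the previous LOD's output instead of the original index buffer.
LodResult build_lods(const std::vector<uint32_t> &original, size_t min_index_count, bool chain) {
    LodResult result;
    std::vector<uint32_t> previous = original;
    size_t target = original.size() / 2;
    while (target >= min_index_count && target >= 3) {
        const std::vector<uint32_t> &input = chain ? previous : original;
        result.indices_fed_to_simplifier += input.size();
        std::vector<uint32_t> new_indices = decimate_stub(input, target);
        if (new_indices.empty()) {
            break;
        }
        target = new_indices.size() / 2;
        result.lods.push_back(new_indices);
        previous = new_indices; // only read on the next iteration when chaining
    }
    return result;
}
```

With a 96-index buffer and a minimum of 6 indices, both variants produce four LODs, but the chained variant hands about half as many indices to the simplifier in total. With a real simplifier, chaining may also affect quality, since each step starts from an already simplified mesh, so that would need checking separately.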

PZerua commented 2 years ago

Hi, sorry for the delay, I've been quite busy at work.

I still want to work on this and I think I have an idea of how to implement it, but I'm not sure when I'll have time to do it.

Also, I spent some time trying to better understand the context of the "normal reconstruction", and saw the discussion you had here: https://github.com/zeux/meshoptimizer/issues/158. My understanding is that "normal reconstruction" is currently a workaround for that issue and that we should just wait for it to be fixed on meshoptimizer's side, although it may be worth checking faster approaches in the meantime.

PZerua commented 1 year ago

Looks like we might get a fix for the wrong normals after simplification: https://github.com/zeux/meshoptimizer/pull/524. Hopefully this will make it possible to remove all the calls to Embree and make import much faster.

zeux commented 1 year ago

I'll note that it's unclear whether the pending work in the linked meshoptimizer PR will allow Godot to change its simplification strategy. The meshoptimizer version used in Godot right now has some patches to enable attribute awareness, but they were likely insufficient to get good normal quality in certain cases, which is why the reprojection code exists. My goal is to improve on the patches currently used in Godot (they have some quality bugs that are critical to resolve before I can merge anything), but I don't know if the improvement is going to be sufficient to just rely on the output of meshoptimizer directly in all cases.

fire commented 11 months ago

One of the bottlenecks is tangent space normal generation, which is being worked on here: https://github.com/godotengine/godot/pull/83648

zeux commented 3 months ago

Should be improved by #93727 (still some work to do in the future wrt reordering LOD generation from large to small).