minetest / minetest

Minetest is an open source voxel game-creation platform with easy modding and game creation
https://www.minetest.net/

Inefficient Entity Rendering (Client-Side) #14018

Open: Warr1024 opened this issue 7 months ago

Warr1024 commented 7 months ago

Minetest version

Minetest 5.8.0-dev-0e4de2898 (Linux)
Using Irrlicht 1.9.0mt13
Using LuaJIT 2.1.0-beta3
BUILD_TYPE=Release
RUN_IN_PLACE=1
USE_CURL=1
USE_GETTEXT=1
USE_SOUND=1
STATIC_SHAREDIR="."
STATIC_LOCALEDIR="locale"

Active renderer

OpenGL 4.2

Irrlicht device

X11

Operating system and version

Debian 12

CPU model

Intel i5-3320M (4) @ 3.300GHz

GPU model

Intel 3rd Gen Core processor Graphics Controller

OpenGL version

4.2

Summary

It is a "well known" problem that entity rendering is inefficient (one draw call per entity, no attempt at batching). This can badly affect client frame rates in areas where lots of entities are present.

It has been suggested that this was not considered a serious issue because entities normally have a big enough performance impact on the server that large numbers of them become unmanageable there long before they become a problem on the client. However, there are many use cases for purely "static" entities, which neither move nor use on_step and so incur no notable server-side cost, even in quantities in the thousands. NodeCore's use of entities to represent item stacks in "node form" (or inside storage boxes) is an example of this.

Client-side performance issues with entities have thus been observed in real-world conditions on public servers, such as players storing large amounts of excavated cobble. A workaround was put in place in NodeCore to mitigate this, but it can only partially address the problem and there are still many cases where it doesn't have much impact; we could still really use a more thorough fix in the engine.

Steps to reproduce

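-- Requires NodeCore (uses nodecore.item_eject and nc_terrain:cobble).
minetest.register_chatcommand("test", {
    func = function(name)
        local player = minetest.get_player_by_name(name)
        if not player then
            return false, "Player not found"
        end
        -- Center the test structure 20 nodes ahead of the player.
        local pos = vector.add(player:get_pos(),
            vector.multiply(player:get_look_dir(), 20))
        for dy = 0, 10 do
            for dx = -10, 10 do
                for dz = -10, 10 do
                    local p = vector.offset(pos, dx, dy, dz)
                    if dy == 0 then
                        -- Solid cobble floor.
                        minetest.set_node(p, {name = "nc_terrain:cobble"})
                    elseif math.abs(dx) <= (11 - dy) and math.abs(dz) <= (11 - dy) then
                        -- Stepped pyramid: eject a stack of 100 cobble here.
                        minetest.remove_node(p)
                        nodecore.item_eject(p, "nc_terrain:cobble 100")
                    else
                        -- Clear everything outside the pyramid.
                        minetest.remove_node(p)
                    end
                end
            end
        end
    end
})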

[Image: screenshot_20231119_153118]

[Image: screenshot_20231119_152850]

Note that the workaround in NodeCore is only partial: it does not fix many instances of the problem (including real-world examples such as the spawn town area on the NodeCore Community server), so a proper fix is still needed.

jordan4ibanez commented 7 months ago

To batch, you would need to collect the VAOs. I'm not sure whether textures are stored in an array texture, or whether the code would need to change to allow that. If not, the next step would be to gather the uniforms (rotation, position, animation, and so on) into one large buffer and upload it to the GPU in a single call. That should let the brute force of the GPU outweigh any inefficiency from streaming this data over the PCIe bus. The unbatched path already does that upload; it just does it piecemeal, working out what goes where each time, instead of sending one nice linear stream of data. I think this would speed up entity rendering A LOT.
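
A minimal sketch of the "one big buffer" idea described above, written as plain C++ with no engine code. Everything here is hypothetical: `Entity`, `InstanceData`, `BatchKey` and `build_batches` are illustrative names, not Minetest or Irrlicht types, and the sketch assumes entities can only share a draw call when they use the same mesh and texture. It only shows the CPU-side grouping and packing; the actual GPU upload and draw call are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-entity render state (not an engine type).
struct Entity {
    std::string mesh;     // identifier of the mesh the entity uses
    std::string texture;  // identifier of the texture the entity uses
    float x, y, z;        // world position
    float yaw;            // rotation around the Y axis
};

// Per-instance data, stored contiguously so it can go up in one upload.
struct InstanceData {
    float x, y, z, yaw;
};

// Assumption: entities are batchable only if mesh and texture both match.
using BatchKey = std::pair<std::string, std::string>;
using Batches = std::map<BatchKey, std::vector<InstanceData>>;

// Group entities by (mesh, texture) and pack their transforms into one
// linear buffer per group. Each group could then be drawn with one call.
Batches build_batches(const std::vector<Entity> &entities) {
    Batches batches;
    for (const Entity &e : entities)
        batches[BatchKey(e.mesh, e.texture)].push_back(
            InstanceData{e.x, e.y, e.z, e.yaw});

    // Sanity check: every entity landed in exactly one batch.
    std::size_t total = 0;
    for (const auto &kv : batches)
        total += kv.second.size();
    assert(total == entities.size());
    return batches;
}
```

The number of map entries is the number of draw calls such a scheme would need, instead of one per entity. If thousands of entities all share a mesh and texture, the map ends up with a single entry holding all of their transforms.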

jordan4ibanez commented 7 months ago

And yes, I would like to see this solved as well. Here is a reference video from a long time ago, which shows how long this issue has been around: https://youtu.be/YcWjoaLmsnE?feature=shared&t=857

appgurueu commented 7 months ago

> To batch, you would need to collect the vao.

This should be possible using instancing.

Caveats:

Warr1024 commented 7 months ago

If you only support very limited batching, such as:

I think it would still bring a lot of value, since a large proportion of the entities in the kinds of scenes where the problem surfaces will meet those criteria.

ghost commented 7 months ago

Petz approves this.