Orillusion / orillusion

Orillusion is a pure Web3D rendering engine which is fully developed based on the WebGPU standard.
https://www.orillusion.com
MIT License

[FR]: Large-scale rendering: WASM memory allocation takes too long #380

Closed Davidyanlong closed 5 months ago

Davidyanlong commented 5 months ago

When rendering 300,000 boxes, WASM memory allocation takes too long, and the frame rate is lower than that of https://threejs.org/examples/?q=inst#webgl_buffergeometry_instancing_interleaved

class Sample_drawCallInstance {
    scene: Scene3D;
    public anim: boolean = false;
    async run() {

        Engine3D.setting.pick.enable = false;
        // init engine
        await Engine3D.init({ renderLoop: () => this.renderLoop() });

        OcclusionSystem.enable = false;
        // create new Scene
        this.scene = new Scene3D();

        // add performance stats
        this.scene.addComponent(Stats);

        // add an atmospheric sky environment
        let sky = this.scene.addComponent(AtmosphericComponent);
        sky.sunY = 0.6

        // init camera3D
        let mainCamera = CameraUtil.createCamera3D(null, this.scene);
        mainCamera.perspective(60, Engine3D.aspect, 1, 2000.0);

        // add a basic camera controller
        let hoverCameraController = mainCamera.object3D.addComponent(HoverCameraController);
        hoverCameraController.setCamera(15, -15, 300);

        // add a basic direct light
        let lightObj = new Object3D();
        lightObj.rotationX = 45;
        lightObj.rotationY = 60;
        lightObj.rotationZ = 150;
        let dirLight = lightObj.addComponent(DirectLight);
        dirLight.lightColor = KelvinUtil.color_temperature_to_rgb(5500);
        dirLight.intensity = 100;
        dirLight.indirect = 1;
        this.scene.addChild(lightObj);

        sky.relativeTransform = dirLight.transform;

        // create a view with target this.scene and camera
        let view = new View3D();
        view.scene = this.scene;
        view.camera = mainCamera;

        // start render
        Engine3D.startRenderView(view);
        GUIHelp.init();

        // The GUI control writes to this.anim directly; the original handler
        // (`this.anim != this.anim;`) was a comparison with no effect.
        GUIHelp.add(this, "anim");

        this.initScene();
    }

    private _list: Object3D[] = [];
    private _rotList: number[] = [];
    initScene() {

        let shareGeometry = new BoxGeometry();
        let material = new LambertMaterial();
        material.baseColor = new Color(
            Math.random(),
            Math.random(),
            Math.random(),
        )

        let group = new Object3D();
        let count = 30 * 10000;
        // let count = 200;

        GUIHelp.addFolder('info');
        GUIHelp.open();
        GUIHelp.addLabel(`use instance draw box`);
        GUIHelp.addInfo(`count `, count);

        let ii = 0;
        // let count = 30 * 10000;
        for (let i = 0; i < count; i++) {
            // let pos = Vector3Ex.sphereXYZ(20, 30, 0, 0, 10);
            let pos = Vector3Ex.sphereXYZ(ii * 60 + 20, ii * 60 + 100, 100, i * 0.001 + 10, 100);
            // let pos = Vector3Ex.getRandomXYZ(-2, 2);
            let obj = new Object3D();
            let mr = obj.addComponent(MeshRenderer);
            mr.geometry = shareGeometry;
            mr.material = material;
            obj.localPosition = pos;
            group.addChild(obj);
            this._list.push(obj);

            obj.transform.scaleX = Math.random() * 2 + 1.2;
            obj.transform.scaleY = Math.random() * 2 + 1.2;
            obj.transform.scaleZ = Math.random() * 2 + 1.2;

            obj.transform.rotationX = Math.random() * 360;
            obj.transform.rotationY = Math.random() * 360;
            obj.transform.rotationZ = Math.random() * 360;

            this._rotList.push((Math.random() * 1 - 1 * 0.5) * 2.0 * Math.random() * 100);

            obj.transform.localDetailRot = new Vector3(
                (Math.random() * 1 - 1 * 0.5) * 2.0 * Math.random() * 50 * 0.001,
                (Math.random() * 1 - 1 * 0.5) * 2.0 * Math.random() * 50 * 0.001,
                (Math.random() * 1 - 1 * 0.5) * 2.0 * Math.random() * 50 * 0.001);
            if (i % 10000 == 0) {
                ii++;
            }
        }

        group.addComponent(InstanceDrawComponent);
        group.transform.localDetailRot = new Vector3(0, 1.0 * 0.001, 0);
        this._rotList.push(1.0);

        group.bound = new BoundingBox(Vector3.SAFE_MIN, Vector3.SAFE_MAX);
        this._list.push(group);
        this.scene.addChild(group);
    }

    renderLoop() {
        if (this.anim) {
            let i = 0;
            for (i = 0; i < this._list.length; i++) {
                let element = this._list[i];
                // element.transform.rotationY += 1;
                element.transform.localChange = true;
            }
        }
    }
}
  1. WASM memory allocation ran from 10:45:02.927 to 10:45:24.316, which is more than 21 seconds.

  2. In the test rendering 300,000 cubes, the WASM computation alone takes a lot of time and needs improvement.

lslzl3000 commented 5 months ago

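Thanks for testing and for the feedback. We are aware of the long initial memory allocation and will improve it in a later release. The performance problem has two causes: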

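  1. WASM currently incurs extra overhead while DevTools is open, which lowers performance. Try testing again with DevTools closed.
  2. Chrome's WebGPU implementation currently has a performance problem when writing to GPU memory, for example https://bugs.chromium.org/p/chromium/issues/detail?id=1298309&no_tracker_redirect=1
    At the moment it cannot match the WebGL API, so with a large number of transform updates it may be even slower than three. This has to be optimized in Chrome's underlying implementation.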
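
To compare frame rates with DevTools closed, frame times can be collected in the page itself. The following is a minimal sketch, not part of Orillusion: `FrameTimer` and its methods are hypothetical names, and the caller is assumed to pass a millisecond timestamp once per frame.

```typescript
// Hypothetical helper: keeps a sliding window of frame durations so the
// average frame time can be shown on the page instead of in DevTools.
class FrameTimer {
  private samples: number[] = [];
  private last: number | null = null;
  private windowSize: number;

  constructor(windowSize: number = 60) {
    this.windowSize = windowSize;
  }

  // Call once per frame with the current time in milliseconds.
  tick(nowMs: number): void {
    if (this.last !== null) {
      this.samples.push(nowMs - this.last);
      // Drop the oldest sample once the window is full.
      if (this.samples.length > this.windowSize) {
        this.samples.shift();
      }
    }
    this.last = nowMs;
  }

  // Mean frame duration over the window, or 0 before two ticks have occurred.
  averageFrameMs(): number {
    if (this.samples.length === 0) {
      return 0;
    }
    const total = this.samples.reduce((sum, ms) => sum + ms, 0);
    return total / this.samples.length;
  }

  // Frames per second derived from the mean frame duration.
  averageFps(): number {
    const ms = this.averageFrameMs();
    return ms > 0 ? 1000 / ms : 0;
  }
}
```

In the sample above, one way to use it would be to call `tick(performance.now())` at the start of `renderLoop()` and write `averageFps()` into a page element every second or so.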

Davidyanlong commented 5 months ago

Thanks to the Orillusion team for taking this seriously. I hope to get more help as I keep learning and using the engine!

Davidyanlong commented 5 months ago

Based on the test at https://codepen.io/ShuangLiu/pen/YzEEmLa, mapAsync appears to outperform writeBuffer, and when tested in isolation mapAsync is almost on par with WebGL's `bufferSubData`. Will the engine consider using mapAsync?

lslzl3000 commented 5 months ago

> Based on the test at https://codepen.io/ShuangLiu/pen/YzEEmLa, mapAsync appears to outperform writeBuffer, and when tested in isolation mapAsync is almost on par with WebGL's `bufferSubData`. Will the engine consider using mapAsync?

mapAsync does perform better in theory, but several problems come up when it is used in a real engine:

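  1. Turning a large number of synchronous operations into asynchronous ones makes the code structure more complex and harder to develop, debug, and maintain.
  2. Because JavaScript runs callbacks from a single-threaded queue, asynchronous callbacks in a real-time rendering scenario can hardly guarantee timeliness or ordering, and they are disturbed by other asynchronous operations and events. Even if mapAsync finishes quickly internally, its callback may not fire promptly on the JS main thread, so the actual latency can be higher than with writeBuffer. The overall result is a very unstable FPS, with sudden latency spikes and dropped frames, especially during UI interaction or network communication. mapAsync is therefore hard to optimize and is generally better suited to one-off writes or reads of a large block of memory. For basic buffers that must be updated every frame, such as transforms, a more deterministic synchronous operation like writeBuffer is the better fit.
  3. Chrome's internal mapAsync implementation (Dawn) is not well optimized yet either. The example at https://codepen.io/ShuangLiu/pen/YzEEmLa already shows the problem: when other WebGPU API operations are running, mapAsync latency increases greatly. Internally it waits until the GPU is idle before doing its work, so if other operations are in progress the latency becomes unstable, especially when many compute shaders are running. In a real project the latency is about the same as writeBuffer's or even higher, and far worse than WebGL's bufferSubData.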

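In summary, mapAsync may currently perform worse than writeBuffer. Improving this requires iteration on the WebGPU standard, such as support for multi-threaded writes to GPU memory, as well as optimization of Chrome's internal implementation to unlock more low-level performance. We will keep following this and try mapAsync in more scenarios.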
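
The difference between the two upload paths discussed above can be sketched as follows. This is an illustration under stated assumptions, not Orillusion's implementation: `BufferLike`, `EncoderLike`, and `DeviceLike` are hypothetical local stand-ins for the subset of `GPUBuffer`, `GPUCommandEncoder`, and `GPUDevice` used here, and the function names are invented for the example. The staging buffer is assumed to have been created with `MAP_WRITE | COPY_SRC` usage.

```typescript
// Hypothetical minimal subsets of the WebGPU interfaces, declared locally
// so the sketch type-checks without the @webgpu/types package.
interface BufferLike {
  mapAsync(mode: number): Promise<void>;
  getMappedRange(): ArrayBuffer;
  unmap(): void;
}
interface EncoderLike {
  copyBufferToBuffer(
    src: BufferLike, srcOffset: number,
    dst: BufferLike, dstOffset: number, size: number): void;
  finish(): unknown;
}
interface DeviceLike {
  queue: {
    writeBuffer(dst: BufferLike, offset: number, data: Float32Array): void;
    submit(commands: unknown[]): void;
  };
  createCommandEncoder(): EncoderLike;
}

const MAP_WRITE = 0x0002; // numeric value of GPUMapMode.WRITE

// Path A: writeBuffer. The call returns synchronously, so the upload is
// queued in a known position within the current frame.
function uploadWithWriteBuffer(
  device: DeviceLike, dst: BufferLike, data: Float32Array): void {
  device.queue.writeBuffer(dst, 0, data);
}

// Path B: mapAsync on a staging buffer. Everything after the await runs in
// a later callback on the JS main thread, so the copy is only submitted once
// that callback gets scheduled.
async function uploadWithMapAsync(
  device: DeviceLike, staging: BufferLike, dst: BufferLike,
  data: Float32Array): Promise<void> {
  await staging.mapAsync(MAP_WRITE);
  new Float32Array(staging.getMappedRange()).set(data);
  staging.unmap();
  const encoder = device.createCommandEncoder();
  encoder.copyBufferToBuffer(staging, 0, dst, 0, data.byteLength);
  device.queue.submit([encoder.finish()]);
}
```

With path B, a frame that calls `uploadWithMapAsync` without awaiting it continues before the copy has been submitted, which is the ordering and latency concern raised in point 2.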

Davidyanlong commented 5 months ago

Thanks @lslzl3000, your explanation was very clear and I learned a lot. I'm glad to have your answer.