axmolengine / axmol

Axmol Engine – A Multi-platform Engine for Desktop, XBOX (UWP) and Mobile games. (A fork of Cocos2d-x-4.0)
https://axmol.dev
MIT License
868 stars 195 forks

iOS ClippingNode performance issue? #1094

Closed Yehsam23 closed 1 year ago

Yehsam23 commented 1 year ago
  1. With Axmol, using 30 ClippingNodes and 30 Sprites, the FPS drops to around 20. (Screenshot: IMG_0143)

  2. With cocos2d-x v3.17, the FPS is 60 even when using 30 ClippingNodes and 30 Sprites. (Screenshot: IMG_0142)

  3. With cocos2d-x v4, using 200 ClippingNodes and 200 Sprites still results in an FPS of around 60. (Screenshot: IMG_0144)

All tests were run on the same device.

rh101 commented 1 year ago

The test project you attached doesn't have code related to clipping nodes.

Yehsam23 commented 1 year ago

Sorry, I attached the wrong file. The post above has been updated.

HelloWorldScene.cpp:

    auto visibleSize = _director->getVisibleSize();
    auto origin      = _director->getVisibleOrigin();

    for (int i = 0 ; i < 30 ; i++)
    {
        auto stencil = Sprite::create("stars2.png");
        auto clipNode = ClippingNode::create(stencil);
        clipNode->setPosition(origin + Vec2(random(0, (int)visibleSize.width), random(0, (int)visibleSize.height)));
        clipNode->setAlphaThreshold(0.1f);
        clipNode->setInverted(true);
        this->addChild(clipNode);

        auto sprite = Sprite::create("grossini.png");
        clipNode->addChild(sprite);
    }

rh101 commented 1 year ago

Just an FYI, you've noted "axmol version: latest" in your post, but the project you attached is not using the latest template, so a few changes were required to get it working.

Now, regarding the issue, setting the number of clipping nodes to 30 didn't show a problem on Android, using an old HTC One M7, but upping it to 200 managed to get the FPS down to ~10 on both Cocos2d-x v4 and Axmol.

One question though, given clipping nodes are expensive, is there a common use case where there is a need to use so many of them?

rh101 commented 1 year ago

@Yehsam23 Something isn't quite right regarding your results. The first two screenshots look like they're taken at the same design resolution, but the last one, which you've set to 200 clipping zones, isn't from the test code that you have attached. That last screenshot looks like it came from the cpp-tests, and it's at a completely different design resolution.

Please do the 200 clipping zone test with a new project on Cocos2d-x v4, on the same device, similar to the one you've attached to your initial post, then take a screenshot and post the result here. The scale of the images in the output should look exactly the same as the first two screenshots you posted.

I cannot reproduce the differences you're seeing. On an Android HTC One M7 device, using 200 clipping nodes with the code you have supplied, I see 60 FPS using Cocos2d-x v4, and 60 FPS using Axmol, but you have to wait a few seconds for it to stabilize on both engines (because of the loop in init(), which takes time to process). Also, if any touch events occur, the FPS suddenly drops below 40 on both engines, and then goes back up to 60.

Yehsam23 commented 1 year ago

The FPS is around 9 when I use my Android HTC U11 Plus with either Axmol or cocos2d-x v4, at the same design resolution. However, on my iPhone 12 Pro Max, the FPS is around 13 on Axmol and around 40 on cocos2d-x v4. So the problem seems to occur only on iOS, and I have updated the original post accordingly.

Previously, I tested on an iPhone 7 Plus, but I don't currently have access to that device. Therefore, I tested on an iPhone 12 Pro Max instead.

iPhone 12 Pro Max on Axmol 200 clipping: axmol

iPhone 12 Pro Max on cocos2d-x v4 200 clipping: v4

BTW, the design resolution is 640 * 288 and the content scale factor is 2:

    glView->setDesignResolutionSize(640, 288, ResolutionPolicy::NO_BORDER);
    director->setContentScaleFactor(2);

Yehsam23 commented 1 year ago

In RenderTarget.h, I changed

    void setTargetFlags(TargetBufferFlags flags) {
        _flags = flags;
        _dirty = true;
    }

to

    void setTargetFlags(TargetBufferFlags flags) {
        _flags = flags;
    }

With this change, the FPS surpasses cocos2d-x v4.

I'm not sure if the modification is correct. I see that there is already a flag check before the renderTarget->isDirty() check in CommandBufferMTL.

rh101 commented 1 year ago

I'm not familiar with the rendering code, but just out of curiosity, what happens if you change the method:

    void setTargetFlags(TargetBufferFlags flags) {
        _flags = flags;
        _dirty = true;
    }

to:

    void setTargetFlags(TargetBufferFlags flags) {
        if (_flags == flags)
            return;
        _flags = flags;
        _dirty = true;
    }

Just so it's not setting it as dirty unnecessarily.

Yehsam23 commented 1 year ago

That is not enough, because setTargetFlags may set A, render, then set B, and then set A again before the next render. The flags at the second render are the same as at the first, yet the render target is still marked dirty. Below is the order of calls for one ClippingNode and one Sprite:

1. DEPTH_AND_STENCIL | COLOR0
2. COLOR0 | STENCIL
3. COLOR0 | STENCIL
4. DEPTH_AND_STENCIL | COLOR0
5. COLOR0 | STENCIL
6. COLOR0 | STENCIL
7. drawCustomCommand->beginRenderPass->CommandBufferMTL
8. drawBatchedTriangles->beginRenderPass->CommandBufferMTL (uses the old _mtlRenderEncoder)
9. DEPTH
10. DEPTH_AND_STENCIL | COLOR0
11. COLOR0 | STENCIL
12. drawBatchedTriangles->beginRenderPass->CommandBufferMTL (creates a new one because the flags are dirty)

Based on this sequence, the flags in effect whenever CommandBufferMTL runs are always COLOR0 | STENCIL, even though dirty has been set to true.

Additionally, CommandBufferMTL already checks whether _currentRenderTargetFlags == renderTarget->getTargetFlags().

Therefore, it seems that setTargetFlags does not need to mark the render target as dirty.
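
The call order listed above can be replayed against a toy model to compare the three variants of setTargetFlags discussed in this thread. This is only a minimal sketch: MockRenderTarget, MockCommandBuffer, Policy and countEncoders are hypothetical names, and the reuse condition in beginRenderPass is an assumption based on the description in this thread, not the actual CommandBufferMTL code.

```cpp
#include <cstdint>
#include <initializer_list>

// Hypothetical stand-ins for the engine types; not the Axmol API.
enum Flags : uint32_t {
    COLOR0            = 1,
    DEPTH             = 2,
    STENCIL           = 4,
    DEPTH_AND_STENCIL = DEPTH | STENCIL
};

// The three setTargetFlags variants from the thread.
enum class Policy { AlwaysDirty, DirtyOnChange, NeverDirty };

struct MockRenderTarget {
    Policy   policy;
    uint32_t flags = 0;
    bool     dirty = false;

    explicit MockRenderTarget(Policy p) : policy(p) {}

    void setTargetFlags(uint32_t f) {
        if (policy == Policy::DirtyOnChange && flags == f)
            return;                      // guarded variant: skip identical flags
        flags = f;
        if (policy != Policy::NeverDirty)
            dirty = true;                // original and guarded variants mark dirty
    }
};

struct MockCommandBuffer {
    bool     hasEncoder      = false;
    uint32_t currentFlags    = 0;
    int      encodersCreated = 0;

    // Assumed rule: reuse the encoder only if the flags match and the target is clean.
    void beginRenderPass(MockRenderTarget& rt) {
        if (hasEncoder && currentFlags == rt.flags && !rt.dirty)
            return;                      // reuse the old encoder
        hasEncoder   = true;
        currentFlags = rt.flags;
        rt.dirty     = false;
        ++encodersCreated;               // a new encoder would be created here
    }
};

// Replays steps 1-12 for one ClippingNode and one Sprite.
int countEncoders(Policy p) {
    MockRenderTarget  rt(p);
    MockCommandBuffer cb;
    const uint32_t A = DEPTH_AND_STENCIL | COLOR0;
    const uint32_t B = COLOR0 | STENCIL;

    for (uint32_t f : {A, B, B, A, B, B}) rt.setTargetFlags(f);       // steps 1-6
    cb.beginRenderPass(rt);                                           // step 7
    cb.beginRenderPass(rt);                                           // step 8
    for (uint32_t f : {uint32_t(DEPTH), A, B}) rt.setTargetFlags(f);  // steps 9-11
    cb.beginRenderPass(rt);                                           // step 12
    return cb.encodersCreated;
}
```

In this model, countEncoders returns 2 for both Policy::AlwaysDirty and Policy::DirtyOnChange, because steps 9-11 change the flags and then restore them, which still marks the target dirty. It returns 1 for Policy::NeverDirty, where the flag comparison alone decides whether the encoder is reused.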

rh101 commented 1 year ago

@Yehsam23 I tried out your modification by commenting out the _dirty = true; line in setTargetFlags, and I haven't noticed any negative side-effects at all. What I have noticed is a significant increase in performance on iOS (much higher FPS).

Once again though, I don't know much about the rendering code, and @halx99 is the one who would understand all of this.

Yehsam23 commented 1 year ago

@halx99

Although Android has a similar FPS to cocos2d-x v4 under the same design resolution, there is a significant difference between it and cocos2d-x v3.17. For example, when using 200 ClippingNodes and Sprites, the FPS in Axmol is around 9, while in cocos2d-x v3.17 it is around 41 on an Android HTC U11 Plus device.

Is there a way to optimize this? Our project currently uses ClippingNodes extensively to handle some animation effects, so we are hoping to find a solution to improve performance.

halx99 commented 1 year ago

The dirty flag can be removed from setRenderTarget.

rh101 commented 1 year ago

@Yehsam23 Are you able to submit the pull request with the modification to setTargetFlags() to remove the `_dirty = true;` line, or do you need someone else to submit it? At least that will fix the iOS issue once it's merged in.

For the Android clipping node performance improvement, perhaps create a separate issue for it, just so it can be tracked separately, since it's unrelated to the iOS problem.