CesiumGS / cesium

An open-source JavaScript library for world-class 3D globes and maps :earth_americas:
https://cesium.com/cesiumjs/
Apache License 2.0

Performance degradation with large point clouds and show condition on M1 Mac #11270

ulrichson opened this issue 1 year ago

ulrichson commented 1 year ago

Hello!

I observed a performance degradation when rendering large point clouds that use a conditional show style. I'm certain that the performance used to be much better: in a project I use this technique to filter by attributes from the tileset's batch table, and now the framerate drops a lot when the filters are applied. It may only affect macOS / ARM architecture; on a Linux / Intel workstation with an NVIDIA GPU it works fine.

Quick performance monitoring in the dev tools showed that readPixels, which is called for scene picking, consumes a lot of time. Is there a way to turn off picking for Cesium3DTileset? That might help in my case.

I can reproduce this in a sandcastle with the Melbourne point cloud, and you can see the framerate drops from ~100 fps to ~10 fps when zooming in:

https://user-images.githubusercontent.com/449600/236436953-c1d84f1b-3a67-4a1e-a663-8de8214e834a.mov

Sandcastle example: https://sandcastle.cesium.com/#c=hVNha9swEP0rRxiNA0FO10K71AkbKYyMjkBT9slfFPkai8m6IJ2TuiX/vXJsZ0nZmMFg3b33dO9JjmP47qRlmKHXZfFjCVIp9B6YoKLSgSYL0ntkn1pF1jNsNe7QwQQs7lqa+HWoRWlPHdYzsiy1RZf2BnepTW0kfWUVRAOYTOEttQDsquYDWkGBL4w2i1rFptgsru6ftEE/t36Disn91C/a1sI1uxmKawBymErupO7siDM+snh2VMzJfqsNzbPo+urLzW0n1EqIDWnLM0Nltsxlpu06iLaThkdymLKUHHIZBxMlDo8trPCeCnzQ65wD7T/tJTu0a87HcPlPzGPYv/R/EPu7s8S8Qoti43ShWW/RC5llUWujc9XE0RJeiYon+gjpjHuuDJ6f62l+y7odHZM4xLTUrziG66MBRYbcGPpuvYo+j0bD9h30jwif0y4APr3NFg+Lx71wMIWRuIGLCzipJaF2229NN3PuQUlWOUToHLlBdyT18ZNBYWjddhp0aveDqP7uDXvJwdm0rn/VxYYcQ+lMJETMWGyMZPTxqlS/QwbK+5qUxB0lyfQWdDb5y9UGZcKfETrPpTF1EmlvmsQBf0YzdLhEiy06I6sAqcdI8svpQ9MQQiRxWNabfuQykVlJd6L7Dg
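
For reference, the Sandcastle boils down to roughly the following (a minimal sketch; the ion asset ID and the show expression are placeholders, not the exact values from the linked example):

```js
// Minimal sketch of the reproduction. The asset ID (1234) stands in for the
// Melbourne point cloud and the show expression is illustrative.
const viewer = new Cesium.Viewer("cesiumContainer");
viewer.scene.debugShowFramesPerSecond = true; // built-in FPS overlay

const tileset = await Cesium.Cesium3DTileset.fromIonAssetId(1234);
viewer.scene.primitives.add(tileset);
viewer.zoomTo(tileset);

// Applying a conditional show style is what triggers the frame-rate drop.
tileset.style = new Cesium.Cesium3DTileStyle({
  show: "${COLOR}.r > 0.5",
});
```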

Browser: Google Chrome Version 113.0.5672.63 (Official Build) (arm64)

The behavior is similar in Safari and Firefox.

Operating System: macOS 13.3.1 (a) on M1 Max

ggetz commented 1 year ago

Thanks for the report @ulrichson!

> I'm certain that the performance used to be much better

Do you mean in a previous release, or just when an M1 machine is not being used?

We have seen some rendering problems on M1 machines, but they tend to be rendering artifacts rather than degraded performance.

In general, readPixels is an expensive operation, and we'd like to optimize picking (hopefully soon) to avoid it as much as possible.

ulrichson commented 1 year ago

@ggetz Glad to hear, thanks!

I mean with a previous Cesium release. To verify, I ran an older version of my project that used Cesium 1.91.0, and there the performance on M1 is good. With 1.105.0 it is worse. I don't know between which versions the drop happened; if it helps, I can try to narrow it down.

Is there a way to disable picking for Cesium3DTileset (assuming it then wouldn't call the expensive readPixels)? I wouldn't use it for point clouds anyway.

ggetz commented 1 year ago

> I mean with a previous Cesium release. To verify, I ran an older version of my project that used Cesium 1.91.0, and there the performance on M1 is good. I don't know between which versions the drop happened; if it helps, I can try to narrow it down.

Thanks for clarifying! And we'd definitely appreciate the help. 1.97 is the first release where I'd suspect this began occurring: a major 3D Tiles and model refactor went out as part of that release.

> Is there a way to disable picking for Cesium3DTileset (assuming it then wouldn't call the expensive readPixels)? I wouldn't use it for point clouds anyway.

Not through the public API. If you're willing to modify the source code, it's possible to skip the 3D Tiles pass when picking.

ulrichson commented 1 year ago

@ggetz I re-tested with 1.97 and I can confirm that the performance drop was introduced with this version. The previous release had better performance.

Can you please give me a hint where in the source code this change would be required?

ggetz commented 1 year ago

There was a fairly large refactor that went in with that change. @j9liu, would you be able to recommend a place to start looking at the changes from ModelExperimental related to picking performance for point clouds?

ulrichson commented 1 year ago

@j9liu @ggetz Any updates on how to disable point cloud picking in the Cesium code? Thanks!

j9liu commented 1 year ago

Hey @ulrichson,

Sorry about the delay. This completely slipped under my radar. I'm happy to take a look -- I should have a moment by the end of this week.

ulrichson commented 1 year ago

Thanks @j9liu 😊

j9liu commented 1 year ago

Hi @ulrichson,

I don't have an M1 Mac, and I can't reproduce this behavior on Windows. But I'm curious to see the debug profiles that demonstrate the difference with and without this condition. readPixels has always been slow, so I'd want to know whether the read itself is slower with the show condition on (if that's what you're seeing). It would be different if the time leading up to readPixels is slower -- I imagine buildDrawCommands in Model could be slowing things down, since it's called every time a style is applied.

In any case, there's an allowPicking option at the individual Model level. Even though there's no equivalent on Cesium3DTileset, the tileset itself uses Model to render tile content. You can experiment with setting it to false in the constructor and seeing if that helps. Let us know how it goes so we can troubleshoot from there.
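
For context, this is roughly what that option looks like on a standalone Model (a sketch; the glTF URL is a placeholder). Since Cesium3DTileset doesn't expose it, for tile content the flag would have to be forced to false in the engine source where the Model is constructed:

```js
// Sketch only: allowPicking on a standalone Model, with a placeholder URL.
const viewer = new Cesium.Viewer("cesiumContainer");
const model = await Cesium.Model.fromGltfAsync({
  url: "../path/to/pointcloud.glb", // placeholder asset
  allowPicking: false, // skip generating pick resources for this model
});
viewer.scene.primitives.add(model);
```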

ulrichson commented 1 year ago

Hi @j9liu, thanks, I'll give it a try!

Here's a snapshot of the profile on my M1 machine when the show is set as in the sandcastle example above:

[Screenshot: Chrome performance profile of the Sandcastle example ("Hello World - Cesium Sandcastle", 2023-07-07)]

ulrichson commented 1 year ago

Update: maybe it's not related to readPixels (and picking) as I thought. The project I'm working on uses the show attribute, and there readPixels doesn't seem to be an issue. Did something possibly change in the shaders that could cause this?

j9liu commented 1 year ago

In 1.97 we switched to a completely different Model implementation, so yes. But I have a hard time understanding what in the new architecture could be slowing things down.

Here's the state of CesiumJS in the 1.96 release. You can see how PointCloud.js was used to render point cloud content, instead of Model. Here is where the point cloud shaders were created.

In the new Model architecture, the shaders of a Model are built incrementally using various "pipeline stages". The pipeline stage responsible for adding point cloud styling code is PointCloudStylingPipelineStage.js. But you can see that, at line 290, the show condition function is derived the same way as it was in PointCloud.js.

So that's why I'm confused. If the new Model architecture was slower as a whole, I would understand, because it takes more time to construct / reconstruct shaders for a model, and that can get exacerbated by picking. But I don't see how simply adding a show styling condition makes everything slower. If you're able to save the performance profiles you gather in Chrome (both with the show condition and without), and attach them, I can look at them more closely.
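
A simple way to capture comparable profiles is to toggle the style on the same loaded tileset while recording in Chrome, e.g. (sketch, assuming `tileset` already exists; the expression is illustrative):

```js
// Sketch: toggle the show condition on an already-loaded tileset so the
// "with" and "without" profiles cover otherwise identical scenes.
const styledShow = new Cesium.Cesium3DTileStyle({
  show: "${COLOR}.r > 0.5", // illustrative condition
});

function applyShowCondition(enabled) {
  tileset.style = enabled ? styledShow : undefined; // undefined removes styling
}

applyShowCondition(true); // record one profile like this...
// applyShowCondition(false); // ...and one like this
```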

ulrichson commented 1 year ago

Sure, thanks for looking into it!

Here's the profile with show: trace-with-show.json.zip

And without: trace-without-show.json.zip

j9liu commented 1 year ago

Thanks @ulrichson ! I'll try to take a look by the end of this week :)

j9liu commented 1 year ago

@ulrichson

Thanks for your patience. I took a look at the profiles and could also see that the readPixels function was taking 3x as much time with the show condition, vs. without.

I'm trying to think of why this could be. Picking actually involves re-rendering the scene in a small area, then sampling that rectangle for an object. So technically, with picking, the scene (and all of its models) is updated and drawn twice. This is definitely not the most efficient method for picking, but perhaps picking is exacerbated by things that slow down the render loop itself.
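
Concretely, every pick call like the one in this sketch triggers that extra render of a small region plus a readPixels on it; a hover handler of this kind (an assumed example, not taken from the Sandcastle) is a typical source of per-frame picks:

```js
// Sketch: a typical hover-pick handler (assumes an existing `viewer`).
// Each scene.pick() renders the scene into a small off-screen framebuffer
// and then reads it back with readPixels.
const handler = new Cesium.ScreenSpaceEventHandler(viewer.scene.canvas);
handler.setInputAction(function (movement) {
  const picked = viewer.scene.pick(movement.endPosition);
  if (Cesium.defined(picked)) {
    // ...highlight or inspect the picked feature
  }
}, Cesium.ScreenSpaceEventType.MOUSE_MOVE);
```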

I've looked at these parts of the profile, and it does seem like the biggest offender is PickFrameBuffer.end (which then calls readPixels), but the time that render takes with the show condition is strangely less than the time without the condition. Granted, these profiles may vary based on how much the mouse is actually moving across the screen during recording. But it's hard to tell why that's happening and what's actually going on.

[Screenshots: side-by-side profile comparison, without the show condition (left) vs. with it (right)]

Unfortunately, in the profiles I can't go any level deeper than "Animation Frame Fired" -- I can't see if there's any bottleneck in particular when the scene renders. But the GPU performance is definitely worse with the show condition. It looks like the longest time the GPU takes without the show condition is 6 ms, whereas it sometimes goes above 100 ms with the show condition.

If I could reproduce this myself, I would go through the Model architecture and try to leave out extraneous parts of the pipeline, like the PickingPipelineStage or the PointCloudStylingPipelineStage, and see what happens. This part of the shader code is what's responsible for the show condition in point clouds. I also wonder if it makes a difference when the show condition is always set to true. In other words, is it the presence of the show condition that slows things down, or the evaluation of the condition itself?

> Update: maybe it's not related to readPixels (and picking) as I thought. The project I'm working on uses the show attribute, and there readPixels doesn't seem to be an issue. Did something possibly change in the shaders that could cause this?

Does this imply that the show attribute still causes issues, even with picking disabled for point clouds?

ilyaly commented 1 year ago

We are experiencing the same performance issue, see #11196. Our tilesets are generated with Agisoft Metashape. Prior to version 1.97 there were no problems, but starting from 1.97 we observe the same behavior as reported in this issue; the only difference is that we do not apply any styling.

ulrichson commented 1 year ago

@j9liu Thanks! I tested with show: 'true' (and also with an expression that always evaluates to true, i.e. show: '${COLOR}.r > -1 && ${COLOR}.r < 2'), and then there's no performance drop. Same observation as without the show condition.
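
Spelled out as full styles, the compared variants look roughly like this (sketch; the filtering expression stands in for the real filter):

```js
// No performance drop with either always-true variant:
const showLiteralTrue = new Cesium.Cesium3DTileStyle({
  show: "true",
});
const showAlwaysTrueExpression = new Cesium.Cesium3DTileStyle({
  show: "${COLOR}.r > -1 && ${COLOR}.r < 2",
});

// The drop appears with a condition that actually filters points out:
const showFiltering = new Cesium.Cesium3DTileStyle({
  show: "${COLOR}.r > 0.7 && ${COLOR}.r < 0.8", // illustrative filter
});
```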

> Does this imply that the show attribute still causes issues, even with picking disabled for point clouds?

It seems so. I made the following change in packages/engine/Source/Scene/Model/ModelRuntimePrimitive.js:

if (model.allowPicking) {
  console.warn('Disabled PickingPipelineStage')
  // pipelineStages.push(PickingPipelineStage);
}

With the PickingPipelineStage disabled, the performance still drops, so it's really not related to picking.

I did another test and tried to fake the show behavior with a conditional color style, i.e. color: '${COLOR}.r > 0.7 && ${COLOR}.r < 0.8 ? rgb(200,200,200) : rgba(0,0,0,0)' (see Sandcastle example). Now the performance doesn't drop, so I think it can be narrowed down to the show behavior. Unfortunately, I can't use this workaround since pointCloudShading doesn't work with transparent colors.
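
For reference, the workaround style amounts to this (sketch, assuming a loaded `tileset`):

```js
// Sketch: faking "show" with a fully transparent color instead of a show
// condition. No frame-rate drop this way, but it can't be combined with
// pointCloudShading, which doesn't handle transparent points.
tileset.style = new Cesium.Cesium3DTileStyle({
  color:
    "${COLOR}.r > 0.7 && ${COLOR}.r < 0.8 ? rgb(200, 200, 200) : rgba(0, 0, 0, 0)",
});
```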

ulrichson commented 1 year ago

Another observation: I checked if pointCloudShading together with show causes the performance drop - but that's not the case.
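
For completeness, the combination that was checked looks roughly like this (sketch, assuming a loaded `tileset`; the condition is illustrative):

```js
// Sketch: point cloud shading enabled alongside a show condition, to check
// whether the combination itself is what triggers the drop (per the
// observation above, it is not).
tileset.pointCloudShading.attenuation = true;
tileset.pointCloudShading.eyeDomeLighting = true;
tileset.style = new Cesium.Cesium3DTileStyle({
  show: "${COLOR}.r > 0.5", // illustrative condition
});
```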

aurivus-ph commented 1 year ago

I have a similar observation with a large performance hit on integrated Intel graphics (in particular Intel® UHD Graphics for 10th Gen Intel® Processors as in the i7-10510U). The issue does not show on the dedicated GPUs I have access to.

I have found a one-line fix that improves the performance a lot in my case:

In packages/engine/Source/Shaders/Model/CPUStylingStageVS.glsl in cpuStylingStage(), comment out the following line:

void cpuStylingStage(inout vec3 positionMC, inout SelectedFeature feature)
{
    float show = ceil(feature.color.a);
    //positionMC *= show; // this line causes the performance hit

    #if defined(HAS_SELECTED_FEATURE_ID_ATTRIBUTE) && !defined(HAS_CLASSIFICATION)
    filterByPassType(positionMC, feature.color);
    #endif
}

Note that you may need to re-build Cesium to re-generate the CPUStylingStageVS.js file. Otherwise the changes won't take effect.

To anyone who is able to reproduce this performance issue, please check if this improves the performance for you.

My theory is as follows:

A proper fix for that would be to actually move the hidden points off-screen, i.e. setting gl_Position accordingly.

katSchmid commented 1 year ago

We see this on all OS and hardware variants. I have a GUI that sets the style dynamically. Even applying a new style with nothing but a point size keeps the frame rate low. readPixels is also big in my profile when setting point colors.

Applying a new default style is a large performance hit on FPS.
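
Hypothetical illustration of the kind of style switch described here (assuming a loaded point-cloud `tileset`):

```js
// Sketch: applying a new style containing nothing but a point size.
// Per the report above, even this keeps the frame rate low.
tileset.style = new Cesium.Cesium3DTileStyle({
  pointSize: 3,
});
```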

katSchmid commented 1 year ago

Any updates? We are now changing our data as a workaround, but we'd love to get this back.

jjrise commented 3 weeks ago

@katSchmid did you ever find any resolution to this?