Improve lazy loading of implicit subtrees

ptrgags commented 1 year ago

The initial implementation of implicit tiling from 3D Tiles 1.1 is a naive algorithm: it transcodes an entire implicit subtree at once when the content is loaded (this is modeled on how external tilesets are handled). The child subtrees are lazy, however: they are transcoded only when they become visible.

This works best for many small subtrees (how small exactly depends on quadtree vs. octree, how many tiles are available, etc.). For large subtrees, the transcoding can be slow. See for example this forum post.

It would be better if we could do the same sort of lazy loading that Implicit3DTileContent does, but within a subtree.

A couple of possible approaches:

- Modify Implicit3DTileContent so it can be used as a placeholder not just for a child subtree but at any level within the tree. However, this might be a bit tricky due to how the Cesium3DTileContentFactory assumes each tile corresponds to a content URI...
- Or create some new type of content, maybe ImplicitLazy3DTileContent, that acts as a placeholder for the in-between tiles, while Implicit3DTileContent only acts as the placeholder for the root of each subtree.
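To make the idea of lazy expansion within a subtree more concrete, here is a minimal sketch. It is not CesiumJS code: `LazyTile`, `createdTiles`, and the `isAvailable` callback (a stand-in for a tileAvailability lookup) are hypothetical names, and it ignores content, bounding volumes, and child subtrees. It only shows that creating children on first access keeps the number of allocated tiles proportional to what is actually traversed.

```javascript
// Hypothetical sketch, not CesiumJS code. Quadtree tiles inside one subtree
// create their children on first access instead of all at load time.
let createdTiles = 0;

class LazyTile {
  // isAvailable(level, x, y) stands in for a tileAvailability lookup.
  constructor(level, x, y, subtreeLevels, isAvailable) {
    this.level = level;
    this.x = x;
    this.y = y;
    this.subtreeLevels = subtreeLevels;
    this.isAvailable = isAvailable;
    this._children = undefined;
    createdTiles++;
  }

  get children() {
    if (this._children === undefined) {
      this._children = [];
      const childLevel = this.level + 1;
      // Only expand within this subtree; child subtrees are out of scope here.
      if (childLevel < this.subtreeLevels) {
        for (let i = 0; i < 4; i++) {
          const childX = 2 * this.x + (i & 1);
          const childY = 2 * this.y + (i >> 1);
          if (this.isAvailable(childLevel, childX, childY)) {
            this._children.push(
              new LazyTile(
                childLevel,
                childX,
                childY,
                this.subtreeLevels,
                this.isAvailable
              )
            );
          }
        }
      }
    }
    return this._children;
  }
}

// Walk a single path from the root to the deepest level of a 12-level subtree
// in which every tile is available.
const root = new LazyTile(0, 0, 0, 12, () => true);
let tile = root;
while (tile.children.length > 0) {
  tile = tile.children[0];
}
console.log(tile.level, createdTiles); // 11 45
```

Walking one path allocates 45 tiles here, compared to the several million tile slots that a fully available 12-level quadtree subtree contains.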

ptrgags commented 1 year ago

Another thing to note: when we have placeholder tiles, the actual subtree root tile is a child of the placeholder tile (though they have the same bounding volume). I forget the reason for that; it was something about replacing the placeholder tile making things complicated.

That said, if we allow placeholders everywhere, we'd be allocating a lot of extra placeholder tile objects (especially for large worldwide data sets). It might be worth revisiting the idea of replacing the placeholder tile after transcoding the content.

javagl commented 3 months ago

I tried to reproduce and analyze this issue. I did a quick profiling run based on the data that was provided by @bertt in https://community.cesium.com/t/black-screen-on-loading-larger-tileset-with-implicit-tiling/21460 . (Note: The sandcastle still uses the old CesiumJS async API - an updated one is attached below). There are a few unknowns, but I'm reasonably sure that I found the place where the delay is caused: The profiling pointed out Implicit3DTileContent.fromSubtreeJson/expandSubtree as the culprit.

Looking at the data set from that sandcastle:

It uses a single subtree file with 1.3MB, for an implicit quadtree with subtreeLevels=12 (which is equal to the total number of levels)
The subtree has...
- tileAvailability: availableCount:68205
- contentAvailability: availableCount:50812

So I created a dummy test data set with similar characteristics: An implicit quadtree, with 12 levels in a single ~1.3MB .subtree file, with 161721 available tiles and 34969 available contents. Note that this is really dummy data: It always uses the same tile content (a unit square), to isolate the subtree loading. This is a judgement call: It might help to focus on the issue. I hope that it does not distort anything in a way that distracts from the actual issue.
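For a sense of scale, the number of tile slots in a single subtree can be computed directly. The sketch below uses hypothetical helper names (`subtreeTileSlots`, `availabilityBytes`) and only does the arithmetic: a subtree covers a complete quadtree or octree with the given number of levels, and an availability bitstream stores one bit per slot.

```javascript
// Number of tile slots in one implicit subtree: the node count of a complete
// quadtree (branchingFactor 4) or octree (branchingFactor 8).
function subtreeTileSlots(branchingFactor, subtreeLevels) {
  let slots = 0;
  let tilesOnLevel = 1;
  for (let level = 0; level < subtreeLevels; level++) {
    slots += tilesOnLevel;
    tilesOnLevel *= branchingFactor;
  }
  return slots;
}

// One bit per slot, rounded up to whole bytes.
function availabilityBytes(slots) {
  return Math.ceil(slots / 8);
}

const slots = subtreeTileSlots(4, 12);
console.log(slots); // 5592405
console.log(availabilityBytes(slots)); // 699051
```

Two such bitstreams (tile and content availability) come to about 1.4 million bytes, which is in the same range as the ~1.3MB file, assuming both bitstreams are stored explicitly rather than as constants.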

manySubtreeLevels DUMMY DATA 2024-03-05.zip

It can be loaded with this sandcastle:

const viewer = new Cesium.Viewer("cesiumContainer", {
  globe: false
});

const tileset = viewer.scene.primitives.add(
  await Cesium.Cesium3DTileset.fromUrl(
    "http://localhost:8003/tileset.json", {
    debugShowBoundingVolume: true,
    //maximumScreenSpaceError: 100
  })
);

const transform = Cesium.Transforms.eastNorthUpToFixedFrame(
  Cesium.Cartesian3.fromDegrees(-75.152408, 39.946975, 20)
);
const scale = 15.0;
const modelMatrix = Cesium.Matrix4.multiplyByUniformScale(
  transform,
  scale,
  new Cesium.Matrix4()
);
tileset.modelMatrix = modelMatrix;

const offset = new Cesium.HeadingPitchRange(
  Cesium.Math.toRadians(0),
  Cesium.Math.toRadians(-45.0),
  40.0
);
viewer.zoomTo(tileset, offset);

tileset.allTilesLoaded.addEventListener(function() {
  //tileset.debugShowBoundingVolume = true;
  console.log("Done");
});

Wrapping a simple timer around the expandSubtree call and hitting "Run" a few times shows that this call alone can take several seconds. This is considerable (nothing is really "loaded" at this point, except for the (unused/empty) subtree...). Unfortunately, the time varies between 2 seconds and 5 seconds, which makes it nearly impossible to say whether a given change is an actual improvement or just noise. Maybe that single function call will have to be carved out into a "unit test" for a more dedicated analysis.
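Given how much the timings vary, one option for such a dedicated analysis is to repeat the call and look at the spread instead of a single number. The following is a minimal sketch with a hypothetical helper `timeRuns` and a stand-in workload `busyWork`; it does not call into CesiumJS, and the function under test would have to be substituted.

```javascript
// Hypothetical helper: time a synchronous function several times and
// summarize the spread, so a single noisy run is not mistaken for a change.
function timeRuns(fn, runs) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  return {
    min: samples[0],
    // Middle sample (the upper middle one for an even number of runs).
    median: samples[Math.floor(samples.length / 2)],
    max: samples[samples.length - 1],
  };
}

// Stand-in workload; in CesiumJS this would be the code path under test.
function busyWork() {
  let sum = 0;
  for (let i = 0; i < 1e6; i++) {
    sum += Math.sqrt(i);
  }
  return sum;
}

const stats = timeRuns(busyWork, 7);
console.log(stats);
```

Comparing the minimum or median of several runs before and after a change is usually more robust than comparing two individual runs.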

The "hot path" from a profiling run is shown here:

Cesium Implicit Delay

There certainly are a few things that stand out. One is that there are some matrix multiplications that are unnecessary when the tile.transform is the identity matrix (which is the case here). On the one hand, this may be seen as the "wrong 2%" and a premature (micro)optimization. On the other hand ... : Why is this multiplication done there in the first place?!? Similarly, computeBoundingVolume seems to take a considerable amount of time, and one could consider using a simpler version when the transform is the identity matrix, and maybe even a completely specialized one for the regular structure of implicit tilesets. The fact that it seems to spend nearly 10% of the time in ImplicitTileCoordinates.getChildCoordinates may warrant some scrutiny as well.
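The identity shortcut mentioned above could look roughly like this. It is a standalone sketch using plain 16-element column-major arrays and hypothetical function names (`isIdentity`, `multiply`, `computeChildTransform`), not the CesiumJS Matrix4 code; whether the extra check pays off in practice would have to be measured.

```javascript
// 4x4 matrices as 16-element arrays in column-major order.
const IDENTITY = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1];

function isIdentity(m) {
  return m.every((value, i) => value === IDENTITY[i]);
}

// General product a * b of two column-major 4x4 matrices.
function multiply(a, b) {
  const result = new Array(16).fill(0);
  for (let col = 0; col < 4; col++) {
    for (let row = 0; row < 4; row++) {
      let sum = 0;
      for (let k = 0; k < 4; k++) {
        sum += a[k * 4 + row] * b[col * 4 + k];
      }
      result[col * 4 + row] = sum;
    }
  }
  return result;
}

// Fast path: parent * identity equals parent, so the product can be skipped.
function computeChildTransform(parentTransform, localTransform) {
  if (isIdentity(localTransform)) {
    return parentTransform.slice();
  }
  return multiply(parentTransform, localTransform);
}

// A parent transform that translates by (10, 20, 30).
const parent = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 10, 20, 30, 1];
console.log(computeChildTransform(parent, IDENTITY));
```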

However, all this may be "obsolete" when a real "lazy loading" is implemented. But in order to develop some ideas here, someone might have to chime in with some info about that concept of "placeholder tiles", or where such a "lazy content" could come into play.

Updated sandcastle for the forum thread at https://community.cesium.com/t/black-screen-on-loading-larger-tileset-with-implicit-tiling/21460

const viewer = new Cesium.Viewer("cesiumContainer", {
  //globe: false
});

const tileset = await Cesium.Cesium3DTileset.fromUrl(
  "https://storage.googleapis.com/ahp-research/maquette/cesium/buildings_z/tileset.json");

viewer.scene.primitives.add(tileset);
viewer.zoomTo(tileset);

javagl commented 2 months ago

I did create a bit more test data. This is attached here:

Cesium-manySubtreeLevels-2024-04-28.zip

The archive contains some dummy quadtrees and octrees, with different numbers of availableLevels and subtreeLevels. They all have a fixed "leaf rate" of 80, meaning that 80% of their leaf tiles are randomly set to have content. This content is always the same file (this is focussed on the subtree handling for now). The archive also includes a sandcastle for basic performance tests.

NOTE: Some configurations and interactions will certainly cause the browser window to hang up and/or throw an 'out of memory error'. The whole test is pretty contrived, but unless we can do dedicated benchmark runs of the data that causes actual problems, on the machine where the problem appears, and unless we can look into the 'Profiling' of the browser when the benchmark is run, we have to poke around a bit, and find the cases that might cause trouble for real-world tilesets.

The sandcastle shows that there indeed are certain configurations where the structure of the tileset can cause large delays until the initial set of tiles is loaded. For example, it can be seen that for a quadtree with 8 levels and 4 subtree levels, loading the initial set of tiles takes a fraction of a second. With 10 levels and 5 subtree levels, it takes about 8 seconds:

Cesium Subtree Loading

And at the bottom, one can see why: It is requesting all the .subtree files.
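The number of .subtree files that can exist for a given configuration is easy to bound. The sketch below uses a hypothetical helper `maxSubtreeCount` and assumes the worst case in which every possible child subtree is available; the actual number requested depends on availability and on what is visible.

```javascript
// Upper bound on the number of .subtree files for an implicit tileset,
// assuming every possible child subtree is available.
function maxSubtreeCount(branchingFactor, availableLevels, subtreeLevels) {
  let count = 0;
  // Subtree roots sit on levels 0, subtreeLevels, 2 * subtreeLevels, ...
  for (
    let rootLevel = 0;
    rootLevel < availableLevels;
    rootLevel += subtreeLevels
  ) {
    count += Math.pow(branchingFactor, rootLevel);
  }
  return count;
}

console.log(maxSubtreeCount(4, 8, 4)); // 257
console.log(maxSubtreeCount(4, 10, 5)); // 1025
console.log(maxSubtreeCount(4, 12, 12)); // 1
```

So a quadtree with 10 levels and 5 subtree levels can have up to 1024 child subtrees below the root subtree, each of which has to be requested and expanded once it is needed.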

Now, whether or not this is the actual (or only) reason for the problems that have been reported is hard to say. But... when making the window a bit smaller, then it would not request the next subtree level, and all initial tiles would also be loaded in a fraction of a second:

Cesium Subtree Loading Faster

So the time overhead that was caused by the expandSubtree call (mentioned in the previous comment) might only really kick in in these unfortunate situations where many subtrees are loaded at once.

All this is still pretty exploratory. More focussed steps for mitigating this issue might be derived from real-world data and real-world test/benchmark runs. Right now, the expandSubtree call is still a hot candidate for optimizations, and the issue of "loading many subtrees at once" might just be a 'multiplier' here.

bertt commented 2 months ago

FYI I've fixed the original Sandcastle from forum post https://community.cesium.com/t/black-screen-on-loading-larger-tileset-with-implicit-tiling/21460

Sandcastle link

BTW in that demo there is only 1 larger subtree file, so no tree of subtree files

CesiumGS / cesium

Improve lazy loading of implicit subtrees #10939