CesiumGS / cesium-native

Apache License 2.0

Assistance In Implementing A New Frontend #823

Open netshade opened 8 months ago

netshade commented 8 months ago

I'm trying to create a new frontend for cesium-native, piecing together how an implementer might use it on a new platform. I've made it to the point where I can definitely submit geometries to the system, but the results are visually chaotic enough to suggest that I may have messed up part of the pipeline. I wanted to describe what I did in creating a new consumer of cesium-native in the hopes that someone could double-check that I've implemented the minimum required to render map tiles on the screen, and potentially give me advice on validating a new consumer of tiles.

My goal in this particular exercise is to fetch a given Tileset for a known area from the Google Photorealistic Tileset, and render it on VisionOS in RealityKit. I am attempting to take the GLTF meshes provided by cesium-native and format them appropriately as MeshDescriptor objects attached to ModelEntity objects in a Scene.

What I have done so far:

What I'd like to know is:

  1. Do the steps I've taken thus far seem like a reasonable way to begin integrating with Cesium Native? I've tried to piece the process together from the Cesium for Unreal project and the test suites, but I could definitely believe I've missed some important setup bits. ( One that threw me early on was the importance of Cesium3DTilesContent::registerAllTileContentTypes(). ) A rough sketch of my current setup follows this list.
  2. Are there any non-obvious steps an integrator should ensure are followed when using Cesium Native as a tile / GLTF provider?
  3. When debugging early on, are there any tips you might give to someone in my shoes? For instance, in my case, while I'm definitely providing triangles to the RealityKit renderer, it is all a bit of a mess atm, with many vertices seemingly stretching off into infinity and the texturing appearing off. I'm certain I've missed some small crucial transformation somewhere in the pipeline, but because right now I'm rendering many, many tiles in the process of loading a tileset, it's difficult to isolate the simple case. When someone at Cesium is integrating, do you just get it perfect every time :-p or do you have an isolated simple case you use to test results before moving on to the "load the whole tileset" case?
  4. Right now, I'm more or less rendering a static scene, but I can already see that Cesium3DTilesSelection::ViewUpdateResult semantics will become extremely important for rendering a scene. Is there a simple explanation somewhere of what per-frame or per-user-action updates should be expected when interacting with a ViewUpdateResult? Something on the order of "handle the different tile actions in this order to arrive at an interpretable frame"? Similarly, understanding the desired semantics of the IPrepareRendererResources allocate vs. free methods would be useful. I assume that because it says free, it means free in the sense of "these resources will definitely not be needed anytime soon unless the ViewState brings them back into view". I'm twitchy on this because in my simplistic "load one location" test case, I loaded something on the order of 1,000 tiles and immediately freed half of them, which took me by surprise given I was loading at a static location.
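For context, here is the rough skeleton of that setup ( a sketch, not my exact code: the asset accessor, task processor, and IPrepareRendererResources implementation behind those shared_ptr parameters are my own classes, and the remaining TilesetExternals fields are left at their defaults ):

    #include <Cesium3DTilesContent/registerAllTileContentTypes.h>
    #include <Cesium3DTilesSelection/IPrepareRendererResources.h>
    #include <Cesium3DTilesSelection/Tileset.h>
    #include <Cesium3DTilesSelection/TilesetExternals.h>
    #include <CesiumAsync/AsyncSystem.h>
    #include <CesiumAsync/IAssetAccessor.h>
    #include <CesiumAsync/ITaskProcessor.h>

    #include <memory>
    #include <string>

    std::unique_ptr<Cesium3DTilesSelection::Tileset> createTileset(
        const std::shared_ptr<CesiumAsync::IAssetAccessor>& pAssetAccessor,
        const std::shared_ptr<Cesium3DTilesSelection::IPrepareRendererResources>&
            pPrepareRendererResources,
        const std::shared_ptr<CesiumAsync::ITaskProcessor>& pTaskProcessor,
        int64_t ionAssetID,
        const std::string& ionAccessToken) {
      // One-time registration of the tile content loaders; forgetting this was
      // the setup step that tripped me up early on.
      Cesium3DTilesContent::registerAllTileContentTypes();

      // Wire the platform implementations into the tileset. The credit system,
      // logger, etc. are left at their TilesetExternals defaults here.
      Cesium3DTilesSelection::TilesetExternals externals{
          pAssetAccessor,
          pPrepareRendererResources,
          CesiumAsync::AsyncSystem(pTaskProcessor)};

      // Google Photorealistic 3D Tiles come from a Cesium ion asset; there is
      // also a constructor overload that takes a tileset URL directly.
      return std::make_unique<Cesium3DTilesSelection::Tileset>(
          externals, ionAssetID, ionAccessToken);
    }

From there I build a single ViewState and call tileset.updateViewOffline({viewState}) once, since the scene is static for now.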

Thanks for any guidance on this. I realize this is hardly an "issue" and more of a request for advice, but I thought it would be useful to surface this publicly for anyone else that might be integrating.

kring commented 8 months ago

I haven't analyzed this in full detail, but one thing that jumps out at me from a quick read is this:

For all TileLoadState::Done tiles in the tilesToRenderThisFrame collection of the update result, I fetch the getRenderContent() data. This corresponds with the handles created by my IPrepareRendererResources implementor. Using these handles, I can instruct my RealityKit rendering side to fetch all corresponding ModelEntity objects and place them in my scene.

That approach will cause you to render multiple levels-of-detail simultaneously, which will be a mess. Plus extra tiles that are loaded for caching purposes but don't actually need to be rendered. The solution is to use the ViewUpdateResult. When you initially create renderer resources in IPrepareRendererResources, you need to make sure they're not visible (rendered). Then, after the call to updateViewOffline, you should:

  1. Hide (stop rendering) all of the tiles listed in tilesFadingOut, and
  2. Show (start rendering) all of the tiles listed in tilesToRenderThisFrame.

Step 1 is unnecessary if you're only doing one call to updateViewOffline, but it'll be necessary once you start updating the tile selection as the camera moves.
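In sketch form, that per-update handling looks something like the following. RenderProxy and setVisible are stand-ins for whatever main-thread handle your IPrepareRendererResources returns; they are not cesium-native types:

    #include <Cesium3DTilesSelection/Tile.h>
    #include <Cesium3DTilesSelection/ViewUpdateResult.h>

    // Stand-in for the renderer-side handle returned from prepareInMainThread.
    struct RenderProxy {
      void setVisible(bool visible);
    };

    // Fetch the handle attached to a tile's render content, if any.
    static RenderProxy* getProxy(Cesium3DTilesSelection::Tile* pTile) {
      const Cesium3DTilesSelection::TileRenderContent* pRenderContent =
          pTile->getContent().getRenderContent();
      return pRenderContent
                 ? static_cast<RenderProxy*>(pRenderContent->getRenderResources())
                 : nullptr;
    }

    void applyViewUpdate(const Cesium3DTilesSelection::ViewUpdateResult& result) {
      // 1. Hide tiles that fell out of the selection. A no-op on the very first
      //    update, but required once the camera starts moving.
      for (Cesium3DTilesSelection::Tile* pTile : result.tilesFadingOut) {
        if (RenderProxy* pProxy = getProxy(pTile)) {
          pProxy->setVisible(false);
        }
      }
      // 2. Show exactly the tiles selected for this frame; this set already
      //    resolves LODs, so nothing else should be visible.
      for (Cesium3DTilesSelection::Tile* pTile : result.tilesToRenderThisFrame) {
        if (RenderProxy* pProxy = getProxy(pTile)) {
          pProxy->setVisible(true);
        }
      }
    }

The key point is that visibility is driven entirely by the ViewUpdateResult; tiles whose resources merely exist should stay hidden.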

If that's not the problem, it'd be helpful to see screenshots of what your rendering looks like, because it might provide a clue. In fact, we'd love to see screenshots if you get it working, too! :)

netshade commented 8 months ago

Awesome, thank you so much for the response. :) Do I interpret you correctly in that tilesToRenderThisFrame may include multiple LOD tiles? E.g., is it my responsibility as the rendering entity to ensure that only some subset of the tiles in tilesToRenderThisFrame is rendered? ( It may be helpful to note at this point that, for what I'm doing, I'm essentially doing "collect all tiles in tilesToRenderThisFrame and populate a single scene with them" - this scene is static after a single call to updateViewOffline, and I am attempting no further view updates at this point. )

Thank you for the note on ordering with tilesFadingOut and tilesToRenderThisFrame, that is super helpful.

As far as screenshots, I am embarrassed to be showing this to professionals, but, here is a screenshot of the scene before rendering the tile geometries, with a teapot at 0, 0, -1 world position for reference:

( Screenshot 2024-03-03 at 9:30:45 PM - the scene before the tile geometries are added )

and here is what it looks like after the geometry renders in the scene:

https://github.com/CesiumGS/cesium-native/assets/3809/4490be0b-54ca-4509-8f68-8f5a71aba326

What I've done here is place an Entity ( essentially what I understand to be a graph node in RealityKit parlance ) at the world position 0, 0, -1, and all the tile geometries are children of that node. My presumption was that the tiles would be presented such that I would be looking at them as if I were hovering 200m above the surface of the desired lat/lng. Instead it seems like I'm in the middle of those geometries, and they are weirdly placed with respect to each other. The texturing also seems off, but at the moment I'm leaving that as a separate problem.

The actual colored lines you see in that movie are artifacts of a debugging facility that is part of RealityKit ( Surfaces, Occlusion lines, etc. all can be rendered ) so you can see that I'm still in the test "room" that is used to simulate an AR session.

kring commented 8 months ago

Do I interpret you correctly in that tilesToRenderThisFrame may include multiple LOD tiles?

No, sorry for the confusion, tilesToRenderThisFrame will only have the actual tiles that you should render. There won't be any conflicting LODs in there. I think I misread your original statement and thought you were walking through all the loaded tiles (i.e. forEachLoadedTile), not just the tilesToRenderThisFrame.

My presumption would be that all tiles would be presented such that I would be looking at them as if I was hovering 200m above the surface of the desired lat/lng.

The coordinate system of the tiles - in almost all cases (certainly for Google Photorealistic 3D Tiles) - is ECEF. The origin is the center of the Earth. Actually, each tile has geometry relative to its own local coordinate system, and a transformation that takes that to ECEF. Perhaps you're not including the tile's transformation (https://github.com/CesiumGS/cesium-native/blob/c9cf5430d5a1d5a2bbdf667169e2c47316f7a8b0/Cesium3DTilesSelection/include/Cesium3DTilesSelection/Tile.h#L344) when rendering it in the world?

If you want to render the world with a different coordinate system, not centered on the center of the Earth, LocalHorizontalCoordinateSystem may help. Basically you construct an instance of it the way you like, call getEcefToLocalTransformation, and multiply the result with the tile's transformation. Use that as your model matrix when rendering the tile. Something like:

glm::dmat4 modelMatrix = localHorizontal.getEcefToLocalTransformation() * tile.getTransform();
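Spelled out a little more ( a sketch: longitudeDegrees, latitudeDegrees, heightMeters, and tile are placeholders, and this assumes the constructor overload that takes a Cartographic origin with the default east/north/up axes - check LocalHorizontalCoordinateSystem.h for the overloads in your version ):

    #include <CesiumGeospatial/Cartographic.h>
    #include <CesiumGeospatial/LocalHorizontalCoordinateSystem.h>

    #include <glm/mat4x4.hpp>

    // Local origin at the point of interest; the height is meters above the
    // WGS84 ellipsoid. The default axes are +X east, +Y north, +Z up.
    CesiumGeospatial::Cartographic origin = CesiumGeospatial::Cartographic::fromDegrees(
        longitudeDegrees, latitudeDegrees, heightMeters);
    CesiumGeospatial::LocalHorizontalCoordinateSystem localHorizontal(origin);

    // Per tile: model matrix = ECEF-to-local * the tile's local-to-ECEF transform.
    glm::dmat4 modelMatrix =
        localHorizontal.getEcefToLocalTransformation() * tile.getTransform();
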
timoore commented 8 months ago

As someone who was in your shoes a little over a year ago, first let me congratulate you for taking this on. It's rather intimidating to be staring at TilesetExternals, figuring out where to start.

With what you've done so far, it's hard to know if your translation from cesium-native's glTF to the RealityKit representation is working correctly. Can you change your scene so that you're looking at the Earth from 12,000+ kilometers away? From that viewpoint it's quickly obvious whether you have something that looks Earth-like or not. Or, as happened when I first tried Google Photorealistic Tiles, you might see octants of the Earth weirdly rotated but placed correctly. It's also progress to see a white or black globe without textures; at least you know that you're getting somewhere.

There's also a fairly new entry point GltfReader::loadGltf() which can be used to load glTF models or even individual tiles out of a 3D Tiles 1.1 tileset (if you can get at the glb files) and see if that translation is working correctly. It's not dead-simple to use that function as it returns a future, but you are already an old hand at dealing with that.
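From memory, the shape of it is roughly the following - treat the parameter order of loadGltf, and the exact reader/result type names, as assumptions to be checked against GltfReader.h rather than as a verified reference:

    #include <CesiumAsync/AsyncSystem.h>
    #include <CesiumAsync/IAssetAccessor.h>
    #include <CesiumGltfReader/GltfReader.h>

    #include <memory>
    #include <string>

    // Load a standalone .glb / .gltf (e.g. a Sandcastle sample, or a glb pulled
    // out of a 3D Tiles 1.1 tileset) through the same reader the tile pipeline
    // uses, so the glTF -> RealityKit conversion can be tested in isolation.
    CesiumGltfReader::GltfReaderResult loadModelForTesting(
        const CesiumAsync::AsyncSystem& asyncSystem,
        const std::shared_ptr<CesiumAsync::IAssetAccessor>& pAssetAccessor,
        const std::string& url) {
      CesiumGltfReader::GltfReader reader;
      CesiumGltfReader::GltfReaderOptions options;
      // loadGltf resolves external buffers/images via the asset accessor and
      // returns a Future; waitInMainThread() blocks this (test) thread while
      // still pumping main-thread continuations.
      return reader
          .loadGltf(asyncSystem, url, /*headers*/ {}, pAssetAccessor, options)
          .waitInMainThread();
    }

    // result.model is a std::optional<CesiumGltf::Model>; result.errors and
    // result.warnings report anything that went wrong along the way.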

I'll also suggest without modesty that looking at https://github.com/timoore/vsgCs might be helpful because the target system, Vulkan Scene Graph, is much simpler than Unreal.

netshade commented 8 months ago

Ooh, excellent @kring, thank you for the advice on LocalHorizontalCoordinateSystem.

And thanks so much @timoore, I will check out both GltfReader and vsgCs. Really appreciate the advice.

netshade commented 8 months ago

@timoore @kring I've made some good progress using GltfReader as my entry into the pipeline, in concert with several glb and gltf files from the Cesium Sandcastle. This gives me a much better cycle time on debugging my integration than going straight to loading tiles via Cesium ion. Thanks so much for the advice. If you're not opposed, I'll continue adding information to this issue as a way to imperfectly document the process for anyone else that might be working in this space.

netshade commented 7 months ago

So I wanted to check back in here and say that I'm largely seeing what I'd "like" in general, very much thanks to both of your advice. vsgCs was super helpful in navigating what seem to be some implicitly desirable actions ( notably, these ), as well as for cross-checking what I was up to against what you've done. LocalHorizontalCoordinateSystem has also been extremely useful, though now I'm encountering something surprising that may be related.

Currently, I am rendering regions centered at a given lat, lng, and height. I construct a view request at a given Cartographic, where the view focus is the lat, lng, and height, and the viewer position is roughly 500m in the air "above" it. The desired effect is to render a set of tiles "laid out" on a flat plane, as if you were viewing a topographic map on a table.
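Concretely, the view request I'm constructing looks roughly like this ( paraphrased from my code, not verified line-for-line; the 60/45 degree fields of view are just what I happen to be using ):

    #include <Cesium3DTilesSelection/ViewState.h>
    #include <CesiumGeospatial/Cartographic.h>
    #include <CesiumGeospatial/Ellipsoid.h>

    #include <glm/glm.hpp>

    Cesium3DTilesSelection::ViewState makeTopDownViewState(
        double longitudeDegrees, double latitudeDegrees, double groundHeightMeters,
        const glm::dvec2& viewportSize) {
      using namespace CesiumGeospatial;
      // Focus point on the "ground", and an eye hovering ~500m directly above it
      // (heights here are relative to the WGS84 ellipsoid).
      Cartographic focus = Cartographic::fromDegrees(
          longitudeDegrees, latitudeDegrees, groundHeightMeters);
      Cartographic eye = Cartographic::fromDegrees(
          longitudeDegrees, latitudeDegrees, groundHeightMeters + 500.0);
      glm::dvec3 focusEcef = Ellipsoid::WGS84.cartographicToCartesian(focus);
      glm::dvec3 eyeEcef = Ellipsoid::WGS84.cartographicToCartesian(eye);

      // Look straight down at the focus; use a "roughly toward the north pole"
      // up vector, made orthogonal to the view direction.
      glm::dvec3 direction = glm::normalize(focusEcef - eyeEcef);
      glm::dvec3 north(0.0, 0.0, 1.0);
      glm::dvec3 up = glm::normalize(north - glm::dot(north, direction) * direction);

      return Cesium3DTilesSelection::ViewState::create(
          eyeEcef, direction, up, viewportSize,
          glm::radians(60.0), glm::radians(45.0));
    }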

What I have found is that, when providing elevation values ( acquired via the Google Elevation API ) for both the tile request and the local horizontal coordinate system, I frequently see the tiles "hovering" above or sitting "below" the origin, but never "at" the origin as I would expect LocalHorizontalCoordinateSystem to behave. The fact that the behavior changes for different lat/lngs makes me suspect that the elevation values I'm providing to LocalHorizontalCoordinateSystem are not "correct" - that is to say, I believe the height values are relative to the geoid, and perhaps Cesium and the tiles expect the height to be relative to the WGS84 ellipsoid instead? And that it is on me to find a way to compute the ellipsoid-to-geoid difference term somehow? ( Or maybe I should be saying provide rather than compute :-p )

Thanks for any assistance. I expect you must indirectly have to correct a lot of ignorance about coordinate systems on the part of users, and it's truly appreciated.

netshade commented 7 months ago

As a side note, seeing the Google Photorealistic tiles rendered via the VisionOS renderer is quite cool. Screenshots don't really do it justice, the rendering makes the world feel like a toy in front of you.

timoore commented 7 months ago

Your explanation of the geoid-vs-ellipsoid mismatch is quite likely correct. Mapping elevations are in MSL, i.e., the height above the geoid or something close to it. LocalHorizontalCoordinateSystem heights are relative to the WGS84 ellipsoid, which is not the same thing.

There is currently a pull request, #835, that provides a function to get the height of the EGM96 geoid above the ellipsoid; EGM96 is a standard, if somewhat old and low-resolution, model of global sea level. You could merge that branch in and try it out.
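The correction itself is just additive: height above the ellipsoid = MSL height + the geoid undulation at that spot. In sketch form ( sampleGeoidHeightMeters is a hypothetical stand-in for whatever lookup that PR ends up exposing; take the real class and method names from the PR, not from here ):

    #include <CesiumGeospatial/Cartographic.h>

    // Hypothetical helper: look up the EGM96 geoid undulation (meters) at a
    // location, e.g. via the grid added in #835.
    double sampleGeoidHeightMeters(const CesiumGeospatial::Cartographic& position);

    // Convert a Google Elevation API (MSL-ish) height into a height above the
    // WGS84 ellipsoid, which is what LocalHorizontalCoordinateSystem and the
    // tiles expect.
    double ellipsoidalHeightMeters(
        const CesiumGeospatial::Cartographic& position, double mslHeightMeters) {
      return mslHeightMeters + sampleGeoidHeightMeters(position);
    }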

timoore commented 7 months ago

As a side note, seeing the Google Photorealistic tiles rendered via the VisionOS renderer is quite cool. Screenshots don't really do it justice, the rendering makes the world feel like a toy in front of you.

Nevertheless, we'd love to see screenshots and/or vids. It's quite an accomplishment to go from nothing to garbage to real 3D terrain. Congratulations!

netshade commented 7 months ago

This was exactly it. :) The Google Elevation API in concert with the ellipsoidal correction from EGM96 has put everything within the same rough understanding of the origin. Thanks so much.

netshade commented 7 months ago

Just adding a note here about IPrepareRendererResources - when I first encountered it, I went in with a very prescriptive idea of what loadThread and mainThread meant. The fact that updateViewOffline existed made me think the two methods were exclusive to each other in functionality - that is to say, that prepareInLoadThread would be called for updateView and prepareInMainThread would be called for updateViewOffline. When I learned I was wrong about that, I then assumed that tiles in prepareInLoadThread would have all the resources necessary to render to screen, separate from prepareInMainThread. This also was wrong, as I learned that not all textures had been loaded at that point, so it became clear to me that an implementer should:

While I see why these two methods are named what they are, I think it could be helpful to consider either documenting the semantic differences between what these two methods expect to happen, or renaming them to make the intent a bit more clear. ( I supply these as examples of what I mean, but fully admit I do not think they are great examples: prepareMeshCoordinates vs. prepareInLoadThread and prepareMeshTextures vs. prepareInMainThread. )
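To make the split concrete, this is roughly the shape I ended up with ( heavily abbreviated: the interface also has raster-overlay methods that I'm omitting, and the exact virtual signatures should be taken from IPrepareRendererResources.h for your version rather than from this sketch ):

    #include <Cesium3DTilesSelection/IPrepareRendererResources.h>
    #include <Cesium3DTilesSelection/Tile.h>
    #include <CesiumAsync/AsyncSystem.h>

    #include <any>
    #include <glm/mat4x4.hpp>

    class RealityKitResourcePreparer
        : public Cesium3DTilesSelection::IPrepareRendererResources {
    public:
      // Worker thread: do everything that doesn't need the main thread /
      // RealityKit - walk the CesiumGltf::Model, build vertex and index
      // buffers, etc. Not all textures are necessarily resident yet.
      CesiumAsync::Future<Cesium3DTilesSelection::TileLoadResultAndRenderResources>
      prepareInLoadThread(
          const CesiumAsync::AsyncSystem& asyncSystem,
          Cesium3DTilesSelection::TileLoadResult&& tileLoadResult,
          const glm::dmat4& transform,
          const std::any& rendererOptions) override;

      // Main thread: finish what has to happen on the render side - create the
      // MeshDescriptor / ModelEntity, attach textures - and return the handle
      // that the ViewUpdateResult handling will later show and hide.
      void* prepareInMainThread(
          Cesium3DTilesSelection::Tile& tile,
          void* pLoadThreadResult) override;

      // Called when the tile's resources are no longer needed; the tile may be
      // reloaded later if the view brings it back.
      void free(
          Cesium3DTilesSelection::Tile& tile,
          void* pLoadThreadResult,
          void* pMainThreadResult) noexcept override;

      // ...raster overlay methods omitted...
    };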

netshade commented 1 month ago

I realized I never actually posted a video of what I got working here :-P For anyone else that comes along, here's Cesium Native working on VisionOS

https://github.com/user-attachments/assets/697924f2-b450-4873-80d2-cd7b8fa31f17

timoore commented 1 month ago

Very cool!