CesiumGS / cesium-unity

Bringing the 3D geospatial ecosystem to Unity
https://cesium.com/platform/cesium-for-unity/
Apache License 2.0
347 stars 83 forks source link

Height query API for Unity #507

Closed kring closed 2 days ago

kring commented 6 days ago

Builds on CesiumGS/cesium-native#783 so merge that first. This is the Unity equivalent of CesiumGS/cesium-unreal#1421.

Note the small Reinterop change to support the C# params keyword, which lets you call a method taking an array as if it had an arbitrary number of distinct parameters.

j9liu commented 3 days ago

Well... clang-format isn't installing on CI for some reason, which isn't great 😬 I'm not sure what my changes had to do with it, but here's the error log:

Run pip install clang-format
error: externally-managed-environment

× This environment is externally managed
╰─> To install Python packages system-wide, try apt install
    python3-xyz, where xyz is the package you are trying to
    install.

    If you wish to install a non-Debian-packaged Python package,
    create a virtual environment using python3 -m venv path/to/venv.
    Then use path/to/venv/bin/python and path/to/venv/bin/pip. Make
    sure you have python3-full installed.

    If you wish to install a non-Debian packaged Python application,
    it may be easiest to use pipx install xyz, which will manage a
    virtual environment for you. Make sure you have pipx installed.

    See /usr/share/doc/python3.12/README.venv for more information.

note: If you believe this is a mistake, please contact your Python installation or OS distribution provider. You can override this, at the risk of breaking your Python installation or OS, by passing --break-system-packages.
hint: See PEP [6](https://github.com/CesiumGS/cesium-unity/actions/runs/11059061896/job/30726561913#step:3:7)68 for the detailed specification.
j9liu commented 3 days ago

Hm... and unfortunately I'm running into this error when I'm testing: image

I believe I saw a TODO for this -- is it addressable before the release?

kring commented 3 days ago

I think that creating an async task is not intuitive to the beginner Unity user

Async is never easy, but Unity coroutines should make this pretty easy compared to what users are faced with in cesium-native or cesium-unreal. The main trick is to realize you can wrap the task in WaitForTask and then yield return it. See the tests for an example of this.

I think the latest versions of Unity may support using async/await for coroutines, in which case this would be even easier (but I haven't tried it).

kring commented 3 days ago

I went on quite a debugging odyssey here, so I'm going to write it down for posterity...

I was trying to handle the case that the user calls SampleHeightMostDetailed before the cesium-native Tileset is created. So my first attempt was to do what I did in Unreal: if the cesium-native Tileset isn't created yet, use a runInMainThread to delay until after the next Update (where the Tileset gets created) and check again.

But this didn't work at all. The Tileset was still null inside the runInMainThread. And so the debugging began.

The first thing I noticed was that the runInMainThread was being called immediately (at least in some cases), rather than delaying a frame as I expected. Now, that might not sound too surprising. After all, we are guaranteed to be running in the main thread here, and runInMainThread intentionally invokes its callback immediately if you're already in the main thread.

BUT I happen to know how that mechanism works, and I knew it shouldn't be working in this case, which is why I thought it safe to use a runInMainThread there because it wouldn't be invoked immediately. See how I get myself into trouble? Basically, the AsyncSystem only considers us to be "in the main thread" if we're already executing a runInMainThread, thenInMainThread, or catchInMainThread continuation. By design, it doesn't actually look at what thread it's in or anything like that.

SampleHeightMostDetailed, being the implementation of a C# function, is not called from a continuation. It's called from Unity itself. So even though we're in the main thread, the runInMainThread should add the callback to the queue and invoke it later. But no, I was instead seeing it invoked right away. What the....!

So of course I confirmed my understanding about when we're considered "in the main thread" (it was correct), and then started digging into how we determine whether we're inside one of the main thread continuations. Conceptually it's pretty simple. Just before we call the continuation, we push a thing onto a thread local vector (a stack). Then we call the continuation. Then we pop the thing off the thread local stack. We can determine we're inside a main thread continuation by whether the thing is on the stack. It turned out that every time I ran the tests, there would be another copy of the thing on the stack. So it seemed we were somehow failing to pop the thing off the stack.

While conceptually simple, this whole thing is a little tricky beause of this thread-localness, and I thought that's where it was going wrong. I tried a bunch of things to address some potentially dodgy aspects of it. For example, the thread local variable is declared in a static function in a template class, which is defined in a header. Were we somehow getting a separate copy of the stack per compilation unit due to inlining? It seemed a plausible theory, but fixing that did not make the problem go away.

Eventually with lots of debugging, I realized that there was a managed C# exception being thrown at the critical moment. Unity likes to reload AppDomains a lot (especially while running unit tests), which effectively destroys every C# object. However, our native code sometimes still has a reference to these C# objects. When the native code tries to do anything with these destroyed objects (including simply release our reference to them), .NET throws an ArgumentException with a message like GCHandle value belongs to a different domain. This happens often enough that I guess I've gotten in the habit of ignoring it.

Now what I didn't fully appreciate (even though I now realize that I knew this two years ago and wrote it down in #18!) is what happens when that C# exception gets thrown. The exception propagates up the C# call stack, as usual. But in this case, the C# code was invoked by our native code. There's no mechanism for turning a C# exception into a C++ one, so the C# exception goes straight through the C++ code stack without properly unwinding it 😱! No catch or or finally blocks are called. No RAII destructors are called.

So yeah this is a disaster for most any C++ code. In this particular case, the RAII that was meant to say "we're leaving the continuation in which we know we're in the main thread" was never being called. So the "thing" stayed on the stack. And we were always considered to be in the main thread (on that thread) from then on.

kring commented 3 days ago

Because the general solution to the above (#18) is fairly involved, I hackily worked around it by catching this particular C# exception rather than letting it blow up our C++ call stack. With that, the runInMainThread reliably deferred its callback every time.

However, SampleHeightMostDetailed still was not working right. The runInMainThread callback was still running before the tileset's Update method was called and therefore before the cesium-native Tileset was created when running multiple tests. It worked fine when running each test individually.

More debugging. This time the problem was more mundane. Test A creates Cesium3DTileset-A, does its thing, and the test passes. Test B creates Cesium3DTileset-B and calls SampleHeightMostDetailed on it, which schedules a task with runInMainThread. Next frame, Unity calls Update on Cesium3DTileset-A, which dispatches the runInMainThread scheduled by Cesium3DTileset-B's SampleHeightMostDetailed. Cesium3DTileset-B's Update hasn't been invoked yet, so the Native Tileset is still null.

We could.... wait two frames?

I decided to make SampleHeightMostDetailed just trigger LoadTileset directly instead. That caused some new problems, related to the destroyTilesetOnNextUpdate mechanism used to work around the need to refresh the tileset in the OnValidate event, even though Unity doesn't allow us to destroy game objects or components there. Easy enough to solve those, though.

(By the way, this can theoretically happen in Unreal, too. I need to check that.)

kring commented 2 days ago

This should be ready now.

j9liu commented 2 days ago

Thanks @kring for the fix -- sorry you had to go through that journey, but at least it works great now! 😅

Here's a snippet of the script I tried with the 04_CesiumSubScenes scene, just for my own future reference. I'm merging this now!


public class SampleHeights : MonoBehaviour
{
    public Cesium3DTileset terrain;

    // Start is called before the first frame update
    void Start()
    {
        StartCoroutine(SampleTilesetHeights());
    }

    public IEnumerator SampleTilesetHeights()
    {
        CesiumSamplesFlyToLocationHandler locationHandler = this.GetComponent<CesiumSamplesFlyToLocationHandler>();
        if (locationHandler == null)
        {
            yield return null;
        }

        if (terrain != null)
        {
            Task<CesiumSampleHeightResult> terrainTask =
                terrain.SampleHeightMostDetailed(locationHandler.locations[1], locationHandler.locations[2], locationHandler.locations[3]);
            yield return new WaitForTask(terrainTask);

            Debug.Log("Completed Terrain");

            CesiumSampleHeightResult result = terrainTask.Result;
            if (result != null)
            {
                foreach (string warning in result.warnings)
                {
                    Debug.Log(warning);
                }

                for (int i = 0; i < result.sampleSuccess.Length; i++)
                {
                    if (result.sampleSuccess[i]) {
                       GameObject cube = GameObject.CreatePrimitive(PrimitiveType.Cube);
                       cube.transform.parent = this.gameObject.transform;
                       CesiumGlobeAnchor anchor = cube.AddComponent<CesiumGlobeAnchor>();
                       anchor.longitudeLatitudeHeight = result.longitudeLatitudeHeightPositions[i];
                       cube.transform.localScale = new Vector3(50, 50, 50);
                    } 
                    else
                    {
                        Debug.Log("Could not sample terrain at given position.");
                    }
                }
            }
        }
        else
        {
            Debug.LogError("Terrain not set");
        }
    }
}