CesiumGS / cesium-unity

Bringing the 3D geospatial ecosystem to Unity
https://cesium.com/platform/cesium-for-unity/
Apache License 2.0

SqliteCache Memory Leak when using Netcode For GameObjects #397

Closed Reag closed 9 months ago

Reag commented 9 months ago

My product has a bit of an unusual use case for Tilesets. Specifically, we're using it as the backing for displaying digital twins in VR in a shared meeting room. To back this, we implemented Netcode For GameObjects (NGO) to handle syncing the client and server scene. Because this project predates the Cesium-Unity repo, we implemented the digital twins with the very outdated Nasa-Tiles library.

Recently, I had the free time to attempt to replace our outdated library with the Cesium one. After a few hours, we successfully built a working prototype! However, when we began local testing, we immediately ran into problems when using multiple Unity Clients on the same machine.

It would appear that the SqliteCache is locked to whichever Unity instance ran first on the machine. This means that any Unity client started afterwards cannot access tiles at all. We observed the following error in the Player.log:

09:21:12.556 [Log] : [2024-01-29 09:21:12.557] [error] [SqliteCache.cpp:441] database is locked
UnityEngine.StackTraceUtility:ExtractStackTrace () (at C:/build/output/unity/unity/Runtime/Export/Scripting/StackTrace.cs:37)
UnityEngine.DebugLogHandler:LogFormat (UnityEngine.LogType,UnityEngine.Object,string,object[])
_Absorber.Core.Utility.Logger.AbsorberLogger:LogFormat (UnityEngine.LogType,UnityEngine.Object,string,object[]) (at C:/DEV/proj/src/OverSight_Unity/Assets/_Absorber/Core/Utility/Logger/AbsorberLogger.cs:128)
UnityEngine.Logger:Log (UnityEngine.LogType,object)
UnityEngine.Debug:Log (object)
Reinterop.ReinteropInitializer:UnityEngine_Debug_CallLog_FA05wu8x__otZNsgdHTnU9A (intptr) (at ./Library/PackageCache/com.cesium.unity@1.7.1/Runtime/generated/Reinterop/Reinterop.RoslynSourceGenerator/ReinteropInitializer.cs:58900)

(Filename: ./Library/PackageCache/com.cesium.unity@1.7.1/Runtime/generated/Reinterop/Reinterop.RoslynSourceGenerator/ReinteropInitializer.cs Line: 58900)

And the following memory leak: [image]

While this error probably won't appear in any reasonable production environment, it does mean that our developers cannot test builds locally. It also means some of our integration tests will fail. Would it be possible for this SQLite cache to be shared across processes on the same machine, so that multiple clients could access it? It might even allow for interesting local optimizations to get around Unity restrictions, like a separate C++ program that calculates screen space error and traversal outside of the Unity execution context!

I eagerly look forward to your response, as the Cesium-Unity project is infinitely faster than our current outdated solution, and I would love to be able to port over.

kring commented 9 months ago

@Reag I suggest you just ignore the "database is locked" errors. They'll prevent the request cache from working for whatever process runs second, but this is harmless beyond the performance impact. A single SQLite database is not meant to be shared across processes, and the cost of fixing this would likely far outweigh the benefit.
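For anyone wanting to see where that message comes from, here is a minimal Python sketch of SQLite lock contention. It is not Cesium's code and makes no claim about how SqliteCache.cpp opens its database; the `cache` table and the `demo_locked_database` helper are hypothetical, and the two connections in one process stand in for two separate application instances.

```python
import os
import sqlite3
import tempfile


def demo_locked_database():
    """Return the error text a second writer sees while the first holds the lock."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "cache.sqlite")

        # "First instance": create the (hypothetical) cache table, then take
        # an exclusive lock and keep the transaction open.
        first = sqlite3.connect(path, isolation_level=None)
        first.execute("CREATE TABLE cache (key TEXT PRIMARY KEY, value BLOB)")
        first.execute("BEGIN EXCLUSIVE")
        first.execute("INSERT INTO cache VALUES ('a', x'00')")

        # "Second instance": timeout=0 means fail immediately instead of
        # waiting for the lock to be released.
        second = sqlite3.connect(path, timeout=0, isolation_level=None)
        try:
            second.execute("INSERT INTO cache VALUES ('b', x'01')")
            result = "ok"
        except sqlite3.OperationalError as exc:
            result = str(exc)
        finally:
            second.close()
            first.execute("ROLLBACK")
            first.close()
        return result


if __name__ == "__main__":
    print(demo_locked_database())  # database is locked
```

The second writer simply fails and carries on, which matches the behaviour described above: the process that loses the lock goes without the cache rather than crashing.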

I don't know of any reason that would also lead to a memory leak, so I suspect it's unrelated. Does it only occur when you run two copies of your application at once?

kring commented 9 months ago

I just read your message again and saw where you said the tiles aren't loading at all in the second process. That's very strange indeed. Can you reproduce that by running two copies of the Cesium for Unity Samples project as well, or only with your own project?

Reag commented 9 months ago

Interestingly enough, it doesn't happen in the samples. How odd. I'll do some more research and see if I can isolate why it happens in our scene (where we simply build a georeference and add a Cesium 3D Tileset component) and not in the samples. The sample project doesn't produce the SQL error either. If there's any information I can provide that would be helpful in isolating this, please let me know.

[Update]: I can't reproduce it consistently. I was able to cause it to freeze with one tileset for a short time before it seemed to recover. Perhaps it has something to do with our stress test, which loads around 70 different tilesets, each ranging between 5 and 100 GB.

Reag commented 9 months ago

Managed to track the problem down. We had a network RPC rippling across the code that would enable and disable tilesets under certain conditions. This caused various tilesets to set their game object state to active and inactive fairly rapidly. As soon as this RPC was removed, the SQL error stopped appearing. It would seem that this issue is caused when a lot of tilesets are toggled on and off rapidly.
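For anyone who does need to drive tileset activation from the network, one way to avoid this kind of rapid flip-flopping is to debounce the requests. The following is a minimal Python sketch of that idea only, not what we shipped and not a Cesium or NGO API; `ToggleDebouncer`, `apply`, and `min_interval` are hypothetical names, and in Unity `apply` would correspond to something like setting the tileset's game object active state.

```python
class ToggleDebouncer:
    """Coalesce rapid enable/disable requests into at most one change per interval."""

    def __init__(self, apply, min_interval):
        self._apply = apply                # callback that performs the real toggle
        self._min_interval = min_interval  # minimum seconds between applied changes
        self._applied = None               # last state actually applied
        self._pending = None               # most recent requested state, if unapplied
        self._last_change = None           # time of the last applied change

    def request(self, active, now):
        """Record the desired state and apply it if enough time has passed."""
        self._pending = active
        self.flush(now)

    def flush(self, now):
        """Apply the pending state when the interval allows; call this periodically."""
        if self._pending is None or self._pending == self._applied:
            self._pending = None  # nothing to do, or the request is redundant
            return
        too_soon = (
            self._last_change is not None
            and now - self._last_change < self._min_interval
        )
        if too_soon:
            return  # keep the request pending for a later flush
        self._apply(self._pending)
        self._applied = self._pending
        self._last_change = now
        self._pending = None


if __name__ == "__main__":
    calls = []
    debouncer = ToggleDebouncer(calls.append, min_interval=1.0)
    debouncer.request(True, now=0.0)   # applied immediately
    debouncer.request(False, now=0.1)  # too soon, held as pending
    debouncer.request(True, now=0.2)   # cancels the pending disable
    debouncer.request(False, now=0.3)  # held as pending again
    debouncer.flush(now=1.5)           # interval elapsed, disable is applied
    print(calls)  # [True, False]
```

Four requests in 0.3 seconds collapse into two actual state changes, so the underlying object is toggled far less often than the RPC fires.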

In our case this was an overly aggressive pruning algorithm that we used for the older, outdated library we were using. Outside of some pathological use case, no Unity app should need to toggle enough tilesets to cause this bug. As such, I'm going to close this issue as complete.