porisius / FicsitRemoteMonitoring

58 stars 12 forks source link

Crash after running application for a while. `Illegal call to StaticFindObjectFast()` #173

Open saffronjam opened 16 hours ago

saffronjam commented 16 hours ago

Hi again, reporting another crash I got today.

I am running version 1.3.2.

It's hard to say what to do really to replicate this, but I basically had my application running for about 30 minutes observing train logic. My application fetches about 10 different routes every other second:

      this.getSatisfactoryApiStatus(),
      this.getCircuits(),
      this.getFactoryStats(),
      this.getProdStats(),
      this.getSinkStats(),
      this.getPlayers(),
      this.getGeneratorStats(),
      this.getTrains(),
      this.getTrainStations(),

Here is the Stacktrace. let me know if you need more info

Version: 382498, IsEditor: No, IsPerforceBuild: No, BuildConfiguration: Shipping, Launcher: Steam, NetMode: Listen Server, IsUsingMods: Yes, IsSaveGameEdited: No

Fatal error: [File:C:\BuildAgent\work\9fc8da665efb5ffd\UE4\Engine\Source\Runtime\CoreUObject\Private\UObject\UObjectGlobals.cpp] [Line: 350] Illegal call to StaticFindObjectFast() while serializing object data or garbage collecting!

FactoryGameSteam_CoreUObject_Win64_Shipping!StaticFindObjectFast() [C:\BuildAgent\work\9fc8da665efb5ffd\UE4\Engine\Source\Runtime\CoreUObject\Private\UObject\UObjectGlobals.cpp:354]
FactoryGameSteam_CoreUObject_Win64_Shipping!StaticFindObject() [C:\BuildAgent\work\9fc8da665efb5ffd\UE4\Engine\Source\Runtime\CoreUObject\Private\UObject\UObjectGlobals.cpp:493]
FactoryGameSteam_CoreUObject_Win64_Shipping!ResolveName() [C:\BuildAgent\work\9fc8da665efb5ffd\UE4\Engine\Source\Runtime\CoreUObject\Private\UObject\UObjectGlobals.cpp:1169]
FactoryGameSteam_CoreUObject_Win64_Shipping!StaticFindObject() [C:\BuildAgent\work\9fc8da665efb5ffd\UE4\Engine\Source\Runtime\CoreUObject\Private\UObject\UObjectGlobals.cpp:477]
FactoryGameSteam_CoreUObject_Win64_Shipping!FSoftObjectPath::ResolveObjectInternal() [C:\BuildAgent\work\9fc8da665efb5ffd\UE4\Engine\Source\Runtime\CoreUObject\Private\UObject\SoftObjectPath.cpp:593]
FactoryGameSteam_CoreUObject_Win64_Shipping!FSoftObjectPath::ResolveObjectInternal() [C:\BuildAgent\work\9fc8da665efb5ffd\UE4\Engine\Source\Runtime\CoreUObject\Private\UObject\SoftObjectPath.cpp:588]
FactoryGameSteam_FicsitRemoteMonitoring_Win64_Shipping!UFRM_Power::getGenerators() [C:\SatisfactoryModLoader\Mods\FicsitRemoteMonitoring\Source\FicsitRemoteMonitoring\Private\FRM_Power.cpp:236]
FactoryGameSteam_FicsitRemoteMonitoring_Win64_Shipping!AFicsitRemoteMonitoring::getGenerators() [C:\SatisfactoryModLoader\Mods\FicsitRemoteMonitoring\Source\FicsitRemoteMonitoring\Public\FicsitRemoteMonitoring.h:494]
FactoryGameSteam_FicsitRemoteMonitoring_Win64_Shipping!AFicsitRemoteMonitoring::CallEndpoint() [C:\SatisfactoryModLoader\Mods\FicsitRemoteMonitoring\Source\FicsitRemoteMonitoring\Private\FicsitRemoteMonitoring.cpp:822]
FactoryGameSteam_FicsitRemoteMonitoring_Win64_Shipping!AFicsitRemoteMonitoring::HandleEndpoint() [C:\SatisfactoryModLoader\Mods\FicsitRemoteMonitoring\Source\FicsitRemoteMonitoring\Private\FicsitRemoteMonitoring.cpp:865]
FactoryGameSteam_FicsitRemoteMonitoring_Win64_Shipping!AFicsitRemoteMonitoring::HandleApiRequest() [C:\SatisfactoryModLoader\Mods\FicsitRemoteMonitoring\Source\FicsitRemoteMonitoring\Private\FicsitRemoteMonitoring.cpp:571]
FactoryGameSteam_FicsitRemoteMonitoring_Win64_Shipping!``AFicsitRemoteMonitoring::StartWebSocketServer'::`2'::<lambda_2>::operator()'::`3'::<lambda_10>::operator()<uWS::HttpResponse<0> >() [C:\SatisfactoryModLoader\Mods\FicsitRemoteMonitoring\Source\FicsitRemoteMonitoring\Private\FicsitRemoteMonitoring.cpp:311]
FactoryGameSteam_FicsitRemoteMonitoring_Win64_Shipping!ofats::any_detail::handler_traits<bool,uWS::HttpRouter<uWS::HttpContextData<0>::RouterData> *>::large_handler<`uWS::HttpContext<0>::onHttp'::`2'::<lambda_1> >::call() [C:\SatisfactoryModLoader\Mods\FicsitRemoteMonitoring\Source\ThirdParty\uWebSockets\MoveOnlyFunction.h:152]
FactoryGameSteam_FicsitRemoteMonitoring_Win64_Shipping!uWS::HttpRouter<uWS::HttpContextData<0>::RouterData>::executeHandlers() [C:\SatisfactoryModLoader\Mods\FicsitRemoteMonitoring\Source\ThirdParty\uWebSockets\HttpRouter.h:188]
FactoryGameSteam_FicsitRemoteMonitoring_Win64_Shipping!``uWS::HttpContext<0>::init'::`2'::<lambda_3>::operator()'::`2'::<lambda_1>::operator()() [C:\SatisfactoryModLoader\Mods\FicsitRemoteMonitoring\Source\ThirdParty\uWebSockets\HttpContext.h:177]
FactoryGameSteam_FicsitRemoteMonitoring_Win64_Shipping!uWS::HttpParser::fenceAndConsumePostPadded<0>() [C:\SatisfactoryModLoader\Mods\FicsitRemoteMonitoring\Source\ThirdParty\uWebSockets\HttpParser.h:514]
FactoryGameSteam_FicsitRemoteMonitoring_Win64_Shipping!uWS::HttpParser::consumePostPadded() [C:\SatisfactoryModLoader\Mods\FicsitRemoteMonitoring\Source\ThirdParty\uWebSockets\HttpParser.h:698]
FactoryGameSteam_FicsitRemoteMonitoring_Win64_Shipping!`uWS::HttpContext<0>::init'::`2'::<lambda_3>::operator()() [C:\SatisfactoryModLoader\Mods\FicsitRemoteMonitoring\Source\ThirdParty\uWebSockets\HttpContext.h:143]
FactoryGameSteam_FicsitRemoteMonitoring_Win64_Shipping!us_internal_dispatch_ready_poll() [A:\workspace\vcpkg\buildtrees\usockets\src\v0.8.8-af4482ba56.clean\src\loop.c:358]
uv
uv
FactoryGameSteam_FicsitRemoteMonitoring_Win64_Shipping!`AFicsitRemoteMonitoring::StartWebSocketServer'::`2'::<lambda_2>::operator()() [C:\SatisfactoryModLoader\Mods\FicsitRemoteMonitoring\Source\FicsitRemoteMonitoring\Private\FicsitRemoteMonitoring.cpp:334]
FactoryGameSteam_FicsitRemoteMonitoring_Win64_Shipping!TAsyncRunnable<void>::Run() [C:\ue\Engine\Source\Runtime\Core\Public\Async\Async.h:457]
FactoryGameSteam_Core_Win64_Shipping!FRunnableThreadWin::Run() [C:\BuildAgent\work\9fc8da665efb5ffd\UE4\Engine\Source\Runtime\Core\Private\Windows\WindowsRunnableThread.cpp:149]
cosmopetrich commented 14 hours ago

The code which triggered it appears to be this is here: https://github.com/porisius/FicsitRemoteMonitoring/blob/0853a47beeb018d9578e5834bfb917618da6766f/Source/FicsitRemoteMonitoring/Private/FRM_Power.cpp#L234-L236

As the error implies this might have occurred either because the game was saving a related object at the time or because the garbage collector happened to be pruning one of the unused fuel classes. Possibly it's also just because the fuel class wasn't loaded at the time?

Either way it definitely sounds like it might be hard to reproduce.

I haven't used UE Soft Pointers before, so I'm not sure how we're expected to use them. Maybe Get() isn't supposed to work if the underlying class isn't loaded and we're meant to call LoadSynchronous()?

Th3Fanbus commented 14 hours ago

Hmmmmm, looks like we have to cache the result of GetDefaultFuelClasses into a UPROPERTY in advance (on the game thread) or else we run into issues.

porisius commented 14 hours ago

Is this in any relation to, I think it was getFactory, that we did something with the recipes?

I think we ended up speeding up that endpoint by leaps and bounds because of that.

cosmopetrich commented 14 hours ago

Hmmmmm, looks like we have to cache the result of GetDefaultFuelClasses into a UPROPERTY in advance (on the game thread) or else we run into issues.

Possibly yeah. There's also TStrongObjectPtr as an alternative to UPROPERTY. However on UE 5.3 it can't be used outside the main thread. I haven't looked at how uwebsockets handles requests but I assume that we won't be in the main thread when running FRM HTTP handlers.

It's also strange to me that UE doesn't use one of the "safe" methods of FindObject for this. However it uses the regular non-safe method even on 5.5, so it's probably just a gap in my understanding.

Th3Fanbus commented 14 hours ago

Oh, we just need to stop the result of GetDefaultFuelClasses from being GC'd. In this case I think it's because the soft pointers can be resolved but the objects in question are about to get GC'd.

Is this in any relation to, I think it was getFactory, that we did something with the recipes?

I think we ended up speeding up that endpoint by leaps and bounds because of that.

I don't remember but I know I switched this code away from GetAvailableFuelClasses because of performance concerns as well as crashes. The latter still happen, mind you, but I understand why.

cosmopetrich commented 13 hours ago

If we did cache the fuel types on startup then we'd need to do so once for each generator type, wouldn't we? Biomass/coal/fuel plants will all have different values even without dealing with modded generators.

To me, it still feels like something is odd in how we're using the soft pointer.

SoftFuelClass.Get() is the first time we are dealing with the underlying UObject within this scope. It's not like we're finding the object, doing some other stuff, then coming back to find that it's being GC'd. It's also not like we're trying to find the object and getting a null response. We're saying "Hey Unreal, there's this thing that you're telling us can exist but might not exist right now. Can we have one of it please?" and Unreal is replying "NO YOU CAN'T EVEN ASK ME FOR THAT RIGHT NOW IT'S ILLEGAL. I'M SHUTTING EVERYTHING DOWN."

So it seems like either:

While caching/memoising the fuel types would make this crash far less likely since we'd be calling SoftFuelClass.Get() one time rather than (potentially) thousands over the course of a gameplay session, I imagine the same problem could still arise if FRM was started up at the wrong time.

porisius commented 13 hours ago

Something in the headers... it is a UPROPERTY, but protected...

/** Kept for save game compatibility. @see mDefaultFuelClasses */
UPROPERTY()
TArray< TSubclassOf< class UFGItemDescriptor > > mFuelClasses_DEPRECATED;

/** Fuel classes this machine can run on. */
UPROPERTY( EditDefaultsOnly, Category = "Power", meta = ( MustImplement = "FGInventoryInterface" ) )
TArray< TSoftClassPtr< UFGItemDescriptor > > mDefaultFuelClasses;
Th3Fanbus commented 13 hours ago

It's an array of soft class pointers, those do not prevent the classes from being GC'd. I believe this can happen for BP classes (they're unloaded from memory if they're not used)

cosmopetrich commented 13 hours ago

Something in the headers... it is a UPROPERTY, but protected...

I think that's the same as what we're already using. GetDefaultFuelClasses() is defined as

[...] { return mDefaultFuelClasses; }

Re LoadSynchronous() possibly being preferable to Get():

Archengius in the SF Modding Discord:

Use TSoftClassPtr instead, it works the same way but does not cause actual object loading. You can load the object when you actually need it by using LoadSynchronous(), or you can just Get() it if you know it is already in memory

This Reddit comment:

When using a soft reference inside C++, you can call LoadSynchronous().

porisius commented 8 hours ago

@saffronjam I might have a crazy idea, would it be possible to have a copy of your save to try to replicate as many conditions as it took to crash?

I am going to run a PS Script against it and see how long to replicate, and see if a fix I have works...

PowerShell for reference:

# Declare Endpoints
$Endpoints = @(
    "http://localhost:8080/api/getCircuits",
    "http://localhost:8080/api/getFactoryStats",
    "http://localhost:8080/api/getProdStats",
    "http://localhost:8080/api/getSinkStats",
    "http://localhost:8080/api/getPlayers",
    "http://localhost:8080/api/getGeneratorStats",
    "http://localhost:8080/api/getTrains",
    "http://localhost:8080/api/getTrainStations"
)

# Declare for While Loop
$NotCrashed = $true

# While the game hasn't crashed
while ($NotCrashed) {

    # Wait Every Other Second
    Start-Sleep -Seconds 2

    # Poll Endpoint
    foreach ($Endpoint in $Endpoints) {

        # Call URI
        $Response = Invoke-WebRequest -Uri $Endpoint

        # If Not HTTP 200 (Please do not use getCoffee in this)
        if ($Response.StatusCode -ne 200) {
            $NotCrashed = $false
        }

    }
}
Th3Fanbus commented 4 hours ago

@cosmopetrich I replied to two of your messages, and proposed a way to handle this.

To me, it still feels like something is odd in how we're using the soft pointer.

Nothing stops the underlying object from being GC'd.

SoftFuelClass.Get() is the first time we are dealing with the underlying UObject within this scope. It's not like we're finding the object, doing some other stuff, then coming back to find that it's being GC'd. It's also not like we're trying to find the object and getting a null response. We're saying "Hey Unreal, there's this thing that you're telling us can exist but might not exist right now. Can we have one of it please?" and Unreal is replying "NO YOU CAN'T EVEN ASK ME FOR THAT RIGHT NOW IT'S ILLEGAL. I'M SHUTTING EVERYTHING DOWN."

We're in a non-game thread. The GC runs in sync with the game thread (game thread will never have an UObject get GC'd while it's being used). What happened there is that an object got GC'd while a non-game thread was trying to use that object.

So it seems like either:

* We should be doing some kind of check before calling Get() to make sure we can safely call Get().

That is impossible. The check could return true only for Get() to blow up anyway (because multithreaded nondeterminism).

* Get() isn't the right thing to be calling for this use case, and it just happens to work most of the time because the fuel classes are already being loaded by something else.

It's the right thing to be calling, but the rest is correct: it works as long as something else has loaded the classes (objects).

While caching/memoising the fuel types would make this crash far less likely since we'd be calling SoftFuelClass.Get() one time rather than (potentially) thousands over the course of a gameplay session, I imagine the same problem could still arise if FRM was started up at the wrong time.

Instead of memoisation, I suggest simply loading the fuel classes in advance (see below). Then we can use Get() without worrying.

Re LoadSynchronous() possibly being preferable to Get():

Archengius in the SF Modding Discord:

Use TSoftClassPtr instead, it works the same way but does not cause actual object loading. You can load the object when you actually need it by using LoadSynchronous(), or you can just Get() it if you know it is already in memory

This Reddit comment:

When using a soft reference inside C++, you can call LoadSynchronous().

Loading objects is only possible from the game thread. LoadSynchronous() will attempt to load an object from the current thread, and the crash happened in a non-game thread.

The reason I used Get() in https://github.com/porisius/FicsitRemoteMonitoring/commit/84f8692d99a996da8d8427ba3d1fe1b0a96cc462 is because calling Get() is possible from a non-game thread. However, it blows up when the object is about to be GC'd.

If we did cache the fuel types on startup then we'd need to do so once for each generator type, wouldn't we? Biomass/coal/fuel plants will all have different values even without dealing with modded generators.

Yes, but this is "relatively" simple to do: from a game instance/world module, use https://github.com/Th3Fanbus/ScrewIt/blob/master/Source/ScrewIt/Private/Screwer.cpp#L16 to find all subclasses of generators; for each, get its CDO, get the default fuel classes, use LoadSynchronous() (we're in the game thread) to get a strong reference to the fuel classes, and store them in a TSet<> member variable that is marked as UPROPERTY().

No, there's no need to group these by generator class or anything. Having a strong reference stored inside a UPROPERTY() is enough to stop the fuel classes from getting GC'd so we can keep using Get() from non-game threads (for default fuel classes).