sinbad / SPUD

Steve's Persistent Unreal Data library
MIT License
308 stars 45 forks source link

SubscribeLevelObjectEvents hits "ensure" #9

Closed poettlr closed 2 years ago

poettlr commented 2 years ago

Hello @sinbad,

There seems to be an Issue in USpudSubsystem::SubscribeLevelObjectEvents(ULevel* Level). I don't know what's exactly wrong due to time constraints on my side but will investigate as soon as possible.

Here is a stack trace I was able to grab via BugSplat:

[2021.08.10-07.55.43:737][691]LogSpudState: Warning: Skipping restore level Map_Hub_P, no data (this may be fine)
[2021.08.10-07.55.43:737][691]LogOutputDevice: Warning: 

Script Stack (0 frames):

[2021.08.10-07.55.44:016][691]LogStats: FPlatformStackWalk::StackWalkAndDump -  0.278 s
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: === Handled ensure: ===
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: 
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: Ensure condition failed: InvocationList[ CurFunctionIndex ] != InDelegate [File:C:\BuildAgent\work\88f80d3383ccc93a\Engine\Source\Runtime\Core\Public\UObject/ScriptDelegates.h] [Line: 556]
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: 
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: Stack: 
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: [Callstack] 0x00007ff7ba2ce157 bg.exe!<lambda_7c081f6173193e48029482a76596c290>::operator()() []
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: [Callstack] 0x00007ff7b55efc54 bg.exe!FSparseDelegateStorage::Add() []
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: [Callstack] 0x00007ff7b4670de5 bg.exe!TSparseDynamicDelegate<FActorDestroyedSignature_MCSignature,AActor,FActorDestroyedSignatureInfoGetter>::__Internal_AddDynamic<USpudSubsystem>() [C:\BuildAgent\work\5fd5dbdab725171e\UnrealEngine\Engine\Source\Runtime\CoreUObject\Public\UObject\SparseDelegate.h:371]
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: [Callstack] 0x00007ff7b46989ae bg.exe!USpudSubsystem::SubscribeLevelObjectEvents() [C:\BuildAgent\work\5fd5dbdab725171e\bg\Plugins\SPUD\Source\SPUD\Private\SpudSubsystem.cpp:791]
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: [Callstack] 0x00007ff7b4684e23 bg.exe!USpudSubsystem::OnPostLoadMap() [C:\BuildAgent\work\5fd5dbdab725171e\bg\Plugins\SPUD\Source\SPUD\Private\SpudSubsystem.cpp:200]
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: [Callstack] 0x00007ff7b467b63f bg.exe!TBaseUObjectMethodDelegateInstance<0,USpudSubsystem,void __cdecl(UWorld *),FDefaultDelegateUserPolicy>::ExecuteIfSafe() [C:\BuildAgent\work\5fd5dbdab725171e\UnrealEngine\Engine\Source\Runtime\Core\Public\Delegates\DelegateInstancesImpl.h:611]
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: [Callstack] 0x00007ff7b85161e0 bg.exe!TMulticastDelegate<void __cdecl(UWorld * __ptr64),FDefaultDelegateUserPolicy>::Broadcast() []
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: [Callstack] 0x00007ff7b8f19971 bg.exe!UEngine::LoadMap() []
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: [Callstack] 0x00007ff7b8ec91b0 bg.exe!UEngine::Browse() []
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: [Callstack] 0x00007ff7b8f38fc4 bg.exe!UEngine::TickWorldTravel() []
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: [Callstack] 0x00007ff7b85fa430 bg.exe!UGameEngine::Tick() []
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: [Callstack] 0x00007ff7b369edab bg.exe!FEngineLoop::Tick() []
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: [Callstack] 0x00007ff7b36b3ecd bg.exe!GuardedMain() []
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: [Callstack] 0x00007ff7b36b3f8a bg.exe!GuardedMainWrapper() []
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: [Callstack] 0x00007ff7b36c7670 bg.exe!WinMain() []
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: [Callstack] 0x00007ff7ba021dde bg.exe!__scrt_common_main_seh() [D:\agent\_work\9\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288]
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: [Callstack] 0x00007ffba6717034 KERNEL32.DLL!UnknownFunction []
[2021.08.10-07.55.44:016][691]LogOutputDevice: Error: [Callstack] 0x00007ffba6e82651 ntdll.dll!UnknownFunction []

I can also provide a UE4MiniDump.dmp if that would help you.

Cheers, Nikolaus

sinbad commented 2 years ago

I can't tell very much from that other than you must be using an old version, because line 791 of SpudSubsystem.cpp isn't SubscribeLevelObjectEvents.

sinbad commented 2 years ago

Assuming your version is actually line 784 i.e. this:

   Actor->OnDestroyed.AddDynamic(this, &USpudSubsystem::OnActorDestroyed);

I believe the error is UE saying that this delegate has already been registered on that actor. That suggests the same level has been loaded twice, or at least that LoadMap() has been called twice with the same level.

poettlr commented 2 years ago

You are indeed right, I'm using an old version (will update as soon as possible). I check if we are actually loading a level twice maybe there is some sublevel that's referenced in a weird way or something similar.

Thanks for looking into that.

TI-asilva commented 2 years ago

Hello Steve,

I am having this same issue with latest code (i.e. my line 784 in SpuSubsystem.cpp is what you showed in your second comment). You are correct, it is loading the same level twice, however, in this case, it is default SPUD code that is requesting the second load in what seems like a reasonable usage pattern (and can be made to happen in the SPUDExamples project/codebase by either moving or deleting StreamLevel2Vol):

  1. I have a very similar setup to your SPUDExamples project however, just 2 rooms side by side and none of the tunnels/rooms/streaming volumes under the 2 main rooms. Each room has a streaming volume around it and there's one on a ramp that connects the two rooms, again, just like yours.
  2. I detach the camera using the console command ToggleDebugCamera. This creates a new spectator actor in the streaming volume, which causes a new relevant actor to be added, which, in turn, causes its level(s) to be requested to load in AddRelevantActor(), even though they are already loaded.
  3. Then I run into the ensure mentioned by the OP for everything in the level that is trying to add second handlers for their Destroy delegates.

As mentioned above, reproducing this is easy in the SPUDExamples level - the default setup does not trigger it because when you run the ToggleDebugCamera command, the new spectator actor overlaps StreamLevel2Vol instead of StreamLevel0, like you'd expect since that's where the character is when you first start PIE. But by simply moving that volume down a bit or deleting it so that the new spectator camera overlaps only StreamLevel0, this will happen 100% repro.

So it seems like maybe there needs to be some logic that checks to make sure a level isn't already loaded before requesting to load it again (maybe in ASpudStreamingVolume::AddRelevantActor()?).

Thank you for sharing a great foundational saving library! -Anthony

TI-asilva commented 2 years ago

Indeed, adding the following at the top of USpudSubsystem::LoadStreamLevel() fixes the issue:

auto StreamLevel = UGameplayStatics::GetStreamingLevel(GetWorld(), LevelName);
if (StreamLevel && StreamLevel->IsLevelLoaded())
{
    // this level is already loaded
    return;
}

Apologies on the formatting - evidently github.com does not have a text format (at least in the comment box) to insert code that respects whitespace. :-/

If this seems like a good change, I'd be happy to submit it as a PR if you are interested. Or if you have another idea or place to do the check, I could take a look at that as well.

Thanks, Anthony

sinbad commented 2 years ago

Hmm, the Request.Requesters.Num() == 1 check is supposed to deal with that but I'll have a look.

For future reference, GitHub uses markdown so you can use ``` on the lines before & after a code block to format the whole block as code rather than just an inline segment

sinbad commented 2 years ago

With the latest SPUDExamples I can't reproduce this. I can reproduce the debug camera causing a load request for StreamMap2 and StreamTransitionHall when StreamLevel2Vol is in its default position, as you described, but if I remove StreamLevel2Vol or move it out of the way, I get no second request for StreamMap0.

TI-asilva commented 2 years ago

That's odd. I will re-confirm and double-check that Requesters size check and see why it isn't catching in my case. Glad you could at least repro the debug camera being added to the wrong volume - is that supposed to happen or would that be considered a bug?

Thanks for the tip on the markdown!

sinbad commented 2 years ago

I don't think it's a bug, the debug camera's initial position must simply overlap with that other volume on instantiation.

TI-asilva commented 2 years ago

OK, still looking, but I think what's happening is that the ref counting in USpudSubsystem (or ASpudStreamingLevel or both) is maybe not working as expected.

When ASpudStreamingVolume calls AddRequestForStreamingLevel() from AddRelevantActor(), it always passes the streaming volume itself and not the actor that is now overlapping (it seems like uSpudSubsystem::AddRequestForStreamingLevel() is expecting the latter). It's the same for WithdrawRequestForStreamingLevel() in RemoveRelevantActor(). So Request.Requesters.Num() has a max of 1, no matter how many unique relevant actors enter the volume - and so every time that function gets called, it will load the level again. It seems like those calls should pass the relevant actor, not the volume since the add/remove stuff appears to be a ref counting system to determine when to load/unload the volumes.

However, one other gotcha is that in ASpudStreamingLevel::RemoveRelevantActor(), you check for RelevantActorsInVolume.Num() == 0 so that you only withdraw the request when all actors have exited the volume. So you get an unbalanced # of request adds vs. withdrawals since AddRelevantActor() adds as long as the new actor is unique. It's almost like the add flow is expecting USpudSubsystem to do the ref counting, whereas the remove is assuming that ASpudStreamingLevel is doing it.

TLDR; Basically, the ensure that started this thread happens when more than one relevant actor enter ASpudStreamingVolume because that causes a reload of a level that's already loaded since the volume is always passed in the add request.

I tried reproing the bug in the examples project by deleting StreamLevel2Vol and now I can't get it to happen there - so at least we're consistent there now. :-P I think the bug still stands, tho - the issue is due to having more than 1 relevant actor enter the volume. In my project, I can create that case by toggling the debug camera.

UPDATE: While the bug is still a bug, I have realized that my assumption that the actors passed to Add/WithdrawRequestForStreamingLevel() should be the overlap instigators was incorrect and that it should in fact be the volumes - i.e. multiple volumes requesting means that if overlap is ended in one volume, the level still stays loaded until all overlaps are ended. And so the bug is still when you have multiple load requests come through for the same level from the same volume.

There are a few different ways to fix this. One is where the volume checks to make sure it only sends an add request on the first relevant actor: image

Another is to add a safeguard in USpudSubsystem::AddRequestForStreamingLevel() that makes sure that while there is only 1 requester, this call didn't happen as a result of the same requester calling it and not being added to Request.Requesters because it's not unique: image

There's also the code I posted in the comment above in LoadStreamLevel() that checks to see if a level is already loaded and early exits if it is.

Being that in this architecture, the ASpudStreamingVolume is a consumer of USpudSubsystem, the safeguards should exist in the latter since other classes could potentially use its functionality and could call the wrong function more than once like the volume is currently doing, which would cause this same ensure to be hit. And since LoadStreamLevel() is protected and not exposed to blueprints, we shouldn't need to worry about that being called incorrectly (although the fix in my previous comment would guarantee a streaming level doesn't get loaded twice). So, maybe the check in the second option of this comment is the way to go? (Or maybe both fixes in this comment because the existing logic in ASpudStreamingVolume would still technically be incorrect? That is, as it stands, it will call AddRequest multiple times, but only call WithdrawRequest once.)

Sorry for the wall of text. :P Also, again, happy to send any/all of these fixes to you in a PR. :)

sinbad commented 2 years ago

OK I see the issue. I missed the fact that due to uniqueness, using 1 as a sentinel wasn't going to work. Instead flipping that to "increasing from zero" is the simplest correct version. The actors in volume check wasn't really hurting since uniqueness was already enforced in the subsystem (once fixed) but I've done it anyway.

https://github.com/sinbad/SPUD/commit/16613d94a0b9daa8a967ee0984609cb2e9c8554f

TI-asilva commented 2 years ago

Awesome, thank you!