derkork / godot-statecharts

A state charts extension for Godot 4
MIT License
673 stars 33 forks source link

getting unreproducible initialization failures #93

Open BasHamer opened 3 months ago

BasHamer commented 3 months ago

when it fails then it fails for all initializations.

below is my Godot setup and the code. When this condition occurs I find that the mob never enters Move State. None of the mobs do. When the bug is not happening everything works great. I've been chasing this thing down for a while now trying to figure out (6+ weeks) where the race condition could be and I'm getting pretty confident that it is in the Compound State. When looking in the Debugger (C#) I have a hard time telling what is going on as the StateChart and Compound State show as Node and I can't figure out how to view the Compound State state.

image

`public override void _Ready() { ... StateChart = StateChart.Of(GetNode("StateChart")); ... } internal void _on_spawn_state_entered() { AnimationPlayer.Play("spawn"); }

private void P_AnimationFinished(StringName animName)
{
    if (animName == "spawn")
    {
        StateChart.SendEvent("ToMove");
    }
   ...
}

internal void _on_move_state_entered()
{
    ...
}

internal void _on_move_state_processing(float delta)
{
    ...
}

`

video should be ready in 40 min https://youtu.be/beAuaZE9S_g (unlisted)

derkork commented 3 months ago

Thanks a lot for going through the trouble of making a video! So if I get this correctly, the problem is that for some reason the move state is not entered consistently all of the time. So now I see two ways that this could go wrong:

  1. There is some bug in the library that randomly occurs. In this case we should see an event being sent to the state chart but no transition would result from this.
  2. The library works correctly but the event that should trigger the transition is not sent to the state chart - hence no transition happens.

In either case the solution revolves around the appropriate event being sent so a state change can happen. So as a first step I think we'll need to check that the event is indeed sent at the right time. As you have pointed out there isn't a lot to see in the C# debugger. This is because the C# scripts are just a very thin wrapper around the core library which is written in GDScript. So all the "magic" happens in GDScript land and is therefore invisible to the C# debugger (arguably there isn't really a lot to see on the GDScript side as well). Nevertheless, we have some debugging aids that can help with this.

  1. We can use the built-in state chart debugger to see what's going on. Every state chart has a "track in editor" flag. If we turn this on, we can see the chart in the editor while the game is running:

image

There is a history panel showing what events were received and how the state chart reacted to this.

image

  1. We can drag a state chart debugger directly into the scene and connect this to the state chart. This gives us a similar view than the in-editor debugger but right in the game:

image image image

  1. We can listen to the event_received signal of the StateChart and print whenever we receive an event in code. This can help us see whether we get the event in the first place:

image

and then:


    // this is for debugging purposes so we can see when some event is received
    private void OnEventReceived(StringName evt)
    {
        GD.Print("Event received: " + evt);
    }

I have created a small demo project showing all these options if you want to see it in action:

state_chart_debugging.zip

I think with this we should be able to see what events go into the state chart and how it reacts to these and this will hopefully help in finding the root cause of the problem and coming up with a fix.

BasHamer commented 3 months ago

Thanks for the response. I think that it might be simpler as in that the state chart never entered the default state of the compound state. No default state, no animation gets started and nothing else happens. It is not consistent between starts of the application but it is consistent for every run of the application. meaning either it works for hundredths (thousands) of state charts, or it works for none.

had to rig it a little different bugged state

image

good state

image

no code change between the two

for reference, I got it to display like a healthbar. Using a SubViewport and a Sprite3d set to billboard. image

derkork commented 3 months ago

These screenshots are very interesting. I checked the initialization code of the state chart which looks like this:

func _ready() -> void:
    if Engine.is_editor_hint():
        return 

    # check if we have exactly one child that is a state
    if get_child_count() != 1:
        push_error("StateChart must have exactly one child")
        return

    # check if the child is a state
    var child:Node = get_child(0)
    if not child is State:
        push_error("StateMachine's child must be a State")
        return

    # initialize the state machine
    _state = child as State
    _state._state_init()

    # enter the state
    _state._state_enter.call_deferred()

    # if we are in an editor build and this chart should be tracked 
    # by the debugger, create a debugger remote
    if track_in_editor and OS.has_feature("editor"):
        _debugger_remote = DebuggerRemote.new(self)

So you can see that the only way for this to exit without printing an error would be if the state chart runs in the editor. Everything else prints an error message. But who knows, so I added a hard return directly at the start, so that not even the initialization code would run. This broke the state chart of course, but I saw a crucial difference between your screenshot of the state chart debugger and mine:

image

image

Mine shows at least the root node of the state chart, while yours shows nothing, not even the root node. Which leads me to believe that the state chart isn't even in the tree in the first place or it is deleted really quickly before anything visible can be done. Is there maybe a way through your logic that would trigger a deletion of the whole thing right at the beginning? Maybe you could add a breakpoint/print statements to the lines of code that QueueFree things in your game to see if they are somehow hit?

BasHamer commented 3 months ago

It is not a race condition at the mob level. When this happens every mob has the same issue. not even every mob of the same scene type, every single one regardless of the scene code.

My only QueueFree is in the Animation Finished and requires the die animation to play. And that in turn is also tied to the state chart so it is impossible to die.

MortarBugView - Copy.tscn.txt MortarBugView - Copy.cs.txt

Is there code that runs at the registration of the addon level before the instantiations? because it looks like it never got to the code that you shared.

I do not have other Godot plugins; I reinstalled statechart 3/16 (unsure how to tell the version)

I have 2 3rd party packages

Godot 4.2.1

this is how the scene is instantiated

MortarBug = GD.Load<PackedScene>(Resources.MortarBugView);

else if (bug is MortarBugController mb) { var inst = MortarBug.Instantiate<global::MortarBugView>(); inst.Bug = mb; var l = MapManager.GetEmptySpaceNear(location); inst.Position = l; return inst.One(); }

and then called like this in the root scene.

var mb = RunVariables.EnemyFactory.Create(new DiscreteVector3(15, 0, 15), RunVariables.Container.GetInstance<LandTrains.scripts.Enemy.MortarBugController>()); AddChild(mb.First());

or it may spend some time on a queue before being added to the root scene.

going to see if I can reproduce it on an other machine and/or w/o visual studio.

BasHamer commented 3 months ago

btw happy to get you source code access, but I figure that I'd first reproduce it on an other machine.

BasHamer commented 3 months ago

I think this is the behavior when it shows up inside Godot

image

derkork commented 3 months ago

This last screenshot doesn't look healthy. Do you have by any chance a [GlobalClass] annotated class that might clash with one of the class names used by the addon (there are unfortunately no namespaces in GDScript)?

BasHamer commented 3 months ago

No hits on [GlobalClass] All hits for "State" are references to the Addon or the GameStateManager (c# class), the issue predates that class by weeks. All hits for "Transition" are in the addon

derkork commented 3 months ago

Hmm it's really hard to say. I have a feeling this is something very small and simple but without seeing the full project or a reproducer it's really hard to investigate this.

BasHamer commented 3 months ago

https://github.com/BasHamer/LAND-TRAINS changed it to public. There are no keys in there, if it gets keys I'll change it to private.

It is currently set in a test mode (on scene Main property TargetDummy = true ), don't click the red door :) that will crash it.

I'm in MST, happy to make time to get it up and running, just send me some time options.

derkork commented 3 months ago

So I have it checked out locally and after fixing some problems with the main script (the camera wasn't working correctly and I got a lot of exceptions) I finally got it running. Thing is, though I cannot even reproduce it locally with this one. I ran it 30 times in a row and it worked 30 times in a row. So this positively screams "race condition" - which are the hardest to debug. I just need to find out where. I'm digging a bit deeper into the codebase to find out more. Also which exact version of Godot are you using?

derkork commented 3 months ago

Hmm it's all pretty standard instantiate / add_node I can't really see any way how this could produce a race condition. The only thing that could potentially mess up things is the usage of System.Timer as its callbacks run on a different thread but these are only used for targeting and cleanup so I don't see how they could mess with the state charts. Did you ever get any error message/exception when this sort of problem happened?

BasHamer commented 3 months ago

Godot_v4.2.1-stable_mono_win64 and yeah it is not consistent. This morning I'm not having much luck w/ triggering it, it took 40 ish minutes, and 10 of that was regular dev time vs hitting reload..

sorry about the exceptions, I missed a lot of those as they went to console, working on getting the console log clean. I'm glad it wasn't too bad to get running (no idea on the camera issue but I'm sure I'll encounter it once I try and get a playable build to friends.)

'Godot_v4.2.1-stable_mono_win64.exe' (CoreCLR: clr_libhost): Loaded 'C:\Program Files\dotnet\shared\Microsoft.NETCore.App\7.0.5\System.ComponentModel.TypeConverter.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled. 'Godot_v4.2.1-stable_mono_win64.exe' (Win32): Loaded 'C:\Program Files\dotnet\shared\Microsoft.NETCore.App\7.0.5\System.ComponentModel.Primitives.dll'. 'Godot_v4.2.1-stable_mono_win64.exe' (CoreCLR: clr_libhost): Loaded 'C:\Program Files\dotnet\shared\Microsoft.NETCore.App\7.0.5\System.ComponentModel.Primitives.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled. res://addons/godot_state_charts/state_chart.gd:28 - Parse Error: Could not find type "State" in the current scope. res://addons/godot_state_charts/state_chart.gd:141 - Parse Error: Could not find type "Transition" in the current scope. res://addons/godot_state_charts/state_chart.gd:141 - Parse Error: Could not find type "State" in the current scope. res://addons/godot_state_charts/state_chart.gd:165 - Parse Error: Could not find type "Transition" in the current scope. res://addons/godot_state_charts/state_chart.gd:165 - Parse Error: Could not find type "State" in the current scope. res://addons/godot_state_charts/state_chart.gd:174 - Parse Error: Could not find type "Transition" in the current scope. res://addons/godot_state_charts/state_chart.gd:174 - Parse Error: Could not find type "State" in the current scope. res://addons/godot_state_charts/state_chart.gd:63 - Parse Error: Could not find type "State" in the current scope. res://addons/godot_state_charts/state_chart.gd:68 - Parse Error: Could not find type "State" in the current scope. res://addons/godot_state_charts/state_chart.gd:77 - Parse Error: Could not resolve member "_init". res://addons/godot_state_charts/state_chart.gd:197 - Parse Error: Could not find type "State" in the current scope. modules/gdscript/gdscript.cpp:2788 - Failed to load script "res://addons/godot_state_charts/state_chart.gd" with error "Parse error". (User) res://addons/godot_state_charts/compound_state.gd:6 - Parse Error: Could not find base class "State". modules/gdscript/gdscript.cpp:2788 - Failed to load script "res://addons/godot_state_charts/compound_state.gd" with error "Parse error". (User) res://addons/godot_state_charts/atomic_state.gd:5 - Parse Error: Could not find base class "State". modules/gdscript/gdscript.cpp:2788 - Failed to load script "res://addons/godot_state_charts/atomic_state.gd" with error "Parse error". (User) res://addons/godot_state_charts/transition.gd:13 - Parse Error: Invalid argument 1 of annotation "@export_node_path": The class "State" was not found in the global scope. res://addons/godot_state_charts/transition.gd:27 - Parse Error: Could not find type "Guard" in the current scope. res://addons/godot_state_charts/transition.gd:27 - Parse Error: Cannot use simple "@export" annotation because the type of the initialized value can't be inferred. res://addons/godot_state_charts/transition.gd:61 - Parse Error: Could not find type "State" in the current scope. res://addons/godot_state_charts/transition.gd:53 - Parse Error: Could not find type "State" in the current scope. res://addons/godot_state_charts/transition.gd:66 - Parse Error: Could not find type "State" in the current scope. res://addons/godot_state_charts/transition.gd:82 - Parse Error: Could not find type "State" in the current scope. modules/gdscript/gdscript.cpp:2788 - Failed to load script "res://addons/godot_state_charts/transition.gd" with error "Parse error". (User) res://addons/godot_state_charts/utilities/state_chart_debugger.gd:42 - Parse Error: Could not find type "StateChart" in the current scope. res://addons/godot_state_charts/utilities/state_chart_debugger.gd:46 - Parse Error: Could not find type "State" in the current scope. res://addons/godot_state_charts/utilities/state_chart_debugger.gd:169 - Parse Error: Could not find type "State" in the current scope. res://addons/godot_state_charts/utilities/state_chart_debugger.gd:234 - Parse Error: Could not find type "Transition" in the current scope. res://addons/godot_state_charts/utilities/state_chart_debugger.gd:234 - Parse Error: Could not find type "State" in the current scope. res://addons/godot_state_charts/utilities/state_chart_debugger.gd:248 - Parse Error: Could not find type "State" in the current scope. res://addons/godot_state_charts/utilities/state_chart_debugger.gd:255 - Parse Error: Could not find type "State" in the current scope. res://addons/godot_state_charts/utilities/state_chart_debugger.gd:115 - Parse Error: Could not find type "StateChart" in the current scope. res://addons/godot_state_charts/utilities/state_chart_debugger.gd:165 - Parse Error: Could not find type "State" in the current scope. res://addons/godot_state_charts/utilities/state_chart_debugger.gd:176 - Parse Error: Could not find type "State" in the current scope. res://addons/godot_state_charts/utilities/state_chart_debugger.gd:178 - Parse Error: Could not find type "Transition" in the current scope. res://addons/godot_state_charts/utilities/state_chart_debugger.gd:218 - Parse Error: Could not find type "State" in the current scope. modules/gdscript/gdscript.cpp:2788 - Failed to load script "res://addons/godot_state_charts/utilities/state_chart_debugger.gd" with error "Parse error". (User) 'Godot_v4.2.1-stable_mono_win64.exe' (CoreCLR: clr_libhost): Loaded 'FastExpressionCompiler'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled. 'Godot_v4.2.1-stable_mono_win64.exe' (Win32): Loaded 'C:\Program Files\dotnet\shared\Microsoft.NETCore.App\7.0.5\System.Reflection.Emit.ILGeneration.dll'.

BasHamer commented 3 months ago

failed load.txt good load.txt

a few notes; the lines that say "Module is optimized and the debugger option 'Just My Code' is enabled." this is a C# exception setting that is supposed to show only exceptions I can debug, but in 3rd party libraries. It is a visual studio thing and also not working quite right with Godot. Xsplit is recording software. Overwolf is an addon for a game that I was surprised to see in here and is getting kicked off my system now.

I can find no info on

Exception thrown at 0x00007FF8665A5B0C in Godot_v4.2.1-stable_mono_win64.exe: Microsoft C++ exception: CommonUtil::CHResultExceptionImpl at memory location 0x0000009FDEDFA070.

derkork commented 3 months ago

The error output doesn't really make sense. It says it cannot find the State class but indicates no error in state.gd itself. The rest are just errors based on the missing State classs. I have seen spurious errors like this in Godot before (also outside of state charts) (.e.g https://github.com/godotengine/godot/issues/78642, https://github.com/godotengine/godot/issues/82392). What usually helps is deleting the .godot folder. This means however that you need to reimport the project. I tried to reproduce this for an hour now, still no luck. Maybe it's because I have a fresh .godot folder.

BasHamer commented 3 months ago

ok thanks for all your help; I'll be back if I can find more information. I created an export and it ran.

I deleted the folder and will look after work. It is likely that I ended up with weird stuffs as this is the first Godot project I have created. I'll report back if it happens again or if it disappeared.

BasHamer commented 3 months ago

24 hours w/o the issue, tempting fate by writing this intentionally.

BasHamer commented 3 months ago

yeah still happening :/ going to try and see if my laptop can reproduce it. if not then I'm going to ignore it until I get bug reports from users.

** reproduced it on my laptop (also reproduced the camera issue you mentioned). Still going to ignore it as it is quick to catch and I don't know if it will happen in the exports.

Thanks again for all the help

derkork commented 3 months ago

Thanks for reporting back! Well that still isn't nice but given the error messages we've seen and the fact that it can be fixed by cleaning the .godot folder I think it's an issue with the engine itself. There is not really something I can do if the engine refuses to load my scripts in the first place. I wish we could somehow successfully reproduce this but I found an issue in the Godot issue tracker that describes the behaviour we have seen and also seems to be fixable by deleting the .godot folder: https://github.com/godotengine/godot/issues/83561