godotengine / godot-cpp

C++ bindings for the Godot script API
MIT License
1.74k stars 576 forks source link

Incorrect _Wrapped::_owner initialization #475

Open o01eg opened 3 years ago

o01eg commented 3 years ago

I have a GDNative class extending Translation. When I call _new I get pointer to it where _owner field initialized to incorrect memory address:

    GodotI18n *i18n = GodotI18n::_new();
(gdb) p i18n
$17 = (godot::GodotI18n *) 0xe4414b0
(gdb) p i18n->_owner
$18 = (godot_object *) 0xe441

and next it fails in reference counter

Thread 1 "godot.x11.tools" received signal SIGSEGV, Segmentation fault.
0x0000000003a0f398 in atomic_conditional_increment<unsigned int> (pw=0xe539) at ./core/safe_refcount.h:107
107         T tmp = static_cast<T const volatile &>(*pw);
(gdb) bt
#0  0x0000000003a0f398 in atomic_conditional_increment<unsigned int> (pw=0xe539) at ./core/safe_refcount.h:107
#1  SafeRefCount::refval (this=0xe539) at ./core/safe_refcount.h:187
#2  Reference::reference (this=0xe441) at core/reference.cpp:63
#3  0x000000000191c360 in MethodBind0R<bool>::ptrcall (this=0x5663570, p_object=0xe441, p_args=0x7fffffffc3b0, r_ret=0x7fffffffc3af) at ./core/method_bind.gen.inc:244
#4  0x0000000001942d2b in godot_method_bind_ptrcall (p_method_bind=0x5663570, p_instance=0xe441, p_args=0x7fffffffc3b0, p_ret=0x7fffffffc3af) at modules/gdnative/gdnative/gdnative.cpp:69
#5  0x00007fffe2dfb2e8 in godot::___godot_icall_bool (inst=0xe4414b0, mb=<optimized out>) at /mnt/another/srcs/GIT/freeorion/godot/godot-cpp/include/gen/__icalls.hpp:7378
#6  godot::Reference::reference (this=this@entry=0xe4414b0) at /mnt/another/srcs/GIT/freeorion/godot/godot-cpp/src/gen/Reference.cpp:34
#7  0x00007fffe2dd422d in godot::Ref<godot::Translation>::ref (p_from=..., this=0x7fffffffc4a0) at /mnt/another/srcs/GIT/freeorion/godot/godot-cpp/include/core/Ref.hpp:123
#8  godot::Ref<godot::Translation>::Ref (p_from=..., this=0x7fffffffc4a0) at /mnt/another/srcs/GIT/freeorion/godot/godot-cpp/include/core/Ref.hpp:126
#9  godot::GDFreeOrion::_init (this=this@entry=0xecc5050) at /mnt/another/srcs/GIT/freeorion/godot/gdfreeorion.cpp:200
#10 0x00007fffe2dd177b in godot::_godot_class_instance_func<godot::GDFreeOrion> (p=0xdf60a30, method_data=<optimized out>) at /usr/lib/gcc/x86_64-pc-linux-gnu/9.3.0/include/g++-v9/typeinfo:143
#11 0x0000000001947826 in NativeScript::instance_create (this=0xe4c08f0, p_this=0xdf60a30) at modules/gdnative/nativescript/nativescript.cpp:220
#12 0x00000000039e5794 in Object::set_script (this=0xdf60a30, p_script=...) at core/object.cpp:1030
#13 0x00000000039e258b in Object::set (this=0xdf60a30, p_name=..., p_value=..., r_valid=0x7fffffffc9e8) at core/object.cpp:433
#14 0x0000000003386c77 in SceneState::instance (this=0xeb645d0, p_edit_state=SceneState::GEN_EDIT_STATE_DISABLED) at scene/resources/packed_scene.cpp:212
#15 0x0000000003391e5b in PackedScene::instance (this=0xe25f370, p_edit_state=PackedScene::GEN_EDIT_STATE_DISABLED) at scene/resources/packed_scene.cpp:1698
#16 0x000000000145c135 in Main::start () at main/main.cpp:1942
#17 0x000000000141a4d0 in main (argc=1, argv=0x7fffffffd5c8) at platform/x11/godot_x11.cpp:55

OS: Gentoo Linux AMD64 Version: 3.2

Zylann commented 3 years ago

(gdb) p i18n $17 = (godot::GodotI18n ) 0xe4414b0 (gdb) p i18n->_owner $18 = (godot_object ) 0xe441

In your snippet, i18n and i18n->_owner are two different things, so it's normal they print different adresses. Although _owner's adress is suspiciously short.

I don't know about the crash though. It should have been fine. Godot's recent version broke something about references maybe?

Since Translation is a Reference, did you try this instead?

Ref<GodotI18n> i18n;
i18n.instance();
o01eg commented 3 years ago

In your snippet, i18n and i18n->_owner are two different things, so it's normal they print different adresses. Although _owner's adress is suspiciously short.

Yes, it looks like somehow get (i18n >> 32) when jumping from GDNative to Godot, then fails with segfault.

instance crashes also with same backtrace.

Zylann commented 3 years ago

What is the full code producing the crash? Is it enough to just instance and not call any methods on it?

Version: 3.2

Which precise Godot version?

Which version of the bindings are you using?

I wasn't able to reproduce it image (after the end of _ready it just goes on drawing a green circle and chillin' until I close the window

o01eg commented 3 years ago

3.2 is a branch of godot-cpp and Godot itself is 3.2.3.

Could it be reproduced with _new method?

Zylann commented 3 years ago

instance() just uses _new() too so in fact there should be no difference during creation.

3.2 is fairly old though (11 months), there was a spree of fixes since. Try with master, there is a big chance your problem went away. Note, master should still be fine to use with Godot 3.2.3.

colugomusic commented 3 years ago

I am able to produce this exact crash with a debug build of Godot 3.2 and godot-cpp master. The concerning thing is that my project contains three separate GDNative plugins, all built the same way, all with classes written the same way (inheriting from godot::Reference), all being instantiated in the same way (godot::Ref<T>(T::_new()))

One of the plugins crashes and the other two do not so I have no idea what's going on.

The actual problem appears to occur in method_bind.gen.inc in the call to ptrcall.

#ifdef PTRCALL_ENABLED
    virtual void ptrcall(Object*p_object,const void** p_args,void *r_ret) {

        T *instance=Object::cast_to<T>(p_object);
        PtrToArg<R>::encode(  (instance->*method)() ,r_ret) ;
    }
#endif

Object::cast_to returns nullptr here because the dynamic cast fails for some reason.

When I try with Godot 3.2.3 stable it doesn't crash but it still doesn't work. T::_new() appears to produce a valid object but then Ref::ref() returns nullptr.

Zylann commented 3 years ago

Are your 3 different libraries interacting with each other? The snippet you posted comes from Godot itself. Sure the cast returns nullptr, but was p_object nullptr in the first place? And what was it trying to cast it to? Was the debugger indicating a compatible type? Or garbage?

You could be having that crash for a different reason though. You might need to report a different issue and describe reproduction steps, or better, a full reproduction project, because the only reproduction steps I have don't produce any crash. The original report of the present issue can't be reproduced so I think this one was fixed in the recent months.

colugomusic commented 3 years ago

'p_object' is not 'nullptr'. The stack trace is exactly the same as the one o01eg posted so it seemed unlikely to be an unrelated crash, but you might be right. If the heap is getting corrupted due to some other bug then the stack trace could be misleading.

colugomusic commented 3 years ago

Okay found the issue. My class's _init() method instantiates a PackedScene and calls set() on it to set a property that does not actually exist. Godot does not produce any error or indication that anything went wrong but I don't know if it is handling the situation very gracefully. Removing the call to set(), or simply adding the missing property, resolves the later crash in Ref::ref(). Might be worth investigating what is actually happening in set() when the property doesn't exist since the stack trace is so misleading.

If I have time then I will try to create a reproduction project for this

Zylann commented 3 years ago

In that case maybe you'd have to test this in GDScript, it doesnt appear to be related to GDNative then. Maybe one last detail could be about when you actually build that PackedScene, but I can't think of one case yet.

colugomusic commented 3 years ago

I'm not really sure what the equivalent GDScript would look like, but this is not producing any crash or error:

res://test.gd:

extends Reference

func _init():
    var scene:PackedScene = load("res://test.tscn")

    var inst = scene.instance()

    inst.set("blahblahblah", 1)
var test = load("res://test.gd")

test.new()

I don't understand your question about building the PackedScene. It's just being loaded using ResourceLoader::get_singleton()->load(...)

Zylann commented 3 years ago

The question is when all of this runs, from the start. From _ready? from the registration callback of your library? From global space?

In GDScript the equivalent is as you wrote. However I thought you were talking about the set on PackedScene, instead here you're doing it on the instance produced by scene.instance(), and that instance could be any kind of node with or without a script on it, so there is that... Still hard to pinpoint without a full reproduction project.

colugomusic commented 3 years ago

Reproduction project:

bug.zip

Consistently crashes for me. I was going to create a new issue but I don't know which repository it should be under

Zylann commented 3 years ago

Your BugDemo extends Reference:

class BugDemo : public Reference

Yet you put it on a node, which is a completely unrelated class. Honestly I wonder how Godot allowed you to do this, but this is already an invalid situation.


Also, I see you used set with this:

        node->set("a_property_that_does_not_exist", this);

I'm not sure if this will behave well. Ideally I would expect the conversion to Variant to properly grab a reference to the Thing, and then be discarded somewhere in Godot since the property does not exist.

However here is the thing: since you do this in the constructor by passing a naked pointer, the refcount will be initialized by this set then, because set goes throught Variant initialization. And because Godot might not keep it due to the property not existing, it will see it has only one ref, and... destroy it. Your Thing will then be dead right after being constructed. On the other hand, if that property existed, the reference would have been kept intact.

Without having run anything yet, I suspect this is the problem, and I don't think there is a fix for this, cuz that's just how references work. I could be wrong though.


        // This line might crash or might not. I think it depends on if you're running a debug build of Godot.
        Ref<Thing> ref(object);

I've been testing this before and it never crashed. Maybe Godot misbehaved during the process of building Thing?


        // Should always crash because ref will be null
        ref->get_class();

It shouldn't though, since the type is expected to match. Also obj->method() won't always crash in C++, it will most often crash if method is accessing this. Behavior varies depending on compiler.


Might matter when you export your game, in BugDemo.gdns:

[ext_resource path="res:///bug/bin/bug.gdnlib" type="GDNativeLibrary" id=1]

You have 3 / in the path. This will likely break when loaded from an exported PCK. There should be only 2 / after res.

Still looking into it.

colugomusic commented 3 years ago

Your BugDemo extends Reference:

class BugDemo : public Reference

Yet you put it on a node, which is a completely unrelated class. Honestly I wonder how Godot allowed you to do this, but this is already an invalid situation.

Woops this just a typo in the demo. In my original project I am extending Node.

Also, I see you used set with this:

      node->set("a_property_that_does_not_exist", this);

I'm not sure if this will behave well. Ideally I would expect the conversion to Variant to properly grab a reference to the Thing, and then be discarded somewhere in Godot since the property does not exist.

Passing something other than this does stop the crash so this seems like the issue. If the property does exist then passing this seems to work fine.

Zylann commented 3 years ago

I expanded my reasoning about that part in the message above. I didnt run anything yet, but I think this is your issue.

You might be able to reproduce it like that:

    void _init()
    {
        Ref<Thing> test(this);
    }
colugomusic commented 3 years ago

I expanded my reasoning about that part in the message above. I didnt run anything yet, but I think this is your issue.

Thank you your reasoning does make sense. I am surprised that Godot does not produce an error message in the case of a non-existent property being set.

You might be able to reproduce it like that:

  void _init()
  {
      Ref<Thing> test(this);
  }

Yes this reproduces the crash.

Zylann commented 3 years ago

FYI, I confirm this even reproduces directly in Godot's code: image

Passing a naked pointer to ref to a Godot function which doesn't take ownership of it is a no-go if you didn't take ownership yourself. And you can't do that from a C++ constructor because it would always prevent refcount initialization to properly finish before something else takes ownership. Even if Godot raised an error about the set being invalid, it would carry on as usual and your crash would still happen. The result is a dangling pointer to a deleted object. Depending on your debugger the contents of the object might be filled with a pattern like 0xfefefefefefefe or 0xbaadf00d.

In GDScript this doesn't happen because this is what it gets inerpreted into instead:

image

Observe the difference of order in the printouts. Due to how GDScript works, when _init is called, something else already holds an initialized reference count. While in C++ this can't happen without moving the set logic outside the constructor.

If you desperately need it, you might be able to do this horror:

    void _init()
    {
        // Initialize refcount
        Ref<Thing> test(this);
        // Add an artificial reference
        this->reference();

        // Pass it to whatever Godot things...
    }

    // and then somewhere outside:
    // remove the artificial reference
    thing->unreference()

Otherwise the behavior is as intented.

colugomusic commented 3 years ago

I understand thanks a lot for the help. This process of initialization is a useful idiom in my situation but if I was writing it in pure C++ then the delicacy of passing this from the constructor would have been more obvious.