godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License
89.24k stars 20.23k forks source link

Proposal for changes in GDNative for Godot 4.0 #35467

Closed touilleMan closed 1 year ago

touilleMan commented 4 years ago

While working on Godot-Python, I've encountered my share of quirks in the GDNative API, I guess it's time to fix them for Godot 4.0 ;-)

Issues in api.json:

Issues in gdnative api:

Issues in ptrcall:

That's a big wishlist I guess ! I guess I've already found what I'll be working on at next week's GodotCon sprints ^^

touilleMan commented 4 years ago

related issues:

touilleMan commented 4 years ago

another related PR #35795 (Fix return type for GDNative's godot_color_to_XXX64)

karroffel commented 4 years ago
raniejade commented 4 years ago

Push more code about POD core types (such as Vector2/3, Color, Basis) into bindings. It seems like most bindings do not use many other functions other than the "construct" and "deconstruct" functions. For example the Rust bindings receive those values, then convert them to types from the euclid crate and then do all operations on that. So all "accessor" functions are unused. Similar for the C++ bindings.

Those types would be defined as actual structs with proper members, for example

struct godot_vector3 {
    float x;
    float y;
    float z;
};

Functions should never return those complex values directly but always by pointer. Languages like Haskell have troubles with functions returning unboxed data.

Wouldn't this create API disparity between the native bindings and GDScript (core godot)? Going this route means will just create duplicate code across the bindings. Having a consistent API in terms of the interface and behaviour is pretty advantageous, specially for users not wanting to use GDScript. That's why my binding (https://github.com/raniejade/godot-kotlin) does not re-implement any of the core types, it just delegates everything to Godot. Users can always refer to Godot's official documentation regardless of what binding they are using.

raniejade commented 4 years ago

Pulling this in from the godot_headers repo. https://github.com/GodotNativeTools/godot_headers/issues/65

karroffel commented 4 years ago

Yes, but generally there are idiomatic ways to handle linear algebra types in most languages. So it might be better to use common libraries that will work well with other libraries for example, which is what Rust does.

When new functions are added to those types the GDNative API has to change. It takes some care to not break binary compatibility and also makes it harder for bindings to just correctly add new functionality.

Also by going through the C API you have some unavoidable overhead, as no LTO or cross-language inlining works with runtime provided function pointers. For things that should be as easy as a simple add or mov you now need to convert arguments and return values from a C API.

Of course, the behavior being the same as with Godot is useful, but this is not a technical problem but an ecosystem/idiomatic one. We could keep those functions, but I think in environments that already have more-or-less common ways to do those operations going through a C API with custom types is not very nice.

karroffel commented 4 years ago
raniejade commented 4 years ago

I totally agree with the technical side of things, however, the ecosystem aspect is something that should not be dismissed as it could hurt the adoption of Godot. Learning resources available about Godot is quite sparse (relative to Unity and Unreal), transferability of those resources to other bindings is useful for people who prefers not to use GDScript (for various reasons).

raniejade commented 4 years ago
ghost commented 4 years ago

39588 is another case of return value initialization inconsistency. In this case there is no way for API consumers to handle methods that return godot_array consistently without leaking memory (initializing pointers even when they're overwritten).

Zylann commented 4 years ago

In NativeScript:

@karroffel

make all object references be ObjectIDs and not raw pointers. This includes all calls to ptrcall or varcall

Why? Is it for safety? Wouldn't this incur a performance overhead? I have a library which uses set_pixel and get_pixel a lot on images. If I chose to use C++, I don't expect to have such indirection baked in every call.

Push more code about POD core types (such as Vector2/3, Color, Basis) into bindings.

I've seen that. In C++ I find there is no point to use method pointers to do linear algebra, other than having the exact same behavior, but the overhead becomes quite stupid. If the Godot math library can be copied around more easily, it's easier. For C++ only though, other languages may have to rewrite their own, or use other math libraries as long as they allow the same conventions as Godot

karroffel commented 4 years ago

Why? Is it for safety? Wouldn't this incur a performance overhead?

Yes and yes. In the Rust bindings we want to model the API is correctly as possible while making it as safe as possible too. One problem with that are non-reference counted types. If you know you hold a reference to an object (and the rest of the world doesn't just run around doing .unreference()) then you know it's going to be okay to call methods on an object that's reference counted.

For Rust that means that it's considered safe to call methods on reference counted objects. With non-reference counted objects anybody could .free() at any point from a different thread and then the object might still be valid while the Rust bindings are doing the call, but when it enters Godot (or anywhere in between really) it might suddenly be freed. At that point this call will ideally cause a crash or possibly even worse things when an ABA problem occurs.

One way to get rid of this is to not use pointers directly but use IDs. That way the engine could make sure that the object is safe to call even when it passed the language-binding border and in an atomic way. Even if Godot just says "This call was invalid" it's still better than the current situation where things can just go horribly wrong.

touilleMan commented 4 years ago

make all object references be ObjectIDs and not raw pointers. This includes all calls to ptrcall or varcall

@karroffel Have you created an issue about this ? I feel like this proposition is far from trivial so better have a dedicated consultation about it ;-)

I guess this change would impact the whole engine (and not only gdnative) given it can benefit from anywhere (especially GDscript !), so I'm curious about the performance impact.

On top of that I wonder if leaving the old raw pointer stuff for GDNative as an alternative would be worth it (it should be fairly simple given all the code is already written, and it would be up to the GDNative module creator to bear the responsibility of the segfaults that could occur ^^), however having one way to do thing and keeping people avoid from their overconfidence of dodging segfault is a good thing ! So I guess it all really depend on the impact on performances.

One final though: this check-on-access-before-use behavior seems to be the exact opposite of the refcounting check-on-unref-if-delete-is-needed behavior. So wouldn't it be just simpler (and faster given unreference occurs less often than regular access) to make all object in Godot refcounted ?

karroffel commented 4 years ago

@touilleMan

Have you created an issue about this ? I feel like this proposition is far from trivial so better have a dedicated consultation about it ;-)

It is indeed a non trivial discussion, but I have not created an issue about it yet, but maybe that's a good idea. This idea of using Object IDs for the current ptrcalls was something proposed by @reduz on IRC a while ago.

So wouldn't it be just simpler [..] to make all object in Godot refcounted?

That would be my preferred solution too, but @reduz expressed that this is not something he wants to do. Maybe a dedicated issue for the safety of GDNative calls might be better suited for this discussion?

aaronfranke commented 4 years ago
  • ((1, 0), (0, 1), (0, 0)) (should be Vector3((1, 0), (0, 1), (0, 0)))

No, each element of a Vector3 is a floating-point number, not two of them, and you are missing type information for the inner items. I think you meant it should be Transform2D(Vector2(1, 0), Vector2(0, 1), Vector2(0, 0))

Zylann commented 4 years ago
hnOsmium0001 commented 4 years ago

get_script() returns Reference* instead of something like Script* or Ref<Script>. Is this intended? And it seems like the returned Reference* is either a proper Script* or nullptr, so it really should return Script* in my opinion.

Zylann commented 4 years ago

@hnOsmium0001 from my experience of this oddity, it seems to be the case because if it was wrapped in Ref<> it woud cause a cyclic reference. The declaration of Object would depend on Reference, but that can't happen because Reference depends on Object. It's the case in Godot's source code as well. C++ bindings also failed to compile at some point because of this, which forced us to make sure Object::get_script() returns Reference* instead. It's not really related to GDNative itself, though.

At least, in the GDNative API, maybe it could return Script.

hnOsmium0001 commented 4 years ago

~~There is also godotengine/godot-cpp#431, which should be a really simple fix of changing https://github.com/godotengine/godot-cpp/blob/master/include/core/Godot.hpp#L469 to~~

void register_signal(String name, Dictionary args)

Sorry I didn't see this issue is only for GDNative.

Zylann commented 4 years ago

@hnOsmium0001 if issues are only related to the C++ bindings, please keep them in godot_cpp instead, not here. This issue is about the GDNative API itself.

ghost commented 3 years ago

A version of ptrcall that allows omission of optional arguments

This should help improve the compatibility between compiled binaries and future minor/patch versions of the engine, since new optional arguments can get added in minor/patch versions. Currently, it's UB to ptrcall a method that has been changed in this way, since the engine assumes that the full argument list has been provided and would dereference invalid memory.

Allow NativeScript constructors to take arguments

I'm not sure why they weren't passed to the library in the first place, but this is a good chance to change that.

Zylann commented 3 years ago

Allow NativeScript constructors to take arguments

It's possible to create a new instance from a NativeScript by specifying arguments, but the problem is, the current method does not allow us to get a direct pointer and causes an issue with refcounting. I mentionned it in second point here https://github.com/godotengine/godot/issues/35467#issuecomment-674431018 . Next point also expands on other related issues. I don't know about the other way around though (i.e registering a class with constructor arguments)

ghost commented 3 years ago

It's possible to create a new instance from a NativeScript by specifying arguments ... I don't know about the other way around though (i.e registering a class with constructor arguments)

While you can pass arguments to NativeScript::_new, it can't actually do anything with them, since the current signature of godot_nativescript_instance_create_func::create_func doesn't take any arguments beyond a pointer to the owner object and a pointer to bindings-specific data. I was suggesting that the signature of create_func be changed so the argument list can actually be passed to the constructors.

This is because the current way initializes refcount, which makes it hard to support consistent Ref in the C++ bindings, so we had to use set_script instead,

Interesting, because we have recently switched from set_script to NativeScript::new in Rust (since the latter works when called from tool scripts), but no leaks or use-after-frees were noticed. Perhaps this could be an indication of some issue in godot-cpp regarding Variant handling in general instead?

Zylann commented 3 years ago

Interesting, because we have recently switched from set_script to NativeScript::new in Rust (since the latter works when called from tool scripts), but no leaks or use-after-frees were noticed. Perhaps this could be an indication of some issue in godot-cpp regarding Variant handling in general instead?

That's strange because from all the research we did, the problem really was the fact Godot initializes the refcount on its side because it returns us a Variant. See https://github.com/godotengine/godot-cpp/blob/c9a740be34438ce999402b7b76304a38daaaa811/include/core/Godot.hpp#L39 (there is Variant in the code here but that's because new is not even a GDNative function, it's the Godot script API returning us that, not a choice from us) Maybe Rust exposes that differently than we do. Could try to investigate again but that's what I remember. The idea was then to have a GDNative-specific function that gives us the pointer directly so we can just use a verbatim copy of the behavior found in Godot, without the need to somehow differenciate between various origins of that pointer.

touilleMan commented 3 years ago

A version of ptrcall that allows omission of optional arguments

This should help improve the compatibility between compiled binaries and future minor/patch versions of the engine, since new optional arguments can get added in minor/patch versions.

@toasteater that's an interesting idea

if we have godot_foo(int a) in Godot 4.0, then godot_foo(int a, int b) in Godot 4.1

I personally think the key point here is to provide a consistent and clear rule on GDNative compatibility (GDNative and Godot version must be the same ? or we allow compatibility between patch versions ? or minor ?).

My current guess is the simplest thing to do is to guarantee compatibility only between patch versions (so Godot 4.1.x is compatible with GDNative 4.1.y no matter x and y). This allow simple version check for plugins during initialization (we could even add a mandatory version field in the .gdnlib so that Godot does the compatibility check before loading the plugin). Of course this means no changes in the api (including everything that's exposed by ClassDB) between patch versions. We could enforce this in CI by comparing the generated api.json with a committed version.

ghost commented 3 years ago

however what about using GDNative header 4.1 with Godot 4.0 ? godot_foo would be called with more args than expected !

@touilleMan While this is a plausible scenario, I believe it's far more common to expect backwards compatibility than forward compatibility. This also appears to be the sentiment behind this related proposal regarding editor plugins: https://github.com/godotengine/godot-proposals/issues/1613. Also, calling godot_foo with more args would, at worst, leads to memory leaks and less-than-ideal (but defined and expected from the older engine version) behavior, which I believe is a much less serious problem compared to what would happen when 4.0 headers are used with 4.1.

...given it's going to be really common when a new Godot version is released to use plugin build with older version given the upstream maintainer hasn't released new builds yet

I'm not sure if I understood what you mean here correctly. Shouldn't this lead to the "4.0 headers with 4.1, less args" situation, if I'm not mistaken?

We could enforce this in CI by comparing the generated api.json with a committed version.

This would be really great for versioning purposes, even outside GDNative.


That's strange because from all the research we did, the problem really was the fact Godot initializes the refcount on its side when boxing it into Variant, really. Its not something the bindings do AFAIK

@Zylann Perhaps you can make create_custom_class_instance return either T* or Ref<T> depending on whether the base class is reference counted with templates? Then you can deal with reference count initialization properly in the function.

Zylann commented 3 years ago

@Zylann Perhaps you can make create_custom_class_instance return either T* or Ref depending on whether the base class is reference counted with templates?

That's one of the possible outcomes but it's the worst... I don't even know if I can do that, because then I need to duplicate the entire logic just to return something different, and would make code depending on it even more complicated (especially more with arguments!) because of the variation of return type. Would require the user to have different header declarations if inheriting Reference or not, would break the Ref<T>.instance() logic as well... new even requires the return type to be a pointer AFAIK so it's inconsistent. And all this because of a behavior that we don't need (the Variant step), and solves the problem if it was just straight up not there.

ghost commented 3 years ago

Sorry. Didn't mean to start an argument here. To be clear, I'm not against adding the new API: it would make our lives much better as well. I was merely suggesting the possibility of an issue in godot-cpp, but since I'm not as familiar with the inner workings and/or goals of the godot-cpp project, I could of course be very wrong. In any case, I meant no offense and feel sorry for any that I might have caused. If there is a good reason for the user-facing API to remain as-is, then I agree that it is hard to work around, and I'm not the one to judge whether that goal is a good one to pursue.

Zylann commented 3 years ago

Sorry if I sounded harsh, it wasnt the intention^^ your solution makes sense but as you said it would require significant changes in the way we expose the API to users, which we wanted to be unified and faithful to how it is when you write an engine module (which allows to easily mirror logic as well, since C++ is the language of the engine).

touilleMan commented 3 years ago

...given it's going to be really common when a new Godot version is released to use plugin build with older version given the upstream maintainer hasn't released new builds yet

I'm not sure if I understood what you mean here correctly. Shouldn't this lead to the "4.0 headers with 4.1, less args" situation, if I'm not mistaken?

@toasteater you're right, I messed things up ^^ So as you said the "regular" use case would be 1) people are using Godot x with a plugin using GDNative x 2) Godot x+1 is released, people start using it day one 3) plugin is updated by it maintainer to use GDNative x+1

So between 2) and 3) we get users using Godot x+1 with GDNative x However after 3) there is people still relying on Godot x (because they don't want to change version in the middle of a game dev, or because there game is a fork of Godot ?) but will update (or just start using) the plugin. I have no idea how common this thing is, however I'm pretty sure we need to address this issue in a clean way otherwise people will end up with hidden&nasty bug:

Consider the (hypothetical) String.split function working like "foo bar".split() == ["foo", "bar"], in next release we provide another parameter that allow to choose what is the splitting character (eg. "foo-bar".split("-") == ["foo", "bar"]) Now if we use "foo-bar".split("-") on an older version of Godot and the "-" param gets simply ignored (so we end up with "foo-bar".split("-") == ["foo-bar"] !), we have recipe for disaster ^^ I think it would be far better to introduce a mechanism to prevent the plugin from being loaded in the first place.

ghost commented 3 years ago

...I think it would be far better to introduce a mechanism to prevent the plugin from being loaded in the first place.

@touilleMan Then I think semver specs as was suggested in https://github.com/godotengine/godot-proposals/issues/1613 is a great option for that: it covers the "regular" use case well, and would also allow libraries to require a minimum minor/patch version of the engine that supports the extra features required. Then, it's possible to prevent the libraries from being loaded by older versions and warn the user, preventing the "ignored argument" situation from happening.

For example, suppose that the splitting character argument is added to the hypothetical String.split in Godot A.3.0. A plugin that makes use of the argument may set its godot_version spec to ^A.3.1, which would mean that:

In this case, no engine version that may not have the additional parameter may load the plugin in question.

It should also be possible to emit an error message when extra arguments are found (and ignored), so even in the case a plugin mis-specified its engine version requirement, the issue can at least be signaled to the end user (game developer).

Ansraer commented 3 years ago

Here is everything that was proposed and discussed during the meeting on 22.10.2020. Cleaned things up a bit and formatted it properly.

Changes to GDNative itself

Improvements to GDNative C++

Moving stuff to GDNative

Changes/Additions to the Editor

Feel free to ping me on discord if you feel I missed something or want me to edit this list. (My spam filter sometimes deletes GitHub notifications)

Zylann commented 3 years ago

I was just thinking of a use case that might be hairy to handle right:

Currently, I have a module that allows to subclass a "Generator" class to define a custom scripted one. One of the functions is called very often from inside threads. It passes a custom "VoxelBuffer" instance which the user fills with voxel data (it's a custom data structure so it doesn't use Godot's primitives). Currently this is easy to do if you use GDScript, C#, or C++ or Rust through GDNative. This is a very CPU-intensive task so performance is key.

But now imagine this module is ported as a C++ GDNative library. First, if you use GDScript following OOP principles, you would have to inherit the native class. I never tried that before. With C#, this gets harder. As mentionned before, without low-overhead C# API for GDNative libs, it won't perform well. But if you want to use C++, the same language as the GDNative lib? How would this work? How would your game's C++ code get the C++ representation of a class located in another GDNative lib? Would it have to go all the way around C++ => Godot => C++ just to call a native function of the same language? It could end up being done without using Godot at all, and writing a dynamic plugin directly using C++, which is very constraining. Or ending up as not compiling any library, and forking it instead. And what if you want to use Rust? With a module, all this is trivial. But as a lib, it looks like it would suffer both from API layers and cross-language difficulties (on setup, and on runtime perfs). It sounds like it would be so dumbed down that making games and plugins in C# would just be faster and easier. Besides I wonder how would this be setup to work reliably.

I wonder how UE4 deals with this. So far I suppose if you use C++ only, it could have all game code + plugins in source forms, and they get compiled into one single library, which then gets loaded by the engine. But for C# or Rust I have no idea.

ghost commented 3 years ago

Personally, I see this encapsulation as something usually positive, since it decouples the consumer code from the library implementation . With Godot acting as a mediator, it becomes a lot easier for library consumers to not care about what is used to write the underlying code. Nevertheless, I agree that there are cases where this can affect performance as Zylann has said.

It should be possible even today, though, to just use a C struct as the userdata, and publish the headers. You'll have to bypass the C++ bindings to do this, but I see that more as a limitation in the bindings rather than the GDNative API itself. The engine can, in the end, do no better than that, as it can also only get access to function pointers at runtime: by gaining the convenience of not having to recompile the engine, you have to trade in the ability to let the engine know about the API at compile time.

As a tangent, I wonder if the ability to operate on script instances directly can be expanded beyond just the userdata pointer. Right now, the code for creating a new instance of a NativeScript from a library is quite complicated since we have to go through a NativeScript resource, and there isn't an obvious way to move an existing NativeScript instance onto a new or existing object, without having a temporary one built first though the default constructor. Having an option to set_script and provide a userdata pointer in the same operation would be pretty nice.

vnen commented 2 years ago

Given we won't have GDNative since it's replaced by extensions, some of these aren't really relevant anymore. We need to recheck this list to see if something is still missing.

YuriSizov commented 1 year ago

I suggest opening new issues and dedicated proposals for problems and features that are still relevant under the new GDExtension system.