godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License
88.96k stars 20.17k forks source link

Bugs with GDExtension API validation logic (`--validate-extension-api` command and CI) #95858

Open akien-mga opened 3 weeks ago

akien-mga commented 3 weeks ago

Tested versions

System information

Any

Issue description

We have some bugs in the GDExtension API validation checks that we perform on CI, which are starting to hamper development as we need to work them around in non-trivial ways.

90993 is a good example of that as it's impacting a complex method, RenderingDevice::draw_list_begin which changed multiple times since Godot 4.0:

To make CI checks pass, I had to add changes to 4.0-stable_4.1-stable.expected and 4.2-stable_4.3-stable.expected that reflects changes done in 4.4, which seems wrong to me. And also contradicts previous messages in the same .expected files for the API breakage done in their respective versions.

diff --git a/misc/extension_api_validation/4.0-stable_4.1-stable.expected b/misc/extension_api_validation/4.0-stable_4.1-stable.expected
index 5c3bf07fb2..2b7f9dc662 100644
--- a/misc/extension_api_validation/4.0-stable_4.1-stable.expected
+++ b/misc/extension_api_validation/4.0-stable_4.1-stable.expected
@@ -349,3 +349,11 @@ Validate extension JSON: Error: Hash changed for 'classes/EditorUndoRedoManager/
 Validate extension JSON: Error: Hash changed for 'classes/UndoRedo/methods/create_action', from 0AEC1BFC to E87757EB. This means that the function has changed and no compatibility function was provided.

 Added a optional parameters with default values. No adjustments should be necessary.
+
+
+GH-90993
+--------
+Validate extension JSON: Error: Field 'classes/RenderingDevice/methods/draw_list_begin/arguments/9': type changed value in new API, from "Array" to "int".
+
+FIXME: Hack to workaround a bug in our validation logic with a method that changed arguments count, type, and default values multiple times.
+The actual last change triggering this error happened in 4.4 (GH-90993).
diff --git a/misc/extension_api_validation/4.2-stable_4.3-stable.expected b/misc/extension_api_validation/4.2-stable_4.3-stable.expected
index ce8f24c7a9..3de9736c8d 100644
--- a/misc/extension_api_validation/4.2-stable_4.3-stable.expected
+++ b/misc/extension_api_validation/4.2-stable_4.3-stable.expected
@@ -380,3 +380,12 @@ Validate extension JSON: Error: Field 'classes/Image/methods/get_mipmap_offset/r

 Type changed to int64_t to support baking large lightmaps.
 No compatibility method needed, both GDExtension and C# generate it as int64_t anyway.
+
+
+GH-90993
+--------
+Validate extension JSON: Error: Field 'classes/RenderingDevice/methods/draw_list_begin/arguments/9': default_value changed value in new API, from "Array[RID]([])" to "0".
+Validate extension JSON: Error: Field 'classes/RenderingDevice/methods/draw_list_begin/arguments/9': type changed value in new API, from "typedarray::RID" to "int".
+
+FIXME: Hack to workaround a bug in our validation logic with a method that changed arguments count, type, and default values multiple times.
+The actual last change triggering this error happened in 4.4 (GH-90993).
diff --git a/misc/extension_api_validation/4.3-stable.expected b/misc/extension_api_validation/4.3-stable.expected
index 24c7702090..5c869d9914 100644
--- a/misc/extension_api_validation/4.3-stable.expected
+++ b/misc/extension_api_validation/4.3-stable.expected
@@ -14,3 +14,11 @@ Validate extension JSON: Error: Field 'classes/ShapeCast2D/properties/collision_
 Validate extension JSON: Error: Field 'classes/ShapeCast3D/properties/collision_result': getter changed value in new API, from "_get_collision_result" to &"get_collision_result".

 These getters have been renamed to expose them. GDExtension language bindings couldn't have exposed these properties before.
+
+
+GH-90993
+--------
+Validate extension JSON: Error: Field 'classes/RenderingDevice/methods/draw_list_begin/arguments': size changed value in new API, from 9 to 10.
+
+draw_list_begin added a new optional debug argument called breadcrumb.
+There used to be an RID argument as arg #9 which was removed in GH-84976 but now we're adding a new one in the same location.

Aside from the above issue, we have a few other long-running problems that trigger recurring warnings in the logs, but somehow don't seem to break the CI checks, or we'd have fixed them ages ago:

ERROR: New API lacks base array: enums
   at: compare_dict_array (./core/extension/extension_api_dump.cpp:1362)

Happens when checking the 4.0-stable_4.1-stable.expected and 4.1-stable_4.2-stable.expected files.

ERROR: Validate extension JSON: Missing field in current API 'classes/RenderingDevice/methods/draw_list_end': arguments. This is a bug.
   at: compare_dict_array (./core/extension/extension_api_dump.cpp:1443)
ERROR: Validate extension JSON: Missing field in current API 'classes/RenderingDevice/methods/compute_list_begin': arguments. This is a bug.
   at: compare_dict_array (./core/extension/extension_api_dump.cpp:1443)
ERROR: Validate extension JSON: Missing field in current API 'classes/RenderingDevice/methods/compute_list_end': arguments. This is a bug.
   at: compare_dict_array (./core/extension/extension_api_dump.cpp:1443)

Happens for all stable version checks, but somehow not in the dev version (currently 4.3-stable.expected). This was the same in the previous dev cycle where it didn't occur for 4.2-stable.expected, so it's not that it was fixed in 4.3 / 4.4.dev.

Steps to reproduce

./misc/scripts/validate_extension_api.sh godot

Minimal reproduction project (MRP)

n/a

akien-mga commented 3 weeks ago

CC @RedworkDE @dsnopek @raulsntos

dsnopek commented 3 weeks ago

I've been looking into what's happening with PR #90993 and trying to figure out how the validate_extension_api.sh script is supposed to work.

It looks like it will concatenate all the *.expected files between the reference version (the version whose extension_api.json it's validating) and the current version, in order to attempt to account for all the progressive changes since the reference version.

For example, when the reference version is 4.1-stable, it'll concatenate the following *.expected files:

This should theoretically work, but has problems specifically with adding/removing arguments, because the validation messages will change. If the property had simply changed type in each release, we would have been fine, because all the validation messages about the type changes would be in the concatenated file, and all would be ignored.

Here's the first bunch of solutions that come to my mind:

  1. Put the additional type change messages in 4.3-stable.expected, rather than scattering them into the old 4.2-stable_4.3-stable.expected etc files like @akien-mga describes as his workaround above. We really shouldn't change the old *.expected files after we've moved on to the next minor version, and adding the messages in 4.3-stable.expected would work because it'll always be concatenated together with the other files. This isn't great, because it's not obvious that this needs to be done, but it's also not the worst and we can document it.
  2. Attempt to make the script smarter and recognize removed arguments. Like, we could look for messages about an arguments size change, and then automatically ignore any validation messages that refer to indexes higher than the size change. However, this wouldn't be a lot of fun to do in a shell script and with our relatively unstructured data. Completely reworking how this process works (like moving to Python and/or using more structured data for the *.expected files) seems like it'd be too much.
  3. Rethink which things we consider to be errors. From the perspective of GDExtension compatibility, only a few things really matter like function hashes changing, constant values changing, and maybe a couple other things. I know the argument count and default values affect GDScript and C#, so the errors from this script can be helpful in calling out these changes for reviewers. But do we actually need them? If we could live without, then maybe we stop considering so many things to be errors in general?

There's probably more possible solutions that I haven't thought of.

dsnopek commented 3 weeks ago

Relating to my possible solution nr 1 above:

@akien-mga mentioned on #90993 that putting those messages in the 4.3-stable.expected will lead the validate_extension_api.sh script to emit warnings, since those messages won't occur in the output when the reference version is 4.3-stable.

Those warnings can help prevent unused messages from being in *.expected files, which is generally a good thing: perhaps a message gets added accidentally, which does nothing at first, but then later ends up ignoring a message we really actually care about.

However, if we do intend to go with possible solution nr 1, then those warnings risk becoming noise: we'll have to know to ignore the ones that we put in there on purpose, while still somehow watching out for new ones that could potentially lead to problems in the future.

A couple options come to mind:

  1. We just get rid of those warnings. The odds of the situation they are designed to protect is pretty low.
  2. We add some kind of special annotation we can use in the *.expected files to silence that warning for specific messages. We would need to be careful when reviewing to ensure it's only used in these tricky situations with arguments being added/removed, but it would allow us to continue getting value from the warnings.