godotengine / godot-proposals

Godot Improvement Proposals (GIPs)
MIT License

Add an abstract type system for GDScript to gain access to compile-time code execution #1709

Open PranavSK opened 4 years ago

PranavSK commented 4 years ago

Describe the project you are working on: An editor plugin. But this can help all projects using GDScript.

Describe the problem or limitation you are having in your project: Because GDScript is a duck-typed language that relies heavily on strings, using it is sometimes cumbersome and non-intuitive. It has also sometimes meant not being able to use our favourite text editors. Many of the recent changes to GDScript have addressed this (class_name, optional typing, signals), but they do not cover all use cases, and it does not make sense to radically change GDScript into a fully typed language. Nevertheless, now that optional typing has performance benefits, it makes sense to be able to take advantage of it while keeping the language's generic nature (especially for addon and plugin development in GDScript).

Describe the feature/enhancement and how it helps to overcome the problem or limitation: The idea is to introduce an Abstract Type System. It is loosely based on what the Haxe language does, but more streamlined towards Godot usage. The basic notion is being able to create a type that is transformed into a different type at compile time.

Let me illustrate with a few examples:

  1. The dictionary in GDScript is an easy and common way to store data. In many ways, it replaces the structs found in other languages. But one of the biggest annoyances is not having code completion for the dictionary keys (especially when, from the user's point of view, they are creating a static DB). So with abstract types, we could have something like this:

    abstract Dictionary # The underlying type - it could be a concrete type or another abstract type.
    class_name EnemyStat
    var health: float: # Using the new properties syntax in GDScript
        get:
            return self.get("health")
        set(value):
            self.set("health", value)

    The health property would be resolved at compile time, providing the benefits of typed optimizations while allowing the core dictionary implementation to stay generic. This would also make value-carrying enums a possibility.

  2. A more complex example: let's say I have written a generic GDNative plugin that can load a C-compliant DLL and resolve the structs from that DLL into native types that are meaningful to Godot. You make a single large class, SomeProjNative, which essentially just has wrappers for the required functions. Then, from within GDScript, you could write an abstract type to make it more useful inside GDScript.

    
    abstract Node
    class_name SomeWrapper

    func do_something():
        return SomeProjNative.do_something()

    The compiler would essentially replace all calls to do_something with SomeProjNative.do_something().

    Of course, this is already possible with GDNative, but it requires you to create the interface entirely inside the native code, which makes changes cumbersome. Also, GDScript is a lot easier to write. If I, as an external tool/library developer, need to support Godot, I can simply write a single class with the required API translations. The library users can then adapt it however they want inside GDScript.

  3. Introducing the idea of compile-time expressions. Any expression within a file that starts with `abstract SomeBase` is subject to compile-time execution, if it can be resolved. For example:

    abstract Foo

    func do_foo():
        if is_editor_hint(): # This can be resolved at compile time
            do_foo_debug() # This cannot be resolved at compile time. The do_foo() call is replaced with this if the above check is satisfied

    If the above check fails, the do_foo() call is replaced with a no-op.


    This would extend to the core code as well.

  4. The current physics ray casting API returns a dictionary. For example, [Physics2DDirectSpaceState.intersect_ray](https://docs.godotengine.org/en/stable/classes/class_physics2ddirectspacestate.html#class-physics2ddirectspacestate-method-intersect-ray) returns a dictionary with a number of key-value pairs. But what each key holds is not immediately obvious, and you often have to look up the docs. With the idea of an abstract dictionary (this should only need to be done in the context of GDScript, as other languages are better served by their own idioms), code completion should be able to suggest the available keys.
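
To make the pain point in item 4 concrete, here is a minimal sketch of what can be written by hand today in Godot 3.x GDScript: a small wrapper that copies the ray result into typed properties so that the editor can complete them. The `RayHit` class name and the `is_hit` flag are hypothetical and not part of Godot; the `position`, `normal` and `collider` keys are the ones documented for `intersect_ray`. This is the boilerplate an abstract dictionary type would aim to remove, not an implementation of the proposal.

```gdscript
# ray_hit.gd (hypothetical helper, not an engine class)
class_name RayHit
extends Reference

var is_hit: bool = false
var position: Vector2
var normal: Vector2
var collider: Object

func _init(result: Dictionary = {}) -> void:
    # intersect_ray() returns an empty Dictionary when nothing was hit.
    if result.empty():
        return
    is_hit = true
    position = result["position"]
    normal = result["normal"]
    collider = result["collider"]

# Usage inside a Node2D script, typically from _physics_process():
#     var space := get_world_2d().direct_space_state
#     var hit := RayHit.new(space.intersect_ray(from, to))
#     if hit.is_hit:
#         print(hit.position)
```

The cost of this approach is an extra Reference allocation and a copy per ray cast, which is exactly the overhead a parse-time-only type would avoid.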

Also in many other parts of the engine, strings are extensively used. This is often prompted by the GDScript inbuilt editor, like animation clip names, preload resource string paths and most importantly the `NodePath`s. Since the editor is already tracking these, we could put this data into abstract types and make it accessible, via the language server, to external text editors (I understand these are done context-sensitively, i.e., depending on which scene is open, but maybe we could cache these. This needs more discussion). This could make it possible to have a sense of *namespace* using the folder structures. The current implementation of `class_name` sort of makes the scripts be meaningfully placed in their own *'scripts'* folder, but this is against the Godot philosophy of being able to place the scripts (and other resources) along with the scene file where they are used, which greatly improves reusability of these scenes (especially makes a lot of sense for the addons folder stuff).

**Describe how your proposal will work, with code, pseudocode, mockups, and/or diagrams:**
I am unfortunately not fully aware of how the internals work and hence need help to provide a concrete implementation idea.

**If this enhancement will not be used often, can it be worked around with a few lines of script?:**
Not really.

**Is there a reason why this should be core and not an add-on in the asset library?:**
It is fundamentally a new feature that needs to be in core.

vnen commented 3 years ago

I don't think I understand this proposal. It seems to be conflating two ideas into one proposal (the abstract type system and the compile-time optimization) which are really unrelated. We could do this optimization without needing the abstract part.

Can you make a short concrete example of this in practice, with a full code sample?

PranavSK commented 3 years ago

Ok so my understanding of how it currently works may be wrong since GDScript is dynamic, so let me first confirm that.

On startup, the game would encounter all 'preload' assets and load them into memory. This includes the scripts that have class_name defined in them. Next, it loads the resources relevant to autoloads and finally the start scene. The scripts of scenes that are loaded dynamically are loaded lazily at first encounter and unloaded when the reference count reaches 0.

When a script is being loaded it first tokenizes the strings and parses these tokens. These are stored according to properties and methods. This would be stored as a call to a native engine property/method or to another script (which would eventually chain back to a native call or simple expression?). My idea of compile-time is this process when the calls are resolved.

When a script instance is executed, these calls are then made as encountered. For example some_dict["some_key"] would be stored as a call to operator[] and another call to execute the expression, which in this case is just a string. The actual value of "some_key" would be sent when the script instance is executed.

So with that out of the way, the basic idea is as follows: access the compile-time step to change which call is actually stored. This could be done via additional keywords, reserved tokens, etc., but since we already have a good dynamic language, GDScript itself could be used to make these redirections. Hence the abstract type system: it would enable scripts written in GDScript itself to run at compile time, which should be easier to maintain (I could be very wrong about this). The abstract types are resolved first, and they are executed during compile time. If, during execution, they encounter a call that requires run-time data, then the original call is replaced with that call (or calls).

PranavSK commented 3 years ago

Just to be clear, the idea is not to change how operator[] is called or to change anything in the core engine code. The idea is to allow users, when writing code, to use SomeTempName.expect_some_vector2_value, which would be stored as some_dict["some_key"]. During the actual call, the native code need not even check whether the return value is a Vector2. Since the user has created SomeTempName, we can expect that they will use the same type to assign values. This means the optional typing would enforce this when SomeTempName is encountered (same as what is currently done), and if the user directly uses some_dict, they understand that the type system advantage is lost (which is the default behaviour for dictionary types).

The main reason for introducing the abstract type system is to take full advantage of the GDScript setup that currently exists during compile-time. I think calling this an abstract type system may be misleading. We could otherwise call it a Macro system.

willnationsdev commented 3 years ago

This sounds like you're trying to have parse-time-only data types in GDScript which, at editor-time (tool script) and runtime (game), break down to an underlying Variant-compatible data type. The main advantage you'd be aiming for is improved type safety and autocompletion when writing GDScript code. Is this correct?

>     abstract Dictionary # The underlying type - it could be a concrete type or another abstract type.
>     class_name EnemyStat
>     var health:
>
> Also, this would make value-carrying enums a possibility.

The improved struct and enum usability is where something like this could really shine, indeed. Although, I agree with vnen, the whole typed instructions aspect is really a perpendicular topic that is unrelated to something like this. Or rather, the proposal, as it is, is not presented in a way that caters to an effective understanding of how things work under-the-hood (understandable, given your stated experience with it).

> I think calling this an abstract type system may be misleading. We could otherwise call it a Macro system.

This is an accurate assessment. Especially the use of the abstract term as a new keyword would be very misleading since it insinuates that the data type in question is "abstract" and therefore cannot be properly instantiated (typical programming nomenclature is that, for abstract types, only derived, concrete, non-abstract data types can be instantiated). What you really seem to want is a GDScript-only parse-time macro system based on type declarations. But for that to work, you'd also need a clear way of mapping the keys of one data type to the realities of the underlying data type.

> A more complex example
>
>     abstract Node
>     class_name SomeWrapper

I don't really understand this use case. If you want to make a wrapper, then either you are intentionally wrapping the logic of the internal class in order to add peripheral operations (in which case, you DON'T want a macro, but actually different logic executed) or you are just trying to save time by creating a GDScript class that exposes all of the same methods as the equivalent NativeScript class (which might(?) already be doable by overriding the call and/or callv methods and then calling the corresponding method on the internal class). I know there's a _get and _set, but there isn't a _call that is exposed, for reasons I forget. But you can at least call any method on the NativeScript class by just passing in a string with its call and callv methods, available to all Object types. And you could presumably leverage that to create an auto-wrapper from one language to another. Still, why you would want to do that confuses me. It would definitely be doable with properties rather than methods though (_get, _set, and _get_property_list support that kind of thing).
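
As a rough illustration of the property-based route mentioned above, the following Godot 3.x sketch exposes the keys of a Dictionary as properties through `_get`, `_set` and `_get_property_list`. The `DictView` class name and the `_data` field are hypothetical; only the three virtual methods are engine API. Note that this resolves names at runtime, so it does not by itself give the static autocompletion the proposal asks for.

```gdscript
# dict_view.gd (hypothetical wrapper, not an engine class)
class_name DictView
extends Reference

var _data: Dictionary = {}

func _init(data: Dictionary = {}) -> void:
    _data = data

func _get(property):
    # Returning null tells the engine the property was not handled here,
    # so a key whose stored value is null looks the same as a missing key.
    if _data.has(property):
        return _data[property]
    return null

func _set(property, value) -> bool:
    # Only existing keys are writable; returning false means "not handled".
    if _data.has(property):
        _data[property] = value
        return true
    return false

func _get_property_list() -> Array:
    # Advertise each key so that tools inspecting the object can list it.
    var props := []
    for key in _data:
        props.append({"name": str(key), "type": typeof(_data[key])})
    return props
```

With this in place, `DictView.new({"health": 10.0}).get("health")` reads through to the wrapped dictionary, and `set("health", 5.0)` writes back to it.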

> Introducing the idea of compile-time expressions.

The code example you share here can already be emulated with a tool script perfectly. If the logic for the block under the editor hint check wouldn't work in the other context, it doesn't matter since GDScript only executes whatever logic it actually attempts to run. That is, if your logic check prevents moving into that code block, then it won't call the function and there's no need to worry something will run in an inappropriate context.
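
A minimal sketch of that tool-script emulation, assuming Godot 3.x syntax (`tool` keyword and the `Engine.editor_hint` property; Godot 4 uses `@tool` and `Engine.is_editor_hint()` instead). The `do_foo` and `do_foo_debug` names are taken from the proposal's example; the boolean return value is added here purely for illustration.

```gdscript
tool
extends Node

func _ready() -> void:
    do_foo()

# Returns true when the editor-only branch ran.
func do_foo() -> bool:
    # The check happens at runtime: outside the editor the branch is simply
    # never entered, so do_foo_debug() is never called in an exported game.
    if Engine.editor_hint:
        do_foo_debug()
        return true
    return false

func do_foo_debug() -> void:
    print("do_foo() called inside the editor")
```

The difference from the proposal is that the check is still evaluated on every call rather than being removed at parse time.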

> Also in many other parts of the engine, strings are extensively used.

This'll get better to some degree in Godot 4 with Callable and Signal becoming part of Variant. Any references to methods or signals can now be handled with symbols rather than strings.

> This is often prompted by the GDScript inbuilt editor, like animation clip names, preload resource string paths and most importantly the NodePaths. Since the editor is already tracking these, we could put this data into abstract types and make it accessible, via the language server, to external text editors (I understand these are done context-sensitively, i.e., depending on which scene is open, but maybe we could cache these. This needs more discussion).

I don't understand why this requires an abstract type system to make the string data available via a language server of some kind. The strings can simply be exposed to the language server and triggered in the appropriate context by the language server, similar to how string-based JSON documents in VS Code can get autocompletion based on an existing JSON doc specification. The removal of stringification doesn't have any relevance to autocompletion support when it comes to language servers.

> This could make it possible to have a sense of namespace using the folder structures.

Generally speaking, core devs tend to see folders as a namespace already, i.e. the filesystem is a natural namespacing mechanism when assets and data types are, themselves, referenced via file paths. Ergo, this is a solution in search of a problem that doesn't quite exist (according to conversations I've had anyway).

> The current implementation of class_name sort of makes the scripts be meaningfully placed in their own 'scripts' folder, but this is against the Godot philosophy of being able to place the scripts (and other resources) along with the scene file where they are used, which greatly improves reusability of these scenes (especially makes a lot of sense for the addons folder stuff).

A little confused here. You're saying class_name violates the concept of namespace organization of assets relative to files, but then simultaneously comment on how scripts can be referenced relative to the scene file where they are used. It illustrates the earlier point. You have the option of using a global type name or the option of using a local file and manually importing a reference to it.

Regardless, I have also discussed with reduz the possibility of introducing a namespace system for the global script class system (the class_name feature, as you put it), but he shut it down in no uncertain terms. So Godot, as an overall engine, will never have a cross-language namespaced naming system for Script resources. The most you can do is add prefixes to your class names (e.g. UI_CircleRect) or write custom editor tools that parse your scripts' file locations and generate ad-hoc "namespace" scripts. For an example of the latter, see below:

    # circle_rect.gd
    extends Control

    # ui.gd
    class_name UI
    extends Reference
    const CircleRect = preload("circle_rect.gd")

    # some_node.gd
    extends Node
    func _ready():
        add_child(UI.CircleRect.new())

Regardless, I don't personally like the idea of creating a GDScript-specific solution to structs and enums like this. I think it would be better if the Variant API itself had basic support for structs as a data type where users could customize the structure of the data, ensure the data is locally allocated, and still provide full autocompletion for all keys present in the data type (potentially with typed instruction handling).

For example (haven't thought this through, but...), you could have some kind of property hint info that contains a byte array Variant, a property hint indicating that it's a struct, and a property hint string that contains a breakdown of the keys, data types, and bit distributions of the data in the struct, up to 16 bytes (since Variants make up 20 bytes and 4 are reserved for the type information). Then, all user interfaces would display the "struct" similar to an object with those property names while setters and getters would directly update the appropriate bits in the byte array without any otherwise special logic. You could potentially even have a "class name" for the struct. Perhaps also a StructServer singleton that maintains information about all struct formats that have been defined so that autocompletion can be queried on an arbitrary basis rather than for a specific property. A system like this would be something that is available to all scripting languages and can have deeper integration with the user interface of the Godot Editor itself. Not saying it's a good idea. Just trying to point in a better direction.

A macro system won't achieve this, and thus far I'm not convinced that a macro system is even the best solution to the use cases presented here. But I'm just one lonesome voice. Perhaps others will think otherwise.