Closed willnationsdev closed 1 year ago
I did realize a few potential issues with this proposal, but they don't necessarily seem insurmountable.
Because structs are "copy-by-value" complex types, you can't store a reference to a Struct as its own "property". This, in turn, means that Structs can't really store other Structs within themselves. Or rather, if you tried to force it by having two 8-byte Structs within one Struct, each of the sub-Structs would spend 4 of their bytes just storing their type information and would only have room for one actual property. The rest of their data would be truncated by the wrapping Struct's logic. Therefore, allowing Structs within Structs is simply wasteful or worse confusing/misleading to end-users.
Because this implementation of Structs is so picky about how many actual bytes are allocated for specific properties, we will need to improve documentation (and perhaps give some sort of feedback indicator in the UI?) to more clearly indicate to users how many bytes of data are actually consumed by a given value or reference. For example, I would assume that String
, StringName
, Object*
, Array
, Dictionary
, and the Packed*Array
s are each a single pointer to data that exists somewhere else. And then those pointers would be 4 or 8 bytes for 32 and 64-bit devices respectively, depending on how that Godot build was compiled. When is it safe to use 2 or 4 of such properties in a Struct?
I thought of a potential workaround for these issues, but it comes with its own significant drawbacks, so feedback on improvements (or whether it is even worth it) would be appreciated:
Make a Struct Variant a proxy container with an ID that maps to backend Vector<T>
storage in the core for each kind of built-in type.
TYPE_STRUCT
with struct_type_id
for NamedPoint
. But now, the "value" of the Variant is just a uint64_t
ID number for the Struct instance.NamedPoint
configuration relies on the same concept of specifying PropertyInfo
s for members, but now the byte segmentation isn't significant as the full data for the Struct doesn't need to be kept in the Variant itself. As a result, you could potentially have an arbitrary number of properties within a Struct (though we may still need to impose a specific limit for performance reasons).x
on a NamedPoint
Struct, the StructServer would...
x
as the Nth index of NamedPoint
's member array (in the example case, N = 0).PropertyInfo
in question has TYPE_REAL
.PropertyInfo
a StructTypeInfo
supports. So, if structs can only have up to 4 properties, and so a StructTypeInfo
has a PropertyInfo[4]
like in the initial example, then the "value" of the Struct ID lookup operation would be a uint64_t[4]
. This subarray would be secondary indexes to corresponding backend arrays.x
had TYPE_REAL
and was in the 0th
slot of the StructTypeInfo
's member list, you'd take the 0th secondary index and plug it into the PackedRealArray
backend storage to fetch the value.Then, when creating any new instance of a Struct and based on the data types of the Struct type's members, Godot would need to identify and assign free slots in the backend arrays to the new Struct ID. Furthermore, we might need to provide an interface through the StructServer to allow users to declare plans to allocate a particular number of Struct types, thereby allowing the StructServer to resize corresponding backend arrays accordingly to prevent constant array growth during mass allocations.
You could then still preserve all other aspects of the Struct Variant. Copy-by-value assignment operations can be done purely with backend array overwrites without users needing to be aware of it. Scripts could still do struct type configuration. Though I'm no expert, I think File IO and network communication would probably need to deserialize the Variant data to larger-than-Variant-sized byte streams with a copy of the backend array data along with the instance ID and type info.
Pros:
Structs are no longer limited by the number of bytes available to an individual Variant. Therefore...
Successfully maintains most of the general requirements...
Cons:
Much larger data consumption per Struct instance. Now, since the data is stored in backend arrays, you need a mapping table of secondary indexes to find the data associated with each member. This results in an extra sizeof(uint64_t)
byte allocation for each total number of member properties P in a StructTypeInfo
. You also have to actually maintain allocated memory for the actual data separate from the Variant. If you kept a max of 4 properties as the initial proposal gave, you'd end up with something like...
So, this comment's changes would shift a NamedPoint from 20 bytes to 68 bytes (a very significant loss in efficiency). However, this is still more performant than the Array (80 bytes), Dictionary (160 bytes) and Object (~800 bytes) options we had available before, and the increase in usage flexibility might be worth the cost. Further discussion/investigation is required.
It decreases the performance of all symbol lookup operations for a Struct. The O(n) behavior of these operations also multiplies the O(n) of Struct creation linearly since new slots must be identified and allocated for each possible property.
Significantly increases the complexity and maintenance burden associated with the Struct Variant concept.
I wanna trying to implement this feature. I thinking about structs in Godot all the time
@SpectralDragon Heh, well, if you're going to spend time on it, I would suggest holding off on data representation as much as possible till discussion can come to some agreement on the best way to implement it. Or, if you don't wanna wait, start off with the simpler original proposal.
Ooh, just saw this proposal. I've had a problem where I need some primitive type which must be serializable but cannot be a Resource...
What I've been doing is using an untyped Array for lightweight "struct-type" objects. To avoid mixing these up with normal arrays, I use first element of null, followed by data.
For example, one of my struct-like types which represents a tuple of a string and two numbers might be [null, "object_name", 123, 4]
. The type check would be something like len(arr) == 4 and typeof(arr[0]) == TYPE_NIL and typeof(arr[1]) == TYPE_STRING
.
The workaround I picked seems to get me by for now, though I'd really be happy to see another way to do this.
I made a program for measuring memory consumption of various Godot types. It seems relevant for this issue. I hope it'd be useful. Link: https://downgit.github.io/#/home?url=https://github.com/hilfazer/Projects/tree/master/Godot/MemoryConsumption
It uses values from Performance singleton. Keep in mind the memory allocated by the OS will be higher.
I was reviewing Variant
briefly earlier today when I came across the Pools
struct which contains BucketSmall
, BucketMedium
, and BucketLarge
, each of which store unions of increasing size and is managed by static PagedAllocator<T>
instances. If I'm reading it right, these Pools
are the means by which Variant
is implemented to support managing memory for data structures that are larger than the bytes available solely within the Variant
itself.
This means there are 4 memory cascades associated with Variant
storage:
Variant
is local to itself in the 20 bytes (float reals) or 36 bytes (double reals) afforded to it.BucketSmall
with maximum 6 * sizeof(real_t)
bytes (24 or 48).BucketMedium
with maximum 12 * sizeof(real_t)
bytes (48 or 96).BucketLarge
with maximum 16 * sizeof(real_t)
bytes (64 or 128).These buckets are then used in the Variant::reference(...)
and Variant::_clear_internal()
methods plus the Variant(...)
custom constructors to handle the allocating and freeing of memory related to the "bucketed" types.
There are then a myriad of templated types and functions responsible for defining Variant
-to-Variant
logic in variant_construct.cpp
, variant_destruct.cpp
, variant_op.cpp
, and likely many more places. Of particular note are places like variant_setget.cpp
and variant_call.cpp
where macros are leveraged to mass-produce actual symbols that define specific structs or functions responsible for defining/accessing properties or defining/executing methods of the Variant
's stored subtype.
As far as I can tell, in order to make the proposal work as intended, you would have to take these sections that revolve entirely around creating large collections of statically defined relationships, and introduce a Script
-powered, dynamic dependency by which to have scripts intrude on this low-level defining layer to define their own types, set/get properties, and methods (via the Script
's static extension methods concept). Naturally, it's impossible to evaluate a Script
without first having already loaded all of this data regarding Variant
data/logic.
To make it possible, you'd need a subset of the available definitions/logic to be something that can be dynamically populated, cleared, and refreshed as needed. I'm not yet 100% sure whether that's even conceivable without restarting the engine runtime (my gut says "no"), in which case, I feel like this would grow into a sort of GDExtension-like feature at the "core" API layer where any time you make changes to a struct definition (its name, properties, constants, and/or "extension methods"), you would have to restart the engine or editor in order to see those changes reflected in Variant
logic.
You would need to have all of the struct definition info saved to a file (for which the definition could be populated from anything, including a GDScript
resource as proposed). Then, when initializing the Variant
runtime, it would deserialize that info and use it for a data-driven subset of the Variant
setup process. Then, because everything else is built on top of that, the only way to make updates would be to re-serialize the struct spec and reload the engine/editor to trigger the same Variant
runtime initialization logic that pulls from that data.
My thinking is that the proposed StructServer
would analyze the definition of the struct from a given script and then identify which bucket level (if any) would be necessary for storing the memory for that object. And then users would just need to be aware of what the memory thresholds were, what the maximum amount of storage on a Variant
is (96 or 128 bytes), etc. This gives you the best of both worlds from the two different versions I was thinking of since the engine already internally implements "version 2" on its own and even includes varying bucket sizes, including no bucket at all. We just need to have a way of dynamically populating the data from GDExtension and/or Script
definitions.
If anyone spots any problems in my line of thinking, please let me know.
As far as I can tell, in order to make the proposal work as intended, you would have to take these sections that revolve entirely around creating large collections of statically defined relationships, and introduce a
Script
-powered, dynamic dependency by which to have scripts intrude on this low-level defining layer to define their own types, set/get properties, and methods (via theScript
's static extension methods concept). Naturally, it's impossible to evaluate aScript
without first having already loaded all of this data regardingVariant
data/logic.To make it possible, you'd need a subset of the available definitions/logic to be something that can be dynamically populated, cleared, and refreshed as needed. I'm not yet 100% sure whether that's even conceivable without restarting the engine runtime (my gut says "no"), in which case, I feel like this would grow into a sort of GDExtension-like feature at the "core" API layer where any time you make changes to a struct definition (its name, properties, constants, and/or "extension methods"), you would have to restart the engine or editor in order to see those changes reflected in
Variant
logic.
Then if I've understood correctly, the only way to have structs properly well implemented would be to re-design the whole Variant logic, which is a big and core aspect of the engine, which in any case wouldn't happen before Godot 5.x?
What about letting the user specify the size tier for the struct? (instead of do it automatically; it is less user-friendly but implementation could be more straightforward [?])
Would need an accessible way to know how many bytes is taking a struct by hovering the mouse over the struct or by using the profiler (for both in-editor and run-time), to know if must upgrade a struct to the next size tier; could use compiler error/warning messages for boundary overflows.
struct <STRUCT_BODY> (without bucket)
struct small OR struct2
struct medium OR struct3
struct large OR struct4
or by using @annotations
May have misunderstood though.
@KiwwiPity
Then if I've understood correctly, the only way to have structs properly well implemented would be to re-design the whole Variant logic, which is a big and core aspect of the engine, which in any case wouldn't happen before Godot 5.x?
I can't say for certain as I still need to analyze it all in more depth, but I don't think the changes would involve some sort of drastic re-design of the entirety of the Variant logic itself. In the grand scheme of things, you'd just be adding a single new type to the set of types that a variant could be, and thus you'd be creating whatever other type-to-type operators and constructors are necessary to teach the Variant how to work with the new type (no different from adding any other type to Variant). It's just that the internal handling of the logic for those operators, constructors, and the like would need to be data-driven rather than statically hardcoded in C++. We'd need to branch their logic based on data we load ahead of time from the filesystem just like GDExtensions in the core API layer.
And as far as I can tell, doing so wouldn't necessarily involve breaking any compatibility with earlier versions of Godot, so I do not foresee it requiring a Godot 5.x major version upgrade to be possible to implement (could be wrong though).
What about letting the user specify the size tier for the struct? (instead of do it automatically; it is less user-friendly but implementation could be more straightforward [?])
Would it be more straightforward? Probably. Would it make any difference in the constraints/limitations from an engine perspective? No. Therefore, I would still opt to have the engine just automate that process by default. Might expose a means for power users to optionally give byte size suggestions to the engine though.
Would need an accessible way to know how many bytes is taking a struct by hovering the mouse over the struct or by using the profiler (for both in-editor and run-time), to know if must upgrade a struct to the next size tier; could use compiler error/warning messages for boundary overflows.
Agreed. The auto-calculation/estimation of byte size would need to be something that happens at design-time, thereby allowing script editors to report compile/parse errors to the user when the structs they are defining would exceed the engine's available byte limitations.
I started prototyping this proposal with a StructDB & Struct pair of types that mimic the definitions of ClassDB and Object a little, but with various things stripped out and struct-specific things added in. In the process, a few things have come to my attention.
First, you can't merely reuse the Variant type's bits to save space in defining type information for structs; doing so would make a standalone struct (outside of a Variant wrapper) be unable to report what type of struct its data contains. So scrap that idea. We'll just have to sacrifice a portion of the 16 bytes in a minimal struct to store metadata about the struct instance. That can then include not only the type ID, but also the "bucket" size of the struct instance (so things don't have to query the centralized StructDB just to know the size of its data).
Second, in order to properly recreate Object-like behavior for structs (defining structs both in engine code, in extensions, and in scripts with method bindings), information about a struct's type, property names, and property types HAVE to be stored alongside the property values themselves otherwise we run the risk of data corruption upon deserialization if the struct's type definition has changed (e.g. user swaps variable names / reorders properties). Given that, I'm considering that we could have one serialization strategy that efficiently compacts the struct data for transit with just the struct type and property values (for use at runtime) and another with much more detail that is used specifically when storing struct data in long-term storage like a file or database. Not sure yet if there are any problems with that approach.
Other than that, I don't have anything to show off. Progress is slow-going due to time constraints on my end. Just wanted to give an update here for folks interested in this proposal.
@willnationsdev this is supposed to be a PR for Godot 4.x?
@KiwwiPity This proposal is a description of how a potential pull request for Godot 4.x might be written, yes. The prototype I mentioned above is my attempt to see if I can implement this feature in a barebones format. An MVP would likely entail being able to define a PropertyInfo
struct accessed with get_property_list()
and then implement exposure of engine-defined structs in GDScript. If that works, then I may try to get that merged as a pull request. After that would be adding support for script-defined structs and then adding GDScript support for defining structs.
@KiwwiPity This proposal is a description of how a potential pull request for Godot 4.x might be written, yes. The prototype I mentioned above is my attempt to see if I can implement this feature in a barebones format. An MVP would likely entail being able to define a
PropertyInfo
struct accessed withget_property_list()
and then implement exposure of engine-defined structs in GDScript. If that works, then I may try to get that merged as a pull request. After that would be adding support for script-defined structs and then adding GDScript support for defining structs.
About the stucts, would be great to have them with GDScript.
Maybe the devs would opt the way to just use inner classes though [class
on GDScript.]
GDREGISTER_NATIVE_STRUCT
), tho it is really limited: They only have a name, size and c code that defines the contents.void*
given some metadata in the form of a PropertyInfo
and a GodotTypeInfo::Metadata
Based on those points I would extend ClassDB::register_native_struct
to add a list of name, offset, PropertyInfo and Metadata for each member of the struct, and thus allow access to the contents without having to involve a a c compiler to determine the actual layout of the struct.
Then based on those enhanced native structs I would then build struct variants, by adding a variant type STRUCT
and then use the padding between type
and _data
to store the struct type as an opaque id. The actual contents of the struct are stored either inline in the _data
field or on the heap depending on the struct size. For performance reasons it may be desirable to encode if the struct is inline and maybe even the size in the id directly or by using a second variant type.
Personally (at least for the first version) I would not allow defining methods on structs, and just keep them as dumb, but efficient data containers.
The biggest issue with this is probably going to be dealing with user (script) defined structs and modifications to them. There I would probably just end up always defining a new struct type instead of changing the layout of existing types and then doing some editor magic to convert instances of the old type, or just instances of these structs to exist in the editor at all and thus not have to deal with the changing layout at all. For serialization I would use a scheme similar to objects and serialize them as key value pairs to avoid these layout issues. Finally at runtime of a game I would disallow defining and changing struct types entirely, since that is probably not going to be useful very often and the overhead to make it work would just no be worth it. EDIT: As a first version defining custom structs in project settings and requiring an editor restart to apply changes would probably get rid of most of the complications around user defined structs (except for layout, but that is really the smallest issues here)
@RedworkDE Thank you for this information. I'll have to take an in-depth look at that as it may render a lot of the stuff I was starting to do unnecessary. I appreciate it.
Just came across this proposal and I wanted to voice my support for it :)
I don't follow all the implementation details, but I'll add that it would be great if the struct could be edited easily from the editor inspector, the same way that a Resource can be. In my projects, I like have a layer of separation between my data and my behavior, but Resources allow for behavior, which can be potentially dangerous to deserialize. Having a data structure that is limited to just data would be great!
Hi @willnationsdev ! I might be interested in helping implement this feature if you want any help coding, testing, or theory crafting. Feel free to reach out on here or RocketChat if you want to get in touch :)
@nlupugla That would be a welcome effort if you are up for it. I have a structs
branch on my godot fork where all the work has been done, but it's very much a WIP/incremental patchwork job thus far. I haven't had time to work on it for several months now due to a large task with a hard deadline at my day job, but you are welcome to review and/or build off of what I've done thus far.
In short, I made a base struct class with 5 derived versions that each account for multiple "tiers" of allocating memory for a single struct instance: StructMinimal (uses no memory beyond the Variant), StructSmall/StructMedium/StructLarge (uses the PagedAllocator buckets just like other Variant sub-types), and iirc a StructCustom (for users that want to write their own implementation). The plan was to then gradually duplicate the existing Object
-related logic in Variant and the related classes across the core so that they treat each struct instance in more or less the same way with the sole exception being the way memory is initially allocated and cleaned up.
However, it's been ages since I've worked on it and I'm pretty sure a lot of the changes thus far are either obsolete or work-in-progress things that need to be cleaned up.
Great! I'll take a look through your branch @willnationsdev when I get the chance.
I read through the discussion above a bit more carefully, and I have to say, I like where @RedworkDE is coming from.
I don't like the idea of having to make an entire StructDB that lives along side the ClassDB. I get the desire for having the structs be as memory-efficient as possible, but I think code complexity is super important as well. The more we can build on what's already there, the better.
I very much agree that structs should not allow user-defined methods. I think they should be as close to Plain Old Data as possible. I could see an argument for allowing operator overloading and/or static methods, but I think it's very important to be able to deserialize structs without the risk of arbitrary code execution, which is one of the pitfalls of Object
.
The one thing I disagree on here is having to go through project settings to define structs. It would be sweet if structs could be anonymous and be scripted inline. For example,
class_name PointGenerator extends Node
var points_created : int = 0
func _ready() -> void:
for i in range(4):
var point : (name : String, x : float, y : float) = generate_random_named_point()
print(point.name, point.x, point.y)
func generate_random_named_point() -> (name : String, x : float, y : float):
points_created += 1
name = "point" + str(points_created)
x = randf()
y = randf()
return (name : name, x : x, y : y)
That said, I see this as more of a "nice to have" than a "need to have".
Closing in favor of https://github.com/godotengine/godot-proposals/issues/7329. Thanks for the original discussion nonetheless :slightly_smiling_face:
Describe the project you are working on
A simulation game with many small custom data structures for web/mobile.
Describe the problem or limitation you are having in your project
With script code, you either have only Variant-compatible stuff or have access to non-Variant data structures (C++/C#) that must eventually map to Variant anyway (exporting variables, serialization, etc.). All of the user-friendly options in the scripting API have poor control over allocation size for complex data types. For example, a NamedPoint type:
Assuming Variant is 20 bytes (4 type enum, 16 data), representing a
NamedPoint
value has the following options:2^2
bytes for minimum Array size. Still have bad UX of needing index offsets to access each member with a declared enum.2^3
minimum size of Dictionary. But, now we can use strings/symbols to look up members, and the Inspector renders the keys when exported, so that's better UX.Object
class.Ideally, we want to get as close to 20 bytes as possible with automatic serialization, type safety, editor integration, and named labels for members.
Describe the feature / enhancement and how it helps to overcome the problem or limitation
TYPE_STRUCT
state to Variant.Type type
member from 4 to 2.uint16_t struct_type_id
member to fill in the lost 2 bytes.StructServer
singleton that maintains a config-based array ofStructTypeInfo
objects.struct_type_id
is an index to fetch a specificStructTypeInfo
.StructTypeInfo
can be defined multiple ways:ScriptLanguage::get_global_class_name(path)
). For example, a GDScript with the@struct
class annotation. A compatible script would be subject to limitations:Script.get_script_property_list()
, so instance properties from_get_property_list()
have no meaning.All of the above script limitations would have to be separately implemented and maintained by each respective scripting language that has added support for defining struct types. That is, the parsing/compiling logic of each language would need custom logic for verifying that the parsed/compiled class definition abides by the policies.
You would also separately need to teach each language and the editor how to handle Struct Variants.
By default, the assignment operator would copy-by-value, creating a new Variant with the same
_mem
andtype
/struct_type_id
. Other operators would result in an error until a legitimate means of overloading them is devised.In practice, Structs would allow script users to define small complex data types that are created with much smaller memory footprints (exactly 20 bytes) and without sacrificing named member lookup, type safety, or automatic editor integration.
You'd get full type safety in GDScript/VisualScript, and configuration-based "safety" with runtime checks and casting in compiled static languages like C++ and C#. We'd need to expose a StructServer API of some sort to provide a high-level API for handling the actual type/byte-segment lookup operations that GDScript and VisualScript could handle for users out-of-the-box.
In addition, a Variant-compatible lightweight Struct type could fix and improve many other things.
_get_property_list()
, etc.Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams
Tentative sample of engine changes:
type
property and the addition of a new property that must be preserved during serialization.Also, expose the StructServer singleton to provide a high-level interface for working with Struct Variants. Would be necessary to use in engine C++ and optional in C#, both with string-based lookups.
Usage of a Struct type in GDScript. The language automatically handles conversions:
Languages that support struct definitions could look like this:
(if, say,
_op_add
were a magic method for variant op overloading)Example in GDScript:
Example in C#:
If this enhancement will not be used often, can it be worked around with a few lines of script?
Web and mobile are memory-sensitive platforms by definition, so improving options available for memory conservation opens more options for what can be done with vanilla, script-only Godot projects. Based on mentioned examples, scripted workarounds are either too verbose and inefficient or easy-to-use, but really inefficient.
Is there a reason why this should be core and not an add-on in the asset library?
The memory enhancements require revisions to what Godot Engine internally executes when performing Variant serialization/initialization. Therefore, the customization features must be made at the Variant level which is a core feature. If things were done with an asset, it would defeat the purpose; the enhancements would only exist for a specific language and would be unavailable to other languages like GDScript and VisualScript, nor would they be available to the Editor.