**Closed** · Ruin0x11 closed this issue 3 years ago
Perhaps we could also have an `iter_locations(map)` function for iterating every single thing that looks like a location in the map, and testing its performance.
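A minimal sketch of what such an iterator might look like. The `is_location` check here (presence of an `iter` method) is a hypothetical stand-in for whatever interface test the engine really uses:

```lua
-- Sketch only: treats any table with an `iter` method as "location-like".
-- This predicate is a placeholder, not the engine's real interface check.
local function is_location(obj)
   return type(obj) == "table" and type(obj.iter) == "function"
end

local function iter_locations(map)
   return coroutine.wrap(function()
      local seen = {}
      local function visit(obj)
         if seen[obj] then return end
         seen[obj] = true
         if is_location(obj) then
            coroutine.yield(obj)
         end
         for _, v in pairs(obj) do
            if type(v) == "table" then
               visit(v)
            end
         end
      end
      visit(map)
   end)
end
```

The `seen` set guards against cycles (maps are full of backreferences), which also makes it easy to benchmark the full traversal cost in isolation.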
`binser` keeps track of which class metatable to assign by comparing a name string. Right now we use the name the user passes in to `class.class`. But this is bad, since you can have more than one class with the same name but different require paths. The moment a mod happens to add a class that shares a name with another, things will break. Instead we should glean the classes' require paths and use those for comparison.
But now we also have to consider what happens if a mod refactors a class file by moving it under a different require path while still referring to the same class. If the user upgrades the mod, they would get errors about an unregistered class object. Given that we need some way of running save upgrade logic anyway, we could let the user declare that the class file was moved for the purpose of serialization. Or we could prevent people from using old saves with newer mod versions in some cases.

Or we could actually standardize a schema for everything that can be serialized, redesigning the entire serialization system in the process, so the user is able to see when the serialization code needs to change if they add or remove a field, instead of having it fail in implicit ways or resurrect fields that were removed in later versions of the mod. Not every class needs to be serialized, which makes this easier than it sounds for class objects. There are maybe only a couple dozen of them that would need to be refactored to support this.
We could make the class name unique on a per-mod basis, ignoring the file path, and prefix the mod identifier to the name.
But now we have to handle a class being renamed, while still referring to the same class object. I think the name has to be a string constant then. That constant would never change no matter how many times the file is moved or renamed.
Okay, how about this for map objects. Whenever you do something like `Chara.player()` or `Item.iter()`, all you would ever get back is a proxy to the actual underlying map object(s). The serializer would detect this and convert the weak reference to `(location, uid)`. Right now we pass out references to the map object itself from the pools, the exact same ones the pools hold on to. With this, only weak references would ever get saved, unless you mess with the metatables or something.
But performance takes a hit if we allocate too much. We can have the weak refs allocated up front so the pool can just hand them out whenever it wants.
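A rough sketch of that proxy idea, with the proxies allocated up front when the object enters the pool. All names here (`make_proxy`, `pool_add`, `pool_get`) are invented for illustration, not the engine's real API:

```lua
-- Sketch: each real object gets exactly one proxy, created when it is
-- pooled, so handing it out later allocates nothing. The proxy forwards
-- reads and writes, but carries only (uid, location) for the serializer.
local function make_proxy(obj, uid, location_id)
   local proxy = { __uid = uid, __location = location_id }
   return setmetatable(proxy, {
      __index = function(_, k) return obj[k] end,
      __newindex = function(_, k, v) obj[k] = v end,
   })
end

-- The pool holds the real objects strongly; callers like Chara.player()
-- would only ever see the proxy, never the object itself.
local pool = {}

local function pool_add(obj, uid, location_id)
   pool[uid] = { obj = obj, proxy = make_proxy(obj, uid, location_id) }
end

local function pool_get(uid)
   local entry = pool[uid]
   return entry and entry.proxy
end
```

Since the proxy's own table only ever contains `__uid` and `__location`, a serializer that raw-iterates it sees nothing but the weak reference.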
I think by now it's clear that all the issues surrounding "implicit" serialization (take a table, iterate through the whole thing, and save anything you find) far outweigh the costs of writing explicit serialization code (specifying the proper schema of the object). `binser` is turning out to be too generic for our use cases, now that established paradigms like map objects and our specific implementation of OOP are so widely used at this point.
I mean, we have to consider the tradeoff of stability versus the benefits implicit serialization gives:

1. No serialization code to update while the schema is still changing frequently.
2. No way to forget to serialize fields that get added later.
But those points are pretty bunk. 1) is an Achilles' heel when it comes to making the engine stable, at marginal benefit to just myself, and it only applies while the schema is being changed frequently as the port is being worked on. 2) is merely a compromise to make up for the lack of a stable schema. Why should it even be possible to forget the new fields we added for map objects? And given how hard it is to keep `location` in sync with the current serializer even now, it isn't that much of a guarantee.
Performance will also be terrible from blindly traversing the entire table more than once looking for things to convert to weak references and back.
And when it comes to 1.22 compatibility, there will be a point when all the base map object structs are worked out and we will not have to add more fields to them. But at that point we'd be left with no schema and also no need to change the list of underlying fields in any radical way. Vanilla is vanilla, and that will not change no matter what mods add on top - otherwise we wouldn't be an engine rewrite of Elona, we'd be something else entirely. (Praise be to Jure that taking a good game and rewriting it while explicitly aiming for compatibility means that designing the game to actually be fun is a solved problem.)

So since we wouldn't have to touch the base engine's save format much after the port is more or less complete, we'd be left with none of the benefits of implicit serialization (the ability to save everything without having to worry about forgetting to add new fields) and significant drawbacks (no stability in the save format, difficulty telling what fields are valid for each map object, major performance implications).
I think my main aversion to schematizing is primarily due to the fact that the `fallbacks` table for map objects is so incomplete and disorganized. With better organization it would be much easier to code the serializer explicitly for every field. We should just rewrite most of the `data:add_type {}` code to be cleaner, and explicitly state what gets serialized. Then drop the stock version of `binser` and write our own serializer, because at this point the specific needs of the project are much clearer than in the past, and knowing those needs we can create a solution that fits them better. We should never have to take some data where we know what we're expecting to save/load and "hope" that it works out by punting it through an algorithm that was designed to be generic over any kind of data.
I'm leaning towards NBT as the serialization format. If it works for Minecraft and all its mods, maybe it would work for OpenNefia as well. Plus, we'd be able to inspect the data with a wide array of existing tools for debugging purposes.
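For illustration, a map object might end up looking something like this in NBT's usual textual notation (this tag layout is invented here, not a committed schema):

```
TAG_Compound("chara") {
    TAG_String("_id"): "elona.putit",
    TAG_Int("uid"): 152,
    TAG_Int("x"): 10,
    TAG_Int("y"): 4,
    TAG_Compound("ModExtTable") { ... }
}
```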
One other thing: with a stable schema in place, we can consider converting all the map objects to "struct"-like objects where accessing an unknown property will throw an error. That would make the serialization and stability improvements come full circle.
We won't have to worry about mods, because this only applies to the fields the base engine adds. Those are the ones we're expecting to almost never have to change past a certain point. The fields that mods add would be siloed off in the `ModExtTable`. We should also consider how to schematize these mod-added fields per map object, and whether we want to also enable property checking for those. It would be best to have consistency between the engine and mods here.
It's an open question as to whether or not this would do some kind of rudimentary typechecking. The performance hit would have to be measured if we wanted to try adding it. We might have a toggle that changes how the struct metatable behaves for each property - a blind check for inclusion in a set of property names, or an explicit type check on each property value. However, because of the way `__newindex` works, this wouldn't be able to typecheck nested table mutations - only primitive values or tables that get set by reference.
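One possible shape for such a struct metatable, with the toggle between the membership check and a per-value type check. The `schema` format (field name mapped to a `type()` string) is invented for illustration:

```lua
-- Sketch: wraps a backing table so that reading or writing a field absent
-- from the schema raises an error. `typecheck` toggles between the cheap
-- membership check and an explicit type() check on each assigned value.
local function make_struct(schema, data, typecheck)
   data = data or {}
   return setmetatable({}, {
      __index = function(_, k)
         if schema[k] == nil then
            error(("unknown property '%s'"):format(tostring(k)), 2)
         end
         return data[k]
      end,
      __newindex = function(_, k, v)
         local expected = schema[k]
         if expected == nil then
            error(("unknown property '%s'"):format(tostring(k)), 2)
         end
         if typecheck and type(v) ~= expected then
            error(("property '%s': expected %s, got %s"):format(k, expected, type(v)), 2)
         end
         data[k] = v
      end,
   })
end
```

As noted, this only intercepts assignments on the struct itself; mutating a nested table fetched through `__index` never touches `__newindex`.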
Note to self: when we're trying to grab the list of properties we use that are scattered throughout the code, add a "development mode" feature that catches any unserialized fields when saving and prints them out.
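The development-mode check could be as simple as diffing an object's actual keys against the declared field list after saving. `find_unserialized` is a hypothetical helper name:

```lua
-- Sketch: given an object and the list of fields the serializer actually
-- wrote, return (sorted) any fields that were silently dropped, so
-- development mode can print them at save time.
local function find_unserialized(obj, serialized_fields)
   local known = {}
   for _, f in ipairs(serialized_fields) do
      known[f] = true
   end
   local missed = {}
   for k in pairs(obj) do
      if not known[k] then
         missed[#missed + 1] = tostring(k)
      end
   end
   table.sort(missed)
   return missed
end
```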
If we're making map object structs, also make `_id`, `_type`, `uid`, `location`, `x` and `y` immutable. They're used for internal bookkeeping and should never be changed manually.
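A small sketch of how those fields could be guarded. The assumption here is that internal engine code keeps a direct reference to the backing table (and can still update `x`/`y` through it), while everything else goes through the guard:

```lua
-- Sketch: assigning to a bookkeeping field through the guard raises;
-- all other fields pass through to the backing object.
local IMMUTABLE = {
   _id = true, _type = true, uid = true,
   location = true, x = true, y = true,
}

local function freeze_bookkeeping(obj)
   return setmetatable({}, {
      __index = obj,
      __newindex = function(_, k, v)
         if IMMUTABLE[k] then
            error(("field '%s' is read-only"):format(tostring(k)), 2)
         end
         obj[k] = v
      end,
   })
end
```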
The NBT serializer should support serializing map object references by keeping track of the `location`s it finds internally, and converting them to references. Same as before, but the location finding can now happen alongside the serialization in a single pass.
We'd need some way for the NBT serializer to discern what class instance a tagged compound should resolve to. Basing this on the class name or file path is not good, as either of those things can easily change.
Then maybe, as part of an `ISerializable` interface, we should declare some unique identifier - which gets namespaced by the mod that adds the class, of course:
```lua
function MyClass:serial_id()
   return "MyClass"
end
```
This must be a string constant, not dynamically generated, as it must remain consistent no matter how many times the file is renamed or moved. Maybe instead of a function:
```lua
local MyClass = class.class(ISerializable)
MyClass.__serial_id = "MyClass"
```
The full serial ID would become `my_mod:MyClass`.
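Resolving metatables at load time would then go through a registry keyed by the full serial ID. The registry API below is hypothetical:

```lua
-- Sketch: classes get registered under "mod_id:__serial_id", and the
-- deserializer resolves metatables by that key instead of by class name
-- or require path, so renames and file moves cannot break saves.
local serial_registry = {}

local function register_class(mod_id, klass)
   assert(type(klass.__serial_id) == "string",
          "serializable classes must declare a string __serial_id")
   serial_registry[mod_id .. ":" .. klass.__serial_id] = klass
end

local function resolve_class(full_id)
   local klass = serial_registry[full_id]
   if klass == nil then
      error(("no class registered for serial ID '%s'"):format(full_id))
   end
   return klass
end
```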
I think it's reasonable to say that this will break if you rename the mod's identifier. Unholy amounts of other things would break if you did that and tried to load an existing save anyway.
We could also ditch passing `__name` to `class.class()` and instead use `debug.getinfo()` to set it to the file's require path, for debugging purposes only.
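Grabbing the defining file from inside `class.class()` might look roughly like this. Converting a source path to a require path is approximate and platform-dependent, so this is only suitable for debug display:

```lua
-- Sketch: when called from inside class.class(), stack level 3 is the
-- file that called class.class(). debug.getinfo reports its source as
-- "@path/to/file.lua"; strip the "@" and extension, then dot-separate.
local function caller_require_path()
   local info = debug.getinfo(3, "S")
   local src = info.source
   if src:sub(1, 1) == "@" then
      src = src:sub(2)
   end
   return (src:gsub("%.lua$", ""):gsub("[/\\]", "."))
end
```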
It turns out that this was harder to implement than I originally imagined, mainly because of table references. If you need to store some unstructured table that might potentially store a reference to another table that's used outside of the serpent-serialized string, the reference will become duplicated upon deserialization.
I think requiring that you explicitly implement an `ISerializable` interface for anything that needs serializing is a better approach for now.
If a "collect" quest gets generated in a town, it seems like saving and loading it enough times causes a nasty serialization error when loading it again. It has to do with the `location` on the created quest target item not getting unset on serialization.

Honestly, the serialization procedure needs a cleanup in general. We should have a simple algorithm for pointer swizzling the `location` and similar backreferences. This has been floating around in my mind for a few days as of late.

When saving:

1. When `field.map` gets saved, traverse it down and note any fields that act as a `location`.
2. Go over each field that refers to a `location`. One of two things can happen:
   a. The location was found in the first step. In this case, swizzle the pointer to a lazy reference of `(object UID, location identifier)`.
   b. The location was not found among everything under `field.map`. This indicates that the object's reference lies outside the map, and should be removed. The field on the table containing this object is set to `nil`.
3. Note the `pool`s of the `ILocation`s found. Now we can serialize the entire thing to disk.

When loading:

1. Load the `binser`'d blob and reassign object and class metatables.
2. Restore the `ILocation`s.
3. Find the `(object UID, location identifier)` pointers and replace them with a reference to the actual object.

If this works, it means we can do things like set references to map objects for `IMef.origin`, `IChara.item_to_use` or `IItem.chara_using` and have the serialization story work out for those, instead of needing to do the UID mangling ourselves. Arguably the latter is safer, but it is less ergonomic to work with.

My only concern is performance. If this gets too slow then it makes everything way harder, as we'd still want to avoid bad `location` references to things lying outside `field.map`, but we'd also need to make sure the other references to live objects are valid. And needing to declare which fields are serializable like this can potentially be error-prone.

Regardless, we'll unit test the shit out of this.