Add serialization and deserialization mechanisms

solonrice commented 1 year ago

I find myself iterating the entities with an open QueryDescription and then serializing all the components on each one, but there should probably be a better (faster?) way, one that captures only what is needed to rebuild the state. The use case would be both syncing (over the network or between processes) and saving to disk or a database.

Any plans to implement something first class in the near future or any ideas on how to accomplish this that I am missing?

genaray commented 1 year ago

Yep, this will be added in the future, and it's totally possible :)

But it kinda depends on the use case. Saving to disk could be pretty easy: write the archetypes and chunks down as raw bytes and load them back later.
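As a rough illustration of the raw-bytes idea, here is a minimal sketch that round-trips an array of components through a byte buffer. It is not Arch's actual chunk layout; `Position` and `RawChunkIO` are hypothetical names, and only the `MemoryMarshal` calls come from the .NET base library.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical component type; any unmanaged struct (no references inside) works.
public struct Position
{
    public float X;
    public float Y;
}

// Hypothetical helper, not part of Arch.
public static class RawChunkIO
{
    // Copies a component array to bytes without converting field by field.
    public static byte[] ToBytes<T>(ReadOnlySpan<T> components) where T : unmanaged
        => MemoryMarshal.AsBytes(components).ToArray();

    // Reinterprets the bytes as components again and copies them out.
    public static T[] FromBytes<T>(ReadOnlySpan<byte> bytes) where T : unmanaged
        => MemoryMarshal.Cast<byte, T>(bytes).ToArray();
}
```

This only works for components without managed references, and the bytes depend on the struct layout and endianness of the machine that wrote them, so it suits quick dumps better than long-lived save files.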

Custom serialization would still need to be done by users themselves, since it mostly depends on their specific needs. We could probably also support some sort of JSON serialisation.

genaray commented 1 year ago

So I'm currently planning this feature :)

Do you have any experience with serialisation/deserialisation? Any "wish" API that you would like to see? E.g. only serialising whole worlds? Or also single archetypes and chunks? Or even single entities? JSON? Binary? All of them? ^^

solonrice commented 1 year ago

The main use cases in my mind would be (1) wire serialization and (2) storage serialization.

For wire serialization, all the standard stuff, but it would be really cool to see some rollback features, with the ability to quickly snapshot the world as a diff from the last snapshot. That way you can roll back the simulation to any tick, change a component, then play the simulation forward with that change.

This could maybe be done this way: each entity, component, or world change is tracked with its tick, and a “mini snapshot” is made for that entity. Basically you remember the old value of the change. So if changing a component from 1 to 2, change the live value to 2, but store the old value in the snapshot. The snapshot also has the tick number. Some circular buffer can keep the last n snapshots, maybe. This way, when a rollback happens, you just replay the snapshots' old values to get to the tick you want, change the value you need to edit, then run the simulation to the current tick. Inputs would need to be added back in since they cannot be derived from the simulation, but I'm not sure that is a first-class concern for an ECS.
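The mini-snapshot idea above can be sketched roughly as follows. Everything here is hypothetical (the `RollbackJournal` name, integer-only component values, string component names), and a bounded linked list stands in for the circular buffer; it does not track entity creation or removal.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: remembers old values per tick so changes can be undone.
public sealed class RollbackJournal
{
    private readonly Dictionary<(int, string), int> _live = new Dictionary<(int, string), int>();

    // Each entry is one "mini snapshot": the tick of the change and the value it replaced.
    private readonly LinkedList<(int Tick, int EntityId, string Component, int OldValue)> _history =
        new LinkedList<(int Tick, int EntityId, string Component, int OldValue)>();

    private readonly int _capacity;

    public RollbackJournal(int capacity) => _capacity = capacity;

    public int Get(int entityId, string component) => _live[(entityId, component)];

    // Writes the new live value and stores the old one together with the tick.
    public void Set(int tick, int entityId, string component, int value)
    {
        var key = (entityId, component);
        if (_live.TryGetValue(key, out var old))
        {
            _history.AddLast((tick, entityId, component, old));
            if (_history.Count > _capacity) _history.RemoveFirst(); // forget the oldest change
        }
        _live[key] = value;
    }

    // Undoes every recorded change made after `tick`, newest first.
    public void RollbackTo(int tick)
    {
        while (_history.Last is { } node && node.Value.Tick > tick)
        {
            _live[(node.Value.EntityId, node.Value.Component)] = node.Value.OldValue;
            _history.RemoveLast();
        }
    }
}
```

After `RollbackTo`, the caller would apply its edit and re-run the simulation (re-feeding recorded inputs) up to the current tick. Once the oldest entries have been dropped, ticks before them can no longer be reached.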

Another cool thing would be integration with the wire. Some easy tagging of components that let a subscriber get all changes to those components so they can be sent across the wire. Not sure exactly how this would look.

Then there is just storage serialization. I don’t mind how DefaultEcs did this, but their interface is a little more complicated than I would have liked.

Anyway, wanted to give you a few quick thoughts from my phone, because I didn’t want to leave you hanging for a well thought out response that I may never get a chance to make!

firesgc commented 1 year ago

Hi!

I would like to be able to load entire worlds as well as merge them quickly (for example, to load parts of an open world level quickly and merge them into the "default" world).

I would like to save a list of entities as a world. Additionally, I would like to be able to quickly create dumps/copies of worlds to do rollbacks or write debugging tools (for example, dumping the world after each system has run during a single frame, so I can easily see which system "corrupts" the data of an entity).

It would be best if these were always pure memory operations and I didn't have to iterate over entities/components and convert them to a binary format. These would be exciting areas of application for me.

Thanks for the awesome project!

genaray commented 1 year ago

@firesgc Thanks for the feedback! :) And glad that you are using Arch!

Well, this feature is still being planned. I'm currently writing my bachelor thesis, so I don't have much free time right now. But as soon as it's finished, I'm gonna implement that ^^

(If someone is willing to contribute, I would be glad, too...)

genaray commented 1 year ago

@solonrice @firesgc Took a while, but Arch now features its first draft of a JSON serializer.

It's already functional and uses the Utf8Json library, which is extremely efficient. There's still room for optimizations and other features, however. E.g. it can currently only serialize whole worlds or single entities. A binary one will also follow shortly.

Feel free to check it out and leave some feedback or a contribution! :)

hhyyrylainen commented 1 year ago

That looks pretty nice and simple to use. For my use case I'd like to see an easy way to save worlds as part of a bigger object tree. For example I have a game stage class inside which I have an entity system. So for me saving would be easiest to implement if there was an easy way to get a full object tree dump from Arch as a property that can also be set to overwrite the state. Or maybe there is some other easy way to get Arch world state as a part of another JSON object tree?

Note I'm currently using a different ECS because I'm stuck on a mono-based C# runtime which doesn't work with Arch currently, but I'm interested in trying out Arch when we get to use an updated runtime.

genaray commented 1 year ago

Could you perhaps give a small example of what your "architecture" looks like? ^^

hhyyrylainen commented 1 year ago

I'm not really sure how I'd go about that, but in brief I have a top level class that represents a save:

class Save{
    public SaveInformation Info {get; set;}

    public string GameVersion {get; set;}

    public GameState MainGameState {get; set;}
}

That all gets serialized as one JSON string and is loaded back the same way.

So then I have in the game state something like:

class GameState{
    public List<IEntity> Entities
    {
        // Pretty complex lookup of game entities from the current state tree goes here.
        get => throw new NotImplementedException();

        // Even more complex save/load logic goes here.
        set => throw new NotImplementedException();
    }
}

Which allows the entire state of the game to be saved as one JSON object. I've found this useful for writing JSON transformation operations, for example to upgrade older saves.

Instead if I had to use an external serialization mechanism, I'd have one JSON object but one property in it would be a JSON string. That would make the JSON save object much less beautiful and harder to read and to operate on.

With that external serialization approach I'd instead have to have code like this:

class GameState{
    public string Entities
    {
        // A second level of JSON serialization, run through Arch, would be needed here.
        get => throw new NotImplementedException();

        // Parsing the JSON string back with the Arch serialization mechanics would be needed here.
        set => throw new NotImplementedException();
    }
}

Which is not exactly what I'd want. Ideally I could get data out of Arch as something like Dictionary<string, List<object>>, where the string is some kind of entity identifier (it could just as well be a long) and the list of objects holds each component of that entity. I'm probably missing something, but with that kind of save/load interface I could easily plug Arch into the existing save and load logic I have set up.
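To make the wished-for shape concrete, here is a minimal sketch of a dictionary of entities nested inside the larger save object and written as one JSON document. It uses System.Text.Json purely as an assumption (the thread does not say which serializer the game uses), and `Health`, `Name`, and `SaveWriter` are hypothetical; `Save` and `GameState` only mirror the classes shown earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical components; stand-ins for whatever the ECS stores.
public class Health { public int Value { get; set; } }
public class Name { public string Text { get; set; } }

public class GameState
{
    // Entity identifier -> all components of that entity.
    public Dictionary<string, List<object>> Entities { get; set; }
        = new Dictionary<string, List<object>>();
}

public class Save
{
    public string GameVersion { get; set; }
    public GameState MainGameState { get; set; } = new GameState();
}

public static class SaveWriter
{
    // The entities end up as a nested JSON object, not as an escaped JSON string.
    public static string ToJson(Save save) => JsonSerializer.Serialize(save);
}
```

Writing works because System.Text.Json serializes values declared as `object` by their runtime type. Reading is the harder half: without type information those values come back as `JsonElement`, so a real loader would need a type tag per component, much like the "$type" entries in the save example below.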

Just to highlight my approach a bit more, this is a cut down example of a save JSON:

{
  "Name": "quick_save_1.thrivesave",
  "Info": {
    "ThriveVersion": "0.5.6.1",
    "Platform": "Linux",
    "Creator": "hhyyrylainen",
    "CreatedAt": "2022-01-14T10:44:44.363433+02:00",
    "Description": "",
    "ID": "2dfba928-8e95-4435-827b-f77061810df1",
    "Type": 2
  },
  "GameState": 1,
  "SavedProperties": {
    "DynamicEntities": [
      {
        "$ref": "2499"
      },
      {
        "$id": "2514",
        "$type": "Microbe, Thrive",
        "Compounds": {
          "usefulCompounds": [
            "glucose",
            "atp",
            "carbondioxide",
            "sunlight",
            "oxygen",
            "iron",
            "nitrogen",
            "ammonia",
            "hydrogensulfide",
            "oxytoxy"
          ],
          "Capacity": 67.5,
          "Compounds": {
            "atp": 66.2459,
            "glucose": 14.2756891,
            "iron": 7.81471157,
            "ammonia": 0,
            "oxytoxy": 16.6867027,
            "phosphates": 0,
            "hydrogensulfide": 0
          }
        },
        ... and a ton more properties and other entities here...
       ]
  }

genaray commented 1 year ago

Well, your needs are quite special and thus not really supported by default ^^ However, it's quite easy to implement yourself.

var allQuery = new QueryDescription(); // Targets all entities
world.Query(allQuery, (in Entity entity) =>
{
    var listOfComponents = entity.GetAllComponents();
    // Put into dictionary, done
});

Deserialising can be done in a similar way...

var entity = world.Create(types);
entity.AddRange(components); // components: the saved component objects, passed as a Span<object>
...

hhyyrylainen commented 1 year ago

I did read through the Arch documentation and saw that approach which should be usable for my case. Btw, are Arch entity references serializable or would I need to add an extra mapping layer to my save load to fix up any cross entity references?

That's the kind of fire-and-forget feature I saw potential in asking for. With the ECS library I'm looking at right now, it seems I'll need to write a custom save interface and also map entity IDs to fresh ones on load, as that library's IDs contain a world index.

genaray commented 1 year ago

They are serializable by default. Basically just two integers :) When you serialize a world using the persistence package, you don't need to take care of any mapping, since it's the whole world with all its entities.

A manual approach however forces you to manage the mapping yourself since the framework simply does not know what you are up to ^^

It basically depends on what you are trying to achieve. E.g. when you load a whole world your own way (exactly the way you saved it, with no new or missing entities, in the exact same order)... it will be fine.

hhyyrylainen commented 1 year ago

Well, that's good to hear. In the library I'm currently trying (which is DefaultEcs, if anyone is curious), entities are identified with:

        [FieldOffset(0)]
        internal readonly short Version;

        [FieldOffset(2)]
        internal readonly short WorldId;

        [FieldOffset(4)]
        internal readonly int EntityId;

where WorldId is an index into a global list of created worlds. That will obviously be incorrect after writing and loading data, hence the need for a custom conversion step that fixes up all loaded entity references to point to new, valid entities. But it's good to know for future reference that Arch would work just fine without needing to redo all entity references after a load.
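For readers wondering what such a conversion step looks like, here is a minimal sketch under stated assumptions: references are stored as plain integer ids, and the caller supplies a function that creates a fresh entity and returns its new id. `Target`, `ReferenceFixup`, and both method names are hypothetical and belong to neither library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical component holding a cross-entity reference by raw id.
public struct Target
{
    public int EntityId;
}

// Hypothetical helper, not part of Arch or DefaultEcs.
public static class ReferenceFixup
{
    // Creates a fresh entity for every saved id and returns the old -> new table.
    public static Dictionary<int, int> BuildMap(IEnumerable<int> savedIds, Func<int> createEntity)
    {
        var map = new Dictionary<int, int>();
        foreach (var oldId in savedIds)
            map[oldId] = createEntity();
        return map;
    }

    // Rewrites each stored reference so it points at the freshly created entity.
    public static void Remap(Span<Target> targets, IReadOnlyDictionary<int, int> map)
    {
        for (var i = 0; i < targets.Length; i++)
            targets[i].EntityId = map[targets[i].EntityId];
    }
}
```

The two passes matter: all entities have to exist before any reference is rewritten, otherwise a reference to an entity that appears later in the save has nothing to map to.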

genaray commented 1 year ago

Arch.Extended now features a JSON and a Binary serializer: https://github.com/genaray/Arch.Extended/wiki/Persistence

They use Utf8Json and MessagePack, so both are highly efficient and very customizable. Furthermore, several DangerousExtensions were introduced that let you modify existing Worlds, Archetypes and Chunks directly, so writing a custom serializer is much easier now.

Serializing the entities targeted by a query, or lists of entities, will follow in the future :)

genaray / Arch

Add serialization and deserialization mechanisms #46