Persisting/serializing entities

dasfuu commented 8 years ago

I am currently looking into a good way to persist all my entities. The main use case is saving and loading the game.

Now I could iterate over every entity, check for every possible component and store everything in some way. Then on loading I would have to reverse this process. But ideally I don't want this to be too game-specific and would prefer a more generic way of doing it that other people can benefit from. (One that also works with a pooled engine, and hopefully on all platforms too!)

At the end I want something like this:

data persistEntities(Engine/PooledEngine)
loadEntities(data, Engine/PooledEngine)

or 

data Engine.persistEntities()
Engine.loadEntities(data)

I think it would be good if one could mark every component that should be persisted with an annotation and/or interface and exclude fields with transient or an annotation. (e.g. a SpriteComponent with TextureRegions)

Maybe allow the components to handle the serialization themselves (onSave, onLoad, similar to the Android Bundle concept).

I hope I get some input on this. And I intend to contribute this to Ashley or the community in some way.

metaphore commented 8 years ago

Since entities do not represent the whole application state, persisting them is only part of persisting the application's state. Given that systems have internal state as well, such universal (de)serialization may not be possible. So I think this goes beyond ECS and is a rather specific case, which is why it should be handled outside.

junkdog commented 8 years ago

At the end I want something like this: param: Engine/PooledEngine

In my experience, typical usage is to save a subset of entities.

I think it would be good if one could mark every component that should be persisted with an annotation and/or interface and exclude fields with transient or an annotation.

One usually wants to serialize most component types, so it's better to opt out with an annotation (I'm fond of @Transient) than to opt in for the default use case.
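A minimal sketch of the opt-out idea using plain Java reflection. The `Transient` annotation, the two example components and the `PersistableFields` helper are hypothetical names made up for illustration; none of them are part of Ashley or libgdx. Everything is persisted by default, and a type or field is skipped only when it is annotated or declared with the `transient` keyword.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical opt-out marker; usable on component classes and on fields.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.FIELD})
@interface Transient {}

// Example components (stand-ins, not real Ashley components).
class PositionComponent {
    public float x, y;
    @Transient public float interpolatedX; // derived at runtime, not saved
    public transient Object cachedSprite;  // the Java keyword also opts out
}

@Transient
class DebugOverlayComponent {
    public String label;
}

final class PersistableFields {
    // A type is persisted unless it carries @Transient.
    static boolean isPersistable(Class<?> type) {
        return !type.isAnnotationPresent(Transient.class);
    }

    // Names of the fields a serializer would write for this type.
    static List<String> of(Class<?> type) {
        List<String> names = new ArrayList<>();
        if (!isPersistable(type)) return names;
        for (Field f : type.getDeclaredFields()) {
            int mods = f.getModifiers();
            if (Modifier.isStatic(mods) || Modifier.isTransient(mods)) continue;
            if (f.isSynthetic() || f.isAnnotationPresent(Transient.class)) continue;
            names.add(f.getName());
        }
        return names;
    }
}
```

With this shape, a newly added component is saved without any extra work, and only the exceptions need marking.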

Dealing with components with fields referencing other entities was one of the bigger hurdles - these relationships must be persisted and properly restored at load.

GWT support is even trickier, but doable - usual GWT procedures apply.

Maybe allow the components to handle the serialization by itself. (onSave, onLoad - similar to the android bundle concept)

As an option, it's probably a good idea - as long as there's a default case which just works.

Since entities do not represent the whole application state, persisting them is only part of persisting the whole app state.

One can certainly model it in such a way that entities+components are all the state. We have 70+ systems and ~50 component types, spread out over a number of worlds/engines, without using any custom de/serialization code. We're also using the same save/load mechanism for "prefabs" (entity blob with some user-defined logic).

dsaltares commented 8 years ago

@junkdog I agree with pretty much everything you say.

Occasionally, you want custom serialization/deserialization logic, e.g. for a component that holds a reference to a Texture. In this case, the serialized version may want to store the resource path and fetch the texture from the AssetManager on deserialization.
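A minimal sketch of that idea in plain Java, under stated assumptions: `FakeTexture` and `FakeAssets` are hypothetical stand-ins for a texture and an asset manager, and `SpriteComponentCodec` is an invented helper, not an Ashley or libgdx class. Only the resource path is written; the texture is looked up again when reading.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for a GPU resource such as a texture; it cannot be serialized.
class FakeTexture {
    final String loadedFrom;
    FakeTexture(String loadedFrom) { this.loadedFrom = loadedFrom; }
}

// Stand-in for an asset manager: maps a resource path to a loaded asset.
class FakeAssets {
    private final Map<String, FakeTexture> cache = new HashMap<>();
    FakeTexture get(String path) {
        return cache.computeIfAbsent(path, FakeTexture::new);
    }
}

class SpriteComponent {
    String texturePath;   // saved
    FakeTexture texture;  // not saved; rebuilt from texturePath on load
}

final class SpriteComponentCodec {
    // Write only the path, never the texture itself.
    static Map<String, String> write(SpriteComponent c) {
        Map<String, String> data = new HashMap<>();
        data.put("texturePath", c.texturePath);
        return data;
    }

    // Read the path back and fetch the texture from the asset store.
    static SpriteComponent read(Map<String, String> data, FakeAssets assets) {
        SpriteComponent c = new SpriteComponent();
        c.texturePath = data.get("texturePath");
        c.texture = assets.get(c.texturePath);
        return c;
    }
}
```

The same split (a saved key plus a field rebuilt at load time) would presumably apply to fonts, sounds and other assets.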

I'm unsure whether I would like to maintain such a system inside Ashley, as it's tricky to find a silver bullet. It would certainly help if we provided a suggested approach on the wiki.

dasfuu commented 8 years ago

@junkdog You made very good points and I agree with you. Opt-out is what many serialization libraries do already and it makes sense to match their pattern.

@saltares Assets were one case I had in mind regarding custom logic.

As you said, it doesn't have to be maintained inside Ashley. But it would certainly be nice if others could benefit from it too. Having something in the wiki and the code in some other repository seems like a reasonable idea.

Nested entities are the biggest hurdle, I think. But depending on which libraries are used, it could be handled very easily. Gson, for example, has a GraphAdapterBuilder that handles large datasets very well. (I tested it with ~17k non-Ashley objects at work.) https://github.com/google/gson/blob/master/extras/src/main/java/com/google/gson/graph/GraphAdapterBuilder.java
I also experimented with kryo a bit, but it produced stack overflows, I think because of too many objects. (I didn't put much thought or work into it, since we would be using gson anyway.)

junkdog commented 8 years ago

Occasionally, you want custom serialization/deserialization logic eg.

As long as the underlying serialization framework exposes a means of supplying serializers, or maybe put an interface on the component, it shouldn't be a problem.

component that holds a reference to a Texture. / Assets were one case I had in mind regarding custom logic.

Hmm, what we do is have serializable "asset reference" components that go with each unserializable component type (FontReference->FontRenderable, SpineReference->SpineRenderable etc). For each pair, we have a "reactive"/resolver system that creates the actual asset, e.g. Family.all(FontReference.class).exclude(FontRenderable.class).get(). The renderable component is created when the entity is processed, which also evicts the entity from the system.

This solution ends up looking pretty clean, with one resolver system per asset type. Plus, it saves us from writing custom serialization logic.
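A rough sketch of the reference/resolver pairing described above, written in plain Java so it runs without Ashley. `ResolverEntity` is a hypothetical stand-in for an entity, the two component classes are simplified, and the string concatenation is a placeholder for real asset loading; in Ashley the filtering would be done by a system's family rather than by the `if` below.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal stand-in for an entity: a bag of components keyed by type.
class ResolverEntity {
    private final Map<Class<?>, Object> components = new HashMap<>();
    void add(Object component) { components.put(component.getClass(), component); }
    boolean has(Class<?> type) { return components.containsKey(type); }
    <T> T get(Class<T> type) { return type.cast(components.get(type)); }
}

// Serializable half of the pair: just enough data to recreate the asset.
class FontReference {
    final String path;
    FontReference(String path) { this.path = path; }
}

// Unserializable half: would hold the real font object in a game.
class FontRenderable {
    final String loadedFont;
    FontRenderable(String loadedFont) { this.loadedFont = loadedFont; }
}

// Resolver: handles entities that have a FontReference but no FontRenderable yet.
class FontResolver {
    int resolve(List<ResolverEntity> entities) {
        int resolved = 0;
        for (ResolverEntity e : entities) {
            if (e.has(FontReference.class) && !e.has(FontRenderable.class)) {
                // Placeholder for real asset loading.
                e.add(new FontRenderable("font:" + e.get(FontReference.class).path));
                resolved++;
            }
        }
        return resolved;
    }
}
```

Once the renderable is attached, the entity no longer matches the filter, so running the resolver again is a no-op for it.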

Nested entities are the biggest hurdle I think. But depending on what libraries are used it could be handled very easily.

There's some custom serialization logic to cope with ashley's internal state - and the serialization code needs to internally track entity relationships when saving/loading, otherwise this won't work:

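import com.badlogic.ashley.core.Component;
import com.badlogic.ashley.core.Entity;

public class InheritScale implements Component {
    public Entity target; // entity reference that must be restored on load
}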

IMO, gson is a bit bloated compared to libgdx's Json API, and sticking with libgdx means one dependency less, too.

dsaltares commented 8 years ago

Using Libgdx's Json API sounds like the natural thing to do since we're already depending on it.

Would this work?

EngineSerializer implementing Json.Serializer<Engine> and EntitySerializer implementing Json.Serializer<Entity>. We may have to do the same for the pooled variants.
@Transient annotation to skip systems and components.
The engine serializer would add elements "entities" and "systems" and would serialize all entities and all systems as long as they're not @Transient.
The entity serializer would serialize all components not marked as @Transient.
The client can customize component, system and listener serialization using json.setSerializer().
The client can deal with resource resolution in serialization.

I'd prefer not to add serialization logic to Engine or Entity.

I think this supports @junkdog's and @meisterfuu's observations.

Json json = new Json();
json.setSerializer(Engine.class, new EngineSerializer());
json.setSerializer(Entity.class, new EntitySerializer());
// +user's serializers...

Engine engine = json.fromJson(Engine.class, serializedEngine);
String text = json.toJson(engine, Engine.class);

If we make Engine and Entity implement their own serializers we could make this a lot simpler though.

Json json = new Json();
// +user's serializers...

Engine engine = json.fromJson(Engine.class, serializedEngine);
String text = json.toJson(engine, Engine.class);

What do you prefer?

dasfuu commented 8 years ago

Using Libgdx's Json API sounds like the natural thing to do since we're already depending on it.

Valid point.

I like the first idea more. It gives more control and one could do something like this:

Json json = new Json();
json.setSerializer(Entity.class, new EntitySerializer(existingEngine));

Entity entity = json.fromJson(Entity.class, serializedEntity); // or an array

Then the entity can be added to an existing engine, and it is possible to persist and load only a subset of entities. It also makes it clearer what is going on. But this point is only important for a PooledEngine, I think.

dsaltares commented 8 years ago

Actually, I missed a couple of points @junkdog made.

Which entities to serialize?

We may not want to save all entities, but an EngineSerializer would just go and serialize all entities inside an engine...

Do we want to have a SerializeThisEntityComponent and do engine.getEntitiesFor(Family.all(SerializeThisEntityComponent.class).get())?

Serializing systems?

Do we want to serialize systems' state? Ideally, systems should be stateless, but people may rely on a system's state and may need to persist that too.

Pools

We also need to provide a solution for PooledEngine and PooledEntity. I guess the serializers can just take references to an EntityCreator and a ComponentCreator, both of which would have create() methods. We would also have implementations for non-pooled and pooled entities/components.
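One possible shape for the creator idea, as a plain-Java sketch. `EntityCreator` is the name suggested above, but its signature here is an assumption; `PoolSketchEntity`, `PlainEntityCreator`, `PooledEntityCreator` and `EntityReader` are hypothetical, and the pool is a simple deque rather than Ashley's or libgdx's pooling.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Stand-in for an entity; a real one would hold components.
class PoolSketchEntity {
    int loadedValue;
    void reset() { loadedValue = 0; }
}

// What a deserializer would depend on instead of a concrete engine.
interface EntityCreator {
    PoolSketchEntity create();
}

// Non-pooled variant: always allocates.
class PlainEntityCreator implements EntityCreator {
    @Override
    public PoolSketchEntity create() { return new PoolSketchEntity(); }
}

// Pooled variant: reuses freed instances before allocating new ones.
class PooledEntityCreator implements EntityCreator {
    private final Deque<PoolSketchEntity> free = new ArrayDeque<>();

    @Override
    public PoolSketchEntity create() {
        PoolSketchEntity e = free.poll();
        return e != null ? e : new PoolSketchEntity();
    }

    void free(PoolSketchEntity e) {
        e.reset();
        free.push(e);
    }
}

// The reading code is identical for both variants.
final class EntityReader {
    static PoolSketchEntity read(int savedValue, EntityCreator creator) {
        PoolSketchEntity e = creator.create();
        e.loadedValue = savedValue;
        return e;
    }
}
```

A ComponentCreator could follow the same pattern, most likely taking the component class as a parameter of `create`.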

So my previous idea could fall a little short on this, but it could be in the right direction.

junkdog commented 8 years ago

The client can customize component, system and listener serialization using json.setSerializer()

And, by default, implementing Json.Serializable would also be an alternative.

Do we want to serialize system's state? Ideally, systems should be stateless but people may rely on system's state and may need to persist that too.

I've never done it myself, but I do think there are valid, perhaps situational, use-cases for wanting to serialize state not directly bound to entities:

Game metadata or otherwise session-bound state
(pending or previous) Events, notifications
Inventory/Item db

Another good-to-have feature is support for tagging serialized entities with a key. With tags, building prefab-like classes is straightforward: load from json, then retrieve the needed entities and update as required.

dsaltares commented 8 years ago

Thank you guys for the great feedback and the discussion. I'm going to have a go at it, write some unit tests and possibly push it to a branch in this repo and ask for some more feedback then.

dsaltares commented 8 years ago

Another good-to-have feature is support for tagging serialized entities with a key. With tags, building prefab-like classes is straight forward: load from json, then retrieve the needed entities and update as required.

@junkdog how do you see that working? How would you tag entity configurations, specifically?

junkdog commented 8 years ago

We have a SerializationTag component (not to be confused with some global tag id). When loading, the object wrapping the entities also builds a key:entity lookup map. I'll see if I can dig up some code which better shows what I mean.
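In the meantime, here is a guess at what such a wrapper could look like, as a plain-Java sketch. `TaggedEntity` and `LoadedEntities` are hypothetical names, and the tag is a simple field here rather than a real SerializationTag component.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in entity carrying an optional serialization tag.
class TaggedEntity {
    final String tag; // null when the entity is untagged
    float x, y;
    TaggedEntity(String tag) { this.tag = tag; }
}

// Wraps a group of loaded entities and indexes the tagged ones by key.
class LoadedEntities {
    final List<TaggedEntity> all;
    private final Map<String, TaggedEntity> byTag = new HashMap<>();

    LoadedEntities(List<TaggedEntity> loaded) {
        this.all = new ArrayList<>(loaded);
        for (TaggedEntity e : loaded) {
            if (e.tag != null) byTag.put(e.tag, e);
        }
    }

    TaggedEntity get(String tag) {
        TaggedEntity e = byTag.get(tag);
        if (e == null) throw new IllegalArgumentException("no entity tagged " + tag);
        return e;
    }
}
```

A prefab class could then load a group, fetch for example the entity tagged "turret", adjust its position, and add the whole group to the engine.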

dsaltares commented 8 years ago

Writing an EngineSerializer on top of the branch I created is pretty easy. However, things get a bit hairier in terms of design when we factor PooledEngine into the mix. This is because entity creation (inside the entity serializer) is coupled with the newly created pooled engine (which happens inside the engine serializer's read method).

Any ideas to decouple these?

dlux95 commented 8 years ago

We might not need to decouple these. We could probably just implement a createEntity() method inside Engine which creates a new Entity() and returns it. PooledEngine would override that method and obtain an Entity from the pool instead.

That way, regardless of the Engine type, entity creation is the same. (Maybe do the same for components.)

dsaltares commented 8 years ago

That would mean the EntitySerializer needs a reference to an Engine, even if there's no pooling whatsoever, which feels awkward. Moreover, we still have the problem of component creation. Do we need another createComponent() method in the base Engine class?

dasfuu commented 8 years ago

Giving the EntitySerializer a reference to a PooledEngine or Engine is the best way, I think. It also enables one to add entities to an existing engine.

dsaltares commented 8 years ago

@meisterfuu, @Metaphore, @junkdog and @laubed, could you please take a look at the serialization branch?

It's simpler now and it includes an EngineSerializer that works with both Engine and PooledEngine as well as an EntitySerializer.

The tests should make the serialization API self-explanatory, although I'm happy to answer any questions. I definitely accept suggestions to make it better.

Thanks a lot!

junkdog commented 8 years ago

It's simpler now and it includes an EngineSerializer that works with both Engine and PooledEngine as well as an EntitySerializer.

I took a quick look. I like the design, very clean.

There should probably be a way to load multiple entities at once, without spawning a new engine. Full ~~world~~ engine or single entity feels a bit limiting.

What about persisting entity references in component fields - are those planned for the first release?

dasfuu commented 8 years ago

I agree with @junkdog. But I think handling references in general would be better placed in the json library. That would enable multiple components to share the same object (not only entities) in a field, e.g. two players having the same inventory. But I don't know much about the libgdx json classes. (I have only used them rarely.)

I will do some tests with the current branch later and see how well it integrates with an existing prototype.

dsaltares commented 8 years ago

Thanks for the feedback guys.

There should probably be a way to load multiple entities at once, without spawning a new engine. Full world engine or single entity feels a bit limiting.

You can take a JSON array and call json.readObject() passing Entity.class, which will make the EntitySerializer process them. What kind of API do you suggest to make it easier?

What about persisting entity references in component fields

Maybe have custom component Json.Serializer<T> implementations for that? Users can implement their own entity id system; components should store entity ids rather than references. I think that's the standard practice pretty much everywhere.

junkdog commented 8 years ago

I think many projects gravitate towards more complicated multi-entity compositions. (Do we have a term for these? I usually call an entity spanning multiple entities a "conceptual entity", but it's not the best term.) In those situations, loading only a single entity at a time may require a lot of manual wiring and extra overhead.

I think that's the standard practice pretty much everywhere.

We resolve entity references automatically. This helps greatly when dealing with conceptual entities spanning multiple entities. If the serializer ensures entity relationship integrity, one can build pretty sophisticated entity templates with ease.

Users can implement their own entity id system

Couldn't it be internal to the main serializer - if only allowing plain Entity, Bag<Entity> and Array<Entity>? The serializer would assign each entity an internal id (maybe re-use the index, only referenced in the json result), and users have something which works without extra configuration.
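A minimal sketch of that internal-id approach in plain Java, assuming the index in the saved list serves as the id. `LinkedEntity`, `LinkedEntityRecord` and `ReferenceCodec` are hypothetical names; only a single reference field is handled, and collections such as Bag<Entity> or Array<Entity> would need the same treatment per element.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Stand-in entity with one reference field, like InheritScale.target.
class LinkedEntity {
    String name;
    LinkedEntity target; // may be null
    LinkedEntity(String name) { this.name = name; }
}

// Flat record written to disk: the reference becomes an index.
class LinkedEntityRecord {
    String name;
    int targetId; // index into the saved list, or -1 for "no target"
}

final class ReferenceCodec {
    static List<LinkedEntityRecord> save(List<LinkedEntity> entities) {
        // Internal ids are simply positions in the list being saved.
        Map<LinkedEntity, Integer> ids = new IdentityHashMap<>();
        for (int i = 0; i < entities.size(); i++) ids.put(entities.get(i), i);

        List<LinkedEntityRecord> out = new ArrayList<>();
        for (LinkedEntity e : entities) {
            LinkedEntityRecord r = new LinkedEntityRecord();
            r.name = e.name;
            Integer id = e.target == null ? null : ids.get(e.target);
            r.targetId = id == null ? -1 : id; // targets outside the set are dropped
            out.add(r);
        }
        return out;
    }

    static List<LinkedEntity> load(List<LinkedEntityRecord> records) {
        // Pass 1: create every entity so that any id can be resolved.
        List<LinkedEntity> entities = new ArrayList<>();
        for (LinkedEntityRecord r : records) entities.add(new LinkedEntity(r.name));
        // Pass 2: turn ids back into object references.
        for (int i = 0; i < records.size(); i++) {
            int id = records.get(i).targetId;
            if (id >= 0) entities.get(i).target = entities.get(id);
        }
        return entities;
    }
}
```

The two-pass load is what lets forward references and cycles work: every entity exists before any reference is wired up.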

mattbl commented 8 years ago

Will there be any further improvement on this branch? That would be such an interesting feature to have in master!

dsaltares commented 8 years ago

I know! I'm finding it hard to find the time. I would really appreciate some help on this.

pererikbergman commented 7 years ago

I'm a bit late to this discussion and haven't read everything here, but I would not serialize the entity system. I would have a model with the whole structure that I can serialize, and when building the Engine I would pass the objects from this model to the entities/systems, so that all updates change the objects in the model... works great :)

Lusito commented 7 years ago

Please don't use libgdx's json or xml libraries. There are faster and better-maintained libraries out there doing a better job. Aside from that, the biggest problem with libgdx's approach is that it stores the fully qualified class name of each class in the json/xml, which bloats the file size and also makes it really difficult to refactor code.

mgsx-dev commented 7 years ago

@Lusito libgdx's json/xml libraries allow you to customize the classTag. IMO these libraries are well maintained. And I didn't notice performance issues serializing hundreds of entities, unless you have some benchmarks about it... Which alternatives would you suggest?

Lusito commented 7 years ago

@mgsx-dev It's a very hidden option and it doesn't seem to be the recommended way looking at the tutorials. Even the libgdx 3D Particle Editor "Flame" does not make use of the classTag feature.

And looking at this file, I know enough about libgdx's way of conforming to standards: https://github.com/libgdx/libgdx/blob/master/extensions/gdx-tools/assets/uiskin.json It's red all over the place, because it's not valid json.

For json, gson is a good start, and for xml, xstream is a nice choice. They have proven time and again to be fast (especially with big files), stable and conforming to standards.

mgsx-dev commented 7 years ago

@Lusito To be honest, I was thinking exactly the same as you a while ago :D but this weird json format has turned out to be very handy, actually. It's just a question of taste, and you can use either the strict or the simple format.

Anyway, IMO the question here is not which serializer to use (that could be abstracted by the API, letting client code make its own choice); it is more about subjects like "how to serialize cross-references between entities", "how to serialize systems", "which entities to save", "which components to save"... I'm working on this in one of my projects, and the technical serialization is maybe the least important issue I've had so far.
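To illustrate what abstracting the serializer behind the API could mean, here is a small plain-Java sketch. `SnapshotFormat`, `KeyValueLinesFormat` and `SnapshotStore` are hypothetical names; the key=value backend is a toy that breaks on values containing newlines, and a real backend would wrap libgdx Json, gson or similar behind the same interface.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical seam between the entity-saving logic and the text format.
interface SnapshotFormat {
    String encode(Map<String, String> fields);
    Map<String, String> decode(String text);
}

// Toy backend: one key=value pair per line, keys sorted for stable output.
class KeyValueLinesFormat implements SnapshotFormat {
    @Override
    public String encode(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : new TreeMap<>(fields).entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    @Override
    public Map<String, String> decode(String text) {
        Map<String, String> fields = new HashMap<>();
        for (String line : text.split("\n")) {
            int eq = line.indexOf('=');
            if (eq > 0) fields.put(line.substring(0, eq), line.substring(eq + 1));
        }
        return fields;
    }
}

// Save/load logic depends only on the interface, so the backend is swappable.
final class SnapshotStore {
    private final SnapshotFormat format;
    SnapshotStore(SnapshotFormat format) { this.format = format; }
    String save(Map<String, String> componentFields) { return format.encode(componentFields); }
    Map<String, String> load(String text) { return format.decode(text); }
}
```

The harder questions listed above (cross-references, system state, which entities and components to save) would live in the code that builds the field maps, independent of the chosen backend.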

libgdx / ashley