Controversial Design Discussion

gjroelofs commented 9 years ago

Hic Sunt Dracones

After going through core a couple of times over the past few days I have some (very) controversial questions. I wanted to add this issue to contain the discussion and to make sure that any comments on my part are not misinterpreted. (I love Artemis and all the effort put into it, so I hope this issue won't be interpreted as anything else than an attempt to gain insight and open discussion.)

Also, I could of course just be plain wrong. :-) (Especially with regards to the Unbearable Lightness of Entity, which lies at the heart of 95% of the questions)

Flyweight & Entity ID propagation

The almost required instantiation of Flyweight in BaseSystem / EntityManager, TagManager, GroupManager, etc to me feels like an optimization pass leaking into the API. It complicates the lifecycle, and adds added setup to use proper API. Providing abstractions over these to accomodate for this requirement will in turn

I want to propose moving back to an Entity based API for add/remove etc in the SubscriptionListener / EntityObserver. The flyweight creation can then be moved to the original caller location and its lifecycle properly contained.

Essentially: I don't fully understand the move to operations on EntityID ints as opposed to Entity instances. To me the API would become a lot cleaner, and memory optimizations (ref vs int; cache hitting; array copy performance) are negated by the cpu overhead introduced (lookup, transformation) and the convolution of API.

EntityEdit

As far as I understand it now, the purpose of EntityEdit is to maintain a composition(id) and provide batch operations as to combine operations that alter the compositions id in an effort to reduce changes to the composition id.

It makes normal (un-edited) operations on Entities slower by the underlying bureaucracy and added abstraction.
The memory benefit of maintaining a separate EnityEdit Pool which is lazily tied to an Entity and re-used is negated for many use-cases that actually require the memory benefit.
E.g.: I think that in most worlds with many entities (1.000.000+), most entities are of a similar family (e.g.: particles) and would most likely all need to be updated/synced on changes within a single process to a system. This in turn queues/instantiates the number of EntityEdits of the number of entities that are changed. Even worse; if the system is truly memory constrained a run-time operation which happens to trigger to many update operations could put the system over the top. The use-cases where we are not dealing with many entities will not benefit that much from abstracting over the memory optimization.

Inlining the EntityEdit primary variables into Entity (hasBeenAddedToWorld, componentBits, scheduledDeletion) would have about a 20MB performance impact on a system with 1.000.000 entities. (Which doesn't include the gained memory benefits of removing the pooling and class)

World Bitset changed/added

I assume that the benefit of this are in terms of memory. However, operations such as cardinality() and the requirement to lookup all indices actually set have a CPU performance impact on the core process code.

DaanVanYperen commented 9 years ago

Flyweight & Entity ID propagation

We've always opted for performance over simplicity, some of which could use a bit of selective retcon imo. Agree on usability issues due to the convolution.

I'd propose to at least measure the assumed negated benefit. If we keep the flyweights i'd like to see it pushed back into its innards, manage flyweights for the users for simplicity sake.

EntityEdit

1m entities won't perform on memory constrained platforms anyway. You'd black box that in a specialized system. Typical use case on those (android,gwt,ios) platforms is a couple hundred to thousands of entities.

Large volume composition changes are performed with EntityTransmuter, which I believe circumvents edit completely? not certain.

Assumption here is the indirection and footprint cost is significant. Haven't looked at code recently, but would assume we have 1 flyweight per system, and only as many EntityEdit as we have peak operations in a process. Smaller footprint than you propose? Typically entities change composition infrequently compared to state. (I guess, if the dev doesn't take care to spread out particle creation, could lead to 1m EntityEdit instances).

Beyond that I seem to remember something about a locality benefit of a small entity. @junkdog?

World Bitset

Got to measure if it is significant or not.

junkdog commented 9 years ago

I'll chime in after work or tomorrow. (Or after lunch).

gjroelofs commented 9 years ago

EntityEdit

Apologies, as it was a bit unclear, the showcase of 1M entities was to show that the memory performance impact is minimal even at ridiculous numbers. 20Mb in an extreme use case, even on Android (with the extension of A3.0 of allowing an app to request more Heap), is little to nothing nowadays. In any normal use case, the memory impact of the optimization is negligible compared to the overhead introduced by the Components.

If from a footprint perspective the cost can be discarded, the inlining will give some (even minimal) increased computation performance and improve the API then I see no arguments against inlining the optimizations that EntityEdit provides.

World Bitset

Again, the same argument. If memory is no large concern / the impact can be discarded, it increases (even minimal) performance, and improves the API, then there is little reason to keep it in.

Essentially, it boils down to: If the performance is same, or minimally different (arguably better even), then the preference should be given to API ease of use/cleanliness? But maybe even more importantly: I'm not yet seeing the performance improvements addressed by these implementations, maybe someone could answer these?

junkdog commented 9 years ago

quick about flyweights

It's mostly about attempting to reduce cache misses: having all the ids contiguous and more compact allows up to 16 entity ids per cache line, while Bag<Entity> gives us half the number of references per read with an indirection to Entity (24 bytes: header 12, world 8, int 4). Even if they're sequentially laid out, that is less than 3 entities per cache line (type alignment puts the int at the end of Entity too).

I agree that it's something smelly though.

junkdog commented 9 years ago

I'll chime in after work or tomorrow. (Or after lunch).

I feel tricked.

DaanVanYperen commented 9 years ago

Apologies, as it was a bit unclear, the showcase of 1M entities was to show that the memory performance impact is minimal even at ridiculous numbers.

Re-reading it I see what you meant.

Essentially, it boils down to: If the performance is same, or minimally different (arguably better even), then the preference should be given to API ease of use/cleanliness?

I agree. That's the path we prefer in most cases, Sensibly benchmarking specific features is non trivial however, and integrated benchmarks do not tell us much. Also, some of these changes have been WIP for a couple of versions, working towards a final version (and payoff).

I'm all for trying out some things, and improving things as part of 1.0!

But maybe even more importantly: I'm not yet seeing the performance improvements addressed by these implementations, maybe someone could answer these?

But the bars. on the graph. think of the bars!

junkdog commented 9 years ago

Flyweight & Entity ID propagation

I want to propose moving back to an Entity based API for add/remove etc in the SubscriptionListener / EntityObserver.

These are easy to roll yourself though, when I need convenience, I use:

public abstract class SubscriptionAdapter implements EntitySubscription.SubscriptionListener {
    private final World world;

    public SubscriptionAdapter(World world) {
        this.world = world;
    }

    @Override
    public final void inserted(IntBag entities) {
        int[] ids = entities.getData();
        for (int i = 0; i < entities.size(); i++)
            inserted(world.getEntity(ids[i]));
    }

    protected abstract void inserted(Entity e);

    @Override
    public final void removed(IntBag entities) {
        int[] ids = entities.getData();
        for (int i = 0; i < entities.size(); i++)
            removed(world.getEntity(ids[i]));
    }

    protected void removed(Entity e) {}
}

I'd propose to at least measure the assumed negated benefit. If we keep the flyweights i'd like to see it pushed back into its innards, manage flyweights for the users for simplicity sake.

0.9.0 vs 0.8.0, basically - but 0.9.0 added some extra fluff too, iirc. 0.9.0 was the first time the iteration benchmarks scaled linearly with entityCount.

If I rewrote an ECS from scratch today, I'd go with Adam Martin's plain int:s for id:s, even if java doesn't really have the language level support for expressing it in a type-safe manner. They're just too central to the ECS architecture to waste cycles on (if you're going for raw performance/keeping the ECS's active footprint-per-system as small as possible). Ironically, the flyweight approach is more error prone than plain int:s, though I hope artemis-odb-debug remeidies that somewhat.

EntityEdit

Aside from being re-used, EntityEdit isn't about conserving memory, but - as you also point out - to batch entity operations. They also ensure that only one event of one type is propagated (until recently, it was possible to trigger multiple events per state tick); housekeeping must happen somewhere.

Additionally - batching minimizes changes to compositionId, and therefore don't risk a trigger-short existence in EntitySubscriptions where they don't belong.

(I guess, if the dev doesn't take care to spread out particle creation, could lead to 1m EntityEdit instances).

No, EntityEdit instances would be re-used (EntityEdit#obtainEditor)

It makes normal (un-edited) operations on Entities slower by the underlying bureaucracy and added abstraction.

Only marginally - the overall net effect is however much faster + solves housekeeping w.r.t. added/changed/deleted.

Inlining the EntityEdit primary variables into Entity

This would thrash the caches: part of making the iteration benchmarks faster has been reducing the runtime size of Entity. In user code, this means caches aren't evicted as early as they used to be - active footprint per system can be somewhat larger without taking a noticeable hit.

World Bitset changed/added

I assume that the benefit of this are in terms of memory.

~~Performance, primarily~~ - I was going to write, we're apparently not doing any lookups into these anymore. They could favorably be replaced with IntBags.

junkdog commented 9 years ago

World Bitset changed/added

not doing any lookups...

Since we sometimes set the bits multiple times (multiple transmuters or entity edit + transmuter cause this to happen): the alternative is to filter duplicates or live with multiple events per entity.

gjroelofs commented 9 years ago

@junkdog Thanks for doing the write up, the patience in the discussion and further more even integrating some of the comments already into core. ( 704d4a2fec28fc88afbc4c65b57b3d9e5add6184 )

I'll go through the code a couple times more to better understand the implications.

Entity int-as-base

I would be interested in seeing how much of a performance increase this lead to. My assumption would be that the cache misses introduced by the Components would far outweigh the cache misses introduced by the Entity IDs. (Along with the added bureaucracy which indirectly maintains the state normally residing in Entity) Not to mention the calculation done in the actual systems to far outweigh the overhead of Artemis.

Is it possible to see the performance graphs at specific points where these features were introduced?

World Bitset changed/added

It would be interesting to maintain the Bitset for the changed duplicate check and thus avoiding multiple calls, and maintain an IntBag for the lookup. (We ourselves solve this in the Entity class itself) The overhead of the nextInt() in Bitsets required to iterate over the entities is going to be noticeable over a set of a bit & add to a bag.

Aside

If you ever program in Kotlin / Xtend it'll be interesting to look at extension methods on int:s that provide the Entity operations. Both languages allow inlining of functions which could considerably improve performance on platforms without proper JIT. (Even more so when we introduce the inheritance in the DSL)

junkdog commented 9 years ago

I would be interested in seeing how much of a performance increase this lead to.

It's primarily about keeping the footprint small. As with vs heap-splattered components... that's what @PackedWeaver was supposed to address, but perf gains are eaten up by polymorphic dispatch on PackedComponent#forEntity and virtual field access.

Pulling numbers out of my ass; I guess components probably eat up 3-5x more of the cake. Components are harder to fix, but overall - I want artemis to contribute as little as possible to overall overhead. PackedComponents are one that list too.

Is it possible to see the performance graphs at specific points where these features were introduced?

You'd need to cross-reference existing benchmarks with the changelog. Each individual feature is benchmarked during dev, but not always.

I haven't run the benchmarks against JMH's recent-ish cache miss reporter, but leading up to 0.9.0, iteration benchmarks did not scale linearly with entity counts. I infer that is because we were hitting main RAM a lot more in earlier versions (same can be said about the other frameworks too).

Along with the added bureaucracy which indirectly maintains the state normally residing in Entity

We seldom need to access this state though - only during changes to tne entity's lifecycle or composition changes. Normal entity processing would suffer.

It would be interesting to maintain the Bitset for the changed duplicate check and thus avoiding multiple calls, and maintain an IntBag for the lookup.

By-value lookup doesn't scale with higher entity coubts. In the benchmarks, that's basically what the "add/remove component" benchmarks are showing. Or what are you suggesting?

Ahhh... cool. I think I just got it. I'll try it. brb.

junkdog commented 9 years ago

Ahhh... cool. I think I just got it. I'll try it. brb.

It'll take a bit of refactoring: we'd lose sequential id:s on inserted/removed (it'll be transmuted followed by edited). I wonder how much of an effect it'll have though, we're talking about at most 3 bitsets to 3 intbags per tick.

If I'm understanding you correctly:

replace World's bitsets with 3 IntBags (this could be improved further, but it requires refactoring EntitySubscriptions)
keep a bitset for changed ids.

Additionallly, the impact of lookup will be problematic.

I'll run some benchmarks later (not today). Whatever the case, this is something that can be changed without affecting public API.

junkdog commented 9 years ago

Entity int-as-base

Hmm, evil experiments and int-as-base sparked a thought and a couple of commits:

IteratingSystem - like EntityProcessingSystem - but #process(Entity) replaced by #process(int entityId)
- The same could be done for the other standard systems too.
- Ashley's naming conventions bests artemis'.
internally pass around more int:s and less Entity instances (public API unchanged, save for a couple of int-overloaded methods).
added world.edit(entityId)

Taking it a step further:

All existing Entity systems iterate over Bag<Entity> instead of IntBag
- It's up to the systems to resolve the Entities when the subscription is updated
- Thereby removing the need for flyweight instances
- With flyweights gone, artemis-odb-debug can be scrapped, and escaping flyweights are a thing of the past, while also removing some internal checks.
new systems should be weave-fitted with "static polymorphism", at least IteratingSystem.

This way, Entities still exist - via the old systems - but int-based systems are provided alongside. Existing Entitiy*Systems could be migrated to another artifact or continue living alongside the new systems. EntitySystem could also have their API retrofitted with insert(Entity)/removed(Entitiy) again (breaking change, but small-ish - we're taking a few hits with 1.0.0 anyway).

Entity*Systems would take a small perf hit, but act more reliably w/o the flyweight. New int-based systems would perform - hopefully - slightly faster than existing ones, and work with the more "raw/natural" data directly.

Thoughts?

junkdog commented 9 years ago

If you ever program in Kotlin

If there's a a way around their silly bitwise operators - or lack thereof...

Namek commented 9 years ago

Entity int-as-base

@junkdog What about storing a reference to Entity? Would it make any sense?

The only convenience (I remember) of Entity over int:s is that I don't need a reference to World or any ComponentMapper to get entity's components outside of systems and managers.

junkdog commented 9 years ago

What about storing a reference to Entity?

Where/what context?

The only convenience (I remember) of Entity over int:s

We'd provide both ways: old-style with no flyweights, and new-style: pure ints. old-style takes a slight perf hit, new style would be faster.

snorrees commented 9 years ago

Thoughts?

I really like this approach. This means the Entity-class could be opt-in, and stuff like DSL-entity could be plug-in based, by providing DSL-entity-based superclass-systems.

junkdog commented 9 years ago

This means the Entity-class could be opt-in

Yeah, Entity would be an optional level of abstraction, basically.

snorrees commented 9 years ago

What about storing a reference to Entity? Would it make any sense?

I would imagine you still have to manage entity-references like now, replacing Entity with int, maintain some sort of wrapper with id + uuid/tag, or manage using an IdentityManager.

snorrees commented 9 years ago

@gjroelofs I will be interesting to hear how entity-as-int fits into your vision, which uses Entity to encapsulate a lot of convenience.

In my mind, this could remove the need to resolve entity using classpath ticks, as long as the the plugin provides systems which uses DSLEntity.

DaanVanYperen commented 9 years ago

@junkdog best of both worlds. Artemis purists get what they want, we get performance and ints in core, no more int/entity mingling, and no more flyweights!

gjroelofs commented 9 years ago

To address the DSL question; essentially the API can be split up in Accessors and Mutators (add/remove). The simplest approach I can think of would then be to generate a class containing static functions for Components given an id.

import static com.artemis.EntityDSL;

position(id)
addPosition(id, 0, 0)
removePosition(id)

The primary downside of this being that you can't abstract over this without an instance, in which case you end up with syntax similar to ComponentMapper. I.e.: it's not possible to provide a baseline API that allows to work on multiple Worlds without factoring out ComponentType, which requires determining the Component Family IDs at compile time. This is do-able, but requires other unwanted changes.)

Component Type

Not too serious, just a thought. We might then just as well address the next performance elephant in the room with ComponentType. If we know the number of Components per World at compile time, then we can generate the Component Family IDs prehand.

Entity-as-ID

There are several problems/inconvenience we now introduce with regards to using entities outside System environments.

DaanVanYperen commented 9 years ago

There are several problems/inconvenience we now introduce with regards to using entities outside System environments.

In exchange for the increase in API clarity, and options to fall back to entity, core int really isn't going to be a significant worry.

gjroelofs commented 9 years ago

Ah, don't get me wrong. I'm not against the idea on the premise we allow the convenient approach as well in the API.

However, we can still be critical and acknowledge the downsides. The Entity-as-ID introduces a (small) overhead for most of the users of Artemis who will want the Entity usage. (If we introduce the EntityDSL, we stimulate people to use Entity as a base.)

There is also the question of how much of the optimization choices made (EntityEdit / Transmute / etc) facilitating Entity-as-id, which hinge upon the understanding that mutations of entities don't occur as much as iteration does, are true in an environment where we want to more embrace the ECS principle. (Removal enabled/disabled, faux-immutable components which we can react to with replaceComponent, etc).

Namek commented 9 years ago

Entity-as-ID

@junkdog I meant saving a reference to another entity, e.g. saving Entity in some component. Is it safe in your concept? Just making sure, because it sounds too great to have both pure int:s and sometimes Entity.

junkdog commented 9 years ago

I would imagine you still have to manage entity-references like now, replacing Entity with int, maintain some sort of wrapper with id + uuid/tag, or manage using an IdentityManager.

Yes, but using Entity vs int doesn't exclude the other option. You can mix freely.

UuidEntityManager should be scrapped (maybe we should move it to contrib?), identifying entities by UUID:s is too slow to be practical. A normal int or long should suffice.

Existing managers should accepts entityId too. Not sure about PlayerManager and TeamManager, but TagManager and GroupManager should def work with int:s.

We might then just as well address the next performance elephant in the room with ComponentType.

It's a map, so I'm inclined to like removing it, but many cases already bypass it: component mappers are pre-resolved, archetypes and transmuters skip this step too.

But, ofc, if you want to issue the request every tick (by, for example, issueing create/remove component very-very frequently), the performance gains may amount to something. I'd say the most straigh-forward approach would be to open EntityEdit#create(ComponentType), and then at compile-time:

generate a class which holds references to each ComponentType.
redirect all explicit #create(Class<Component>) to #create(ComponentType), via our generated class - same goes for #remove.

Benefit of this approach is that it woudln't stop working when weaving is disabled. It probably doesn't catch all use-cases though; anything involving reflection wouldn't work. Not sure how big of a deal that is.

Otoh, Archetypes, ComponentMapper and EntityTransmuter already bypass ComponentType - why don't they fit?

In any event, it'd be interesting to know what kind performance benefits come with removing the need for the ComponentType map.

There is also the question of how much of the optimization choices made (EntityEdit / Transmute / etc) facilitating Entity-as-id, which hinge upon the understanding that mutations of entities don't occur as much as iteration does

replaceComponent wouldn't change entity composition - it remains the same, even if the instance is replaced ("replaced" here is conceptual, the two instances would stay the same).

saving Entity in some component. Is it safe in your concept?

Yeah, same as now: @EntityId int, Entity, Bag<Entity> and @EntityId IntBag will continue functioning like before.

gjroelofs commented 9 years ago

replaceComponent wouldn't change entity composition - it remains the same, even if the instance is replaced ("replaced" here is conceptual, the two instances would stay the same).

Replace Component semantics could easily be implemented in a way to circumvent this. @DaanVanYperen suggested going the removeComponent / addComponent route, which would lead to these issues (if not implemented as a shortcircuit avoiding the changes).

The main concern is where we move to a ECS design which stimulates add/removing a lot, such as Disabled in https://github.com/junkdog/artemis-odb/issues/371, and designs in which the replaceComponent tactic will be used a lot (Events through components).

It's not that I'm advocating against the changes, but I felt it needed to be acknowledged.

Otoh, Archetypes, ComponentMapper and EntityTransmuter already bypass ComponentType - why don't they fit?

To my understanding we are discussing how to maybe merge these things to simplify the API. :-)

I'd say the most straigh-forward approach would be to open EntityEdit#create(ComponentType), and then at compile-time...

Yeah, this is the approach the DSL generator now takes.

(I might run a test somewhere where I integrate Entity Edit into Entity and see what the performance is along with the EntityDSL; after the entity-as-id refactorization.)

junkdog commented 9 years ago

Replace Component semantics could easily be implemented in a way to circumvent this.

But they probably shouldn't - users are sure to tuck away objects from components into their own data structures and state holders, even if it's discouraged. This is essentially as error-prone as flyweights, i think.

The main concern is where we move to a ECS design which stimulates add/removing a lot

Always room for improvement, but if there's any performance niche artemis fills compared to other ECS:s, it's fast composition changes.

All the existing bureaucracy essentially boils down to:

working within the confines of the old, vanilla code (for the most part, it hasn't been too much pain - other ECS:s have a worse start)
ensuring correctness - this is where most of the checks and resolutions occur.

To my understanding we are discussing how to maybe merge these things to simplify the API. :-)

Several of the performance-guided classes are mandated though, but usage is optional. It should be noted that the performance difference that comes with, say, Archetype and transmuters only increase benchmark throughput by ~25-30%.

(I might run a test somewhere where I integrate Entity Edit into Entity and see what the performance is along with the EntityDSL; after the entity-as-id refactorization.)

Do it!

DaanVanYperen commented 9 years ago

@DaanVanYperen suggested going the removeComponent / addComponent route

I feel quoted out of context!.

The main concern is where we move to a ECS design which stimulates add/removing a lot, such as Disabled in #371, and designs in which the replaceComponent tactic will be used a lot (Events through components).

We already promote generous adding/removing, since all the mechanisms provided by odb make it very cheap. Extreme cases (Events through components, Particles, etc) will always need extra developer care. In practice iteration will still far outweigh composition changes.

To my understanding we are discussing how to maybe merge these things to simplify the API. :-)

We're dancing in a circle of performance assumptions and varied tolerance for (non) idiomatic solutions without setting out use cases. Best if we agree upon some goals and constraints and work from there.

(I might run a test somewhere where I integrate Entity Edit into Entity and see what the performance is along with the EntityDSL; after the entity-as-id refactorization.)

Bonus points if you benchmark the platforms where cycles and hardware will show the most impact. PR into https://github.com/junkdog/entity-system-benchmarks. ;)

DaanVanYperen commented 8 years ago

Closing for grooming. Think we've implemented all agreed upon actionables in 1.0 from this discussion. If you feel actionable issues/requests remain, feel free to re-open as new tickets @gjroelofs @junkdog @Namek!

junkdog / artemis-odb