Discussion: Sparse mapping of Entities to Components

stjahns commented 9 years ago

I like the way this is looking so far! But one thing I'm not totally sure about is having Option pointers on every Entity to every Component type, as I'm not sure how wasteful that could end up being when you have large numbers of entities with tons of different components.

I think a way of having a sparse mapping of entities to their components could be nice. I took a quick at https://github.com/reem/rust-typemap, it seems like it should be possible be used to define Entity as a TypeMap, where component ids could be added with entity.insert::<Id<TransformComponent>>(id) and accessed with entity.find::<Id<TransformComponent>>().unwrap().

There'd be some costs to storing and using the underlying HashMap of TypeIds to the id values, and there's 'unsafe' code under the hood to cast to the proper value type, but it looks like it's wrapped pretty safely. Not sure if this could be worth it or not in the long run, just some food for thought :)

kvark commented 9 years ago

One thing that instantly makes me worry about TypeMap is that it uses heap allocation. My current entities are simple POD structs and are straightforward to work with.

I can think of other ways of reducing the wasted space (like carrying a simple u32 bit mask and transforming it to Options on access) if that's a concern. It's not obvious to me that this is the case though, so I'd prefer to see the actual numbers/statistics first.

Do you know any stats are publicly available? Avg number of components, total components, avg single component size, avg entity size, etc...

stjahns commented 9 years ago

I don't really know any stats, it was just a premature gut feeling I figured I might as well mention. The current design keeps things simple and convenient, I like how you can use pattern matching to select entities with a specific set of components :D

mtsr commented 9 years ago

In a presentation (can't find it right now) by some AAA dev about their migration to an ECS he mentioned using upwards of 150 component types. This happens because once you start using components for game logic, pretty much everything becomes a component. Stuff like CanFly, CanJump, IsAlive, etc etc etc, because you can very quickly find those.

Another way to solve the problem @stjahns mentions is to turn it around. Instead of having Entities be a bag of indexes into the component arrays, have the entity be a single index that is used for each of the component arrays. That's actually why I proposed enabling different types for the array, because some components might be really sparse and be better off as a HashMap and others dense and thus a Vec.

In my experience you may still want one optimisation, but it is a trade-off, which is keeping a BitVec of the assigned components of each Entity. Pretty costly at 150bits per Entity, but it can make filtering Entities by Components quite a bit faster (in some cases, of course).

kvark commented 9 years ago

@mtsr I did consider this approach but didn't find it appealing for the following reason. If an entity has a single ID that indexes components directly, you'd have to choose between one of these:

allocate all components for each entity. Not only this is a bigger waste than Option, but also may reduce cache efficiency due to the unused gaps in cache lines.
turn the component arrays into hash maps. This ruins the continuous data access pattern, thus reducing memory read efficiency.

Do you know how that AAA dev addressed any of these issues?

mtsr commented 9 years ago

@kvark Yeah, that's always the tradeoff. I don't recall what that presentation said.

The way I'd do it is have Vec's for dense components and HashMaps (or VecMaps) for sparse components (and maybe even other data structures such as BTrees depending on benchmarks of the game or whatever). That way you keep cache efficiency where it matters and still save memory where it doesn't.

Note that using your bag-of-components you can still easily mess up cache coherency: If you create components in a different order than the Entities (and you will end up there if you can remove and add components to existing entities) and then iterate over the Entities and lookup the Components for each. This is likely worse than (small) gaps since you're jumping all over the place. Iterating over the Components in that case could solve it, but that won't work in all cases.

HashMap lookups are pretty bad, agreed, but you might be able to iterate over the HashMap instead of doing the lookups, as above. Also we can at least reduce the Hashing to a noop, since we can just cast our indices to u64.

NB Option is the same size as it's type parameter + at least 1 bit (+ alignment), unless the type is a reference type, in which case it uses the nullptr to indicate None. So also quite big if we're talking 1MM entities and 150 components, or whatever.

kvark commented 9 years ago

@mtsr thanks for clarifications, I see your point!

Having an Entity = id and supporting semi-abstract component storage types (Vec vs HashMap) seems to be an option for an entirely separate ECS. Perhaps there is an ECS like that already (@HeroesGrave, @TomBebbington)? If not, we could work on it as a sub-crate of this project.

To my defense, Option can be rewritten to just use a single bit in a mask per entity thus saving 4 bytes per component per entity. Also, while you can screw up the cache by randomly re-assigning components to entities in the middle of the game, it sounds a bit artificial. But of course, if we can enforce proper cache behavior even for bad users - I'm all for it ;)

HeroesGrave commented 9 years ago

@kvark: Yep. That's pretty much what is done in ecs-rs

I think it's the same story in eccles, but that hasn't been updated since 12 Feb.

kvark commented 9 years ago

@HeroesGrave hah, I see the cold/hot distinction in the code, nice!

kvark / simplecs

Discussion: Sparse mapping of Entities to Components #8