discussion - empty components should not be part of the each callback

pgruenbacher commented 5 years ago

skypjack was already refactoring the empty instance pools -> https://github.com/skypjack/entt/commit/e228cb66488fc6ae98365ce2f5f8db45834a92c3#diff-36792e38ee0671f0a02c94d88a91cfbb idk if that is gonna have any impact on this issue...

but basically something like

reg.group<Component, EmptyTag>().each([](auto id, auto comp) {});
reg.group<Component, EmptyTag>().each([](auto comp) {});

instead of

reg.group<Component, EmptyTag>().each([](auto id, auto comp, auto tag) {});
reg.group<Component, EmptyTag>().each([](auto comp, auto tag) {});

since there's no point in having empty instance in the arguments.. This is a api breaking change though so it'd have to be part of a major release cause I'm sure everyone like me has just been including the tags in their each callbacks.

pgruenbacher commented 5 years ago

it wasn't really an issue for me until I started trying to refactor my state machines to use lots of empty tags for organizing agent states to see if there's major benefit with filtering and organization.

skypjack commented 5 years ago

Honestly, I've been thinking to this every day during the last week and I'm not sure yet what would be the best approach in terms of consistency.
If we didn't return them during an each, we shouldn't return temporary objects after an assign nor allow to get them or to receive temporary instances from a signal. However, eg with signal it's useful to receive an useless instance if you want to define templated callbacks that make their reasoning on a type basis.
Therefore I'm still in doubts. Let's see what others say about it. :+1:

joaosausen commented 5 years ago

Isn't it what entt:tag is for?

pgruenbacher commented 5 years ago

entt::tag is an empty component yes, but it's the same issue. entt::tag is just a quick way of declaring an empty component.

skypjack commented 5 years ago

I'm trying to figure out if there exists the possibility to offer a three-ways or four-ways each so as not to break existing code. Moreover, it would be good for consistency with other functionalities (eg signals or the creation with components).
Something like what I did when I've introduced the support for the entity-less version of each.
It doesn't seem much complicated, but I've to try and I'm out for a conference right now. I'll let you know.

indianakernick commented 5 years ago

@skypjack Consider this, if you have a view<EmptyComp, NonEmptyComp> and you pass a lambda with two auto parameters to each, the signature will match entt::entity, NonEmptyComp and EmptyComp, NonEmptyComp.

Existing code will expect EmptyComp, NonEmptyComp. If you keep using an if constexpr then you can keep existing code working but the ambiguity remains.

I'm not really sure if this is a real problem. I just thought I'd mention it.

skypjack commented 5 years ago

@Kerndog73 in this case you can induce an order and probe cases the way you want. Of course, conflicting cases exist but you can know what's going on as long as it's documented. Moreover, remember that is_invocable won't return true when the entity cannot be converted to the component and it's likely to be the case in your example, therefore you've not a conflict probably.

indianakernick commented 5 years ago

@skypjack If you use entt::entity instead of auto then there is no ambiguity and the code is a clearer as well. If you use auto then you might get an error in some rare cases. This isn't really an issue. Just something to mention in the docs.

skypjack commented 5 years ago

Indeed. I agree. I think it's worth documenting the order in which the different alternatives are probed. :+1:

skypjack commented 5 years ago

I'm experimenting with a four-ways each where the evaluation order is:

All components only
Entity + all components
Non empty components only
Entity + non empty components

To be honest, it seems a bit confusing from the point of view of the user.
I'd rather introduce a secondary function (walk?) that works exactly as each does but for the fact that it doesn't return empty components.

What do you think about?

indianakernick commented 5 years ago

@skypjack If you’re thinking about splitting it into two functions then maybe you could have have each (which handles 1 and 2 same as before) and each_nonempty (which handles 3 and 4). I’m not sure about the name each_nonempty though.

On second thought, ignoring the empty components is what you want most of the time while getting the instance is an edge case (for calling methods on it or decltype). The edge case should have the longer name so each could handle 3 and 4 while each_empty could handle 1 and 2. This is a breaking change though.

skypjack commented 5 years ago

It makes a lot of sense actually.
So, probably, the four-ways should be more like:

Non empty components only
Entity + non empty components
All components only
Entity + all components

Splitting could be good as well but everybody uses each right now. I cannot really change the name for the non-empty case without fully breaking all existing code.

indianakernick commented 5 years ago

@skypjack The new evaluation order might break some existing code (in the way that I described about 21 hours ago).

indianakernick commented 5 years ago

If one of the components is empty and the existing code asks for all components then they’ll be given the entity + non-empty components

skypjack commented 5 years ago

Indeed. Apparently the choice is between breaking things or introducing an extra function for that.

trashuj commented 5 years ago

If the signature is clean i dont see why it would be confusing. (I'm thinking about the meta static functions that does compile but requires a {}, that is confusing :)

On Sun, May 12, 2019, 13:04 Michele Caini notifications@github.com wrote:

Indeed. Apparently the choice is between breaking things or introducing an extra function for that.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/skypjack/entt/issues/243#issuecomment-491585618, or mute the thread https://github.com/notifications/unsubscribe-auth/ACF5UVQAQMIROXH2A54ML5DPU72UHANCNFSM4HMFEMLA .

pgruenbacher commented 5 years ago

I suppose we could introduce a for_each method for this new signature callback order. I'd also suggest having it expect a boolean return for early exiting?

indianakernick commented 5 years ago

+1 for early exit

indianakernick commented 5 years ago

@skypjack Is a breaking change really that bad? I mean, the compiler will show you all the systems that have problems with each and all you'd have to do is delete auto. People using the master branch should expect the occasional breaking change anyway.

skypjack commented 5 years ago

@Kerndog73 good point. You're definitely right.
About early exit, I would make it optional at least. If you don't need it (that is the common case when you iterate everything), it's just a way to waste cpu cycles and probably probe your branch predictor to an extent.

pgruenbacher commented 5 years ago

very true in making the early exit optional, it keeps the callback cleaner too instead of doing a "return false" at the end of every callback that doesn't care about early exit.

skypjack commented 5 years ago

What about the reference received by a signal?
Currently, the signature is something like void receive(registry &, entity, Component &);. The last parameter is pointless as well in case of empty types and it's probably misleading.
What do you guys think about it?

pgruenbacher commented 5 years ago

Yea agreed on no empty component callback predicated on if we can make the error messages more intuitive.

On Tue, May 14, 2019, 5:43 AM Michele Caini notifications@github.com wrote:

What about the reference received by a signal? Currently, the signature is something like void receive(registry &, entity, Component &);. The last parameter is pointless as well in case of empty types and it's probably misleading. What do you guys think about it?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/skypjack/entt/issues/243?email_source=notifications&email_token=AAZNT5UA2RMYT3PHXLKJBSLPVKCVHA5CNFSM4HMFEMLKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODVK55MA#issuecomment-492166832, or mute the thread https://github.com/notifications/unsubscribe-auth/AAZNT5TAGEDLW6VXTPQBUL3PVKCVHANCNFSM4HMFEMLA .

skypjack commented 5 years ago

Ok, I made some tests on the view and I must admit that @Kerndog73 is definitely right: it breaks everything. :smile:
Conflicting cases aren't so rare. In my own code I use auto && almost everywhere with each. What's worse is that errors are subtle, no matter what's the order of evaluation.
It seems that the best bet is to define a different function with a different name for this.

Any suggestion for the name?

pgruenbacher commented 5 years ago

For_each ?

On Wed, May 15, 2019, 6:20 PM Michele Caini notifications@github.com wrote:

Ok, I made some tests on the view and I must admit that @Kerndog73 https://github.com/Kerndog73 is definitely right: it breaks everything. 😄 Conflicting cases aren't so rare. In my own code I use auto && almost everywhere with each. What's worse is that errors are subtle, no matter what's the order of evaluation. It seems that the best bet is to define a different function with a different name for this.

Any suggestion for the name?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/skypjack/entt/issues/243?email_source=notifications&email_token=AAZNT5X3B7OEJU3U6MFBRRDPVSEEDA5CNFSM4HMFEMLKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODVQDQWA#issuecomment-492845144, or mute the thread https://github.com/notifications/unsubscribe-auth/AAZNT5Q2UKGHP373YCGLKSDPVSEEDANCNFSM4HMFEMLA .

skypjack commented 5 years ago

each and for_each? It could make sense. Naming things is really hard! :cry:

indianakernick commented 5 years ago

each_nonempty maybe? To minimise confusion and ambiguity, the new function should probably never return empty components.

skypjack commented 5 years ago

Yep, that's exactly the point. Two versions to iterate, one returns all the components, one returns only non-empty ones.

pgruenbacher commented 5 years ago

braindump: "iteritems", "handle", "each_only", "traverse", "step", "iterate", "pick_each", "pluck_each", "each_filter", "for_distinct", "for_filtered", "each_filtered" ... maybe the Europeans have some good words?

skypjack commented 5 years ago

I like traverse. Also visit could work.

pgruenbacher commented 5 years ago

for those following the conversation the new api is each for legacy each and less for non-empty types. I'm down with that since less name is succint.

skypjack commented 5 years ago

Ahahahah no, it's a placeholder name, nothing definitive. We have still to decide it. :wink:

pgruenbacher commented 5 years ago

:man_facepalming:

skypjack commented 5 years ago

I guess you don't like it much... :smile: :joy:

pgruenbacher commented 5 years ago

some more names that are short "every", "tutto", "cada"

skypjack commented 5 years ago

Ok, I've merged it on master. The name is temporary until we cut a new version (most likely v3.1). Let's see if someone pops out with a good name for this function in the meantime, I'm no particularly proud of the current one. :+1:

OvermindDL1 commented 5 years ago

I'm liking each_nonempty personally, or just a 4-way each.

skypjack commented 5 years ago

@OvermindDL1 not regarding this issue, but I get the chance because you're here (I wish GitHub had direct messages): were you the one that told me he's working on specs?

OvermindDL1 commented 5 years ago

(I wish GitHub had direct messages)

It used to, wish it didn't remove it... >.>

Always feel free to ping me though!

were you the one that told me he's working on specs?

I influenced its design from my old old old C++ ECS design (table style, concurrency support built in, etc... etc...), though I did not write the code myself (still learning rust, I'm an old C++'er). The people that wrote it exist on the Amythest Rust Engine Discord though if you want to speak with them?

skypjack commented 5 years ago

@OvermindDL1

Always feel free to ping me though!

Thank you very much! Really appreciated.

I influenced its design from my old old old C++ ECS design (table style, concurrency support built in, etc... etc...)

I'm sketching the concurrency support for EnTT. I decided to experiment with something new (as always, being EnTT an experimental project) and I think the thing that is closest to what I've in mind is what I've seen in Rust.
If you inspired it, then your the person with which I'd like to share my thoughts. Even though the two solutions aren't the same, probably you can give me some advices on the challenges I'll encounter down the road with my project.

OvermindDL1 commented 5 years ago

It's not a particularly hard setup. I had a dependency tree built out of the systems based on what components they accessed, write meant only a single system was allowed to access it at a time, read-only meant multiple readers could at once, some systems weren't allowed to be concurrent with anything else (or only specific other things as they performed internal multi-threading instead (physics for example). Some systems that 'write' to things were still only 'read-only' because they built up a set of actions to perform that were run after all concurrency was complete (think of it as the 'reduce' step in map-reduce), which was useful for resolving potentially conflicting updates (multiple entities accessing one to do conflicting things for example), though those transactional updates could be multi-threaded across each other as well once they were unified. I just used a simple workqueue to push tasks on for the multi-threading that were distributed to other thread with an atomic work-stealing if anything emptied out. The dependency tree was used by a dispatcher to push work distributed among the threads based on the weights that the systems said they needed (like if one system generally was pretty expensive it had a high weight, so it generally was pushed to a thread by itself, though work stealing could always come into play). My engine was mostly just a sandbox that I built up over 15 years, never really made anything in it out of little mini-games and so forth (I tended to make things that had an extreme amount of entities, having literally hundreds of thousands of entities was on the low-end for me and having many many millions was not uncommon, though of course few things updated every tick).

A 'table' (component storage) itself could be as simple as an array, sparse arrays, maps, octree's, anything custom, whatever. And of course systems operated over sets of tables with read-only or read-write access on each. The sets were stored in a manner similar to how it looks like groups work in ENTT's latest version (internally it was a sorted tree component types that built up the entire wanted list, and this tree was updated upon component creation/removal, this was the slowest part of my system but still significantly faster than most implementations, and fast enough that I didn't really worry about it as there were relatively few component changes per frame and it only got slower based on the systems wanting to iterate it, so if that number was low enough it was never noticed). Of the component types (tables) of my usual mini-games for the engine I averaged anywhere from a few hundred component types to a couple thousand in a few of them.

Everything was exposed to scripts as well. I used a rather unix'y style path conventions, like keyboards were available at /dev/keyboards, the first keyboard listed on a system (mostly useful on linux as I never was able to determine multiple keyboards on windows without getting admin, which is a bit stupid for a game) was at /dev/keyboards/0, and if you wanted to get, say, when the a button was pushed down you could access that via /dev/keyboards/0/keys/a/down, and on any node in that 'path' you could access properties and information about it, including binding multiple points together, so you could essentially 'symlink' /dev/keyboards/0/keys/a/down to something like /players/0/action/forward along with a transformer (so a 'down' gives a value of 0.0 or 1.0 (every input was a float, buttons where just 0.0 or 1.0 for example) and so would do that with a players 'forward', but you could add a transformer that just took the values and *0.5 with it for example, and you could add another binding to, say, the shift key that multiplied that by 2 or so and did a final combination to the forward). Looking something up along a path was not always the fastest thing (each node could have different performance characteristics, though plain nodes where just simple hashed maps), but you could cache that node NodePtr node = ctx->root["/players/0/action/forward"]; and treat it like any other lookup, that line is equivilant to NodePtr node = ctx->root["/players"]["/0"]["/action"]["/forward"]; or any combination there'of. Even entities were exposed at /entities and component storage at /tables and the current game world at /game/world/current and so forth. Direct access to a node if you knew the type was perfectly efficient, going through the NodePtr indirection was slightly slower but not too bad, and it is how the scripting system interacted with it (I made heavy use of my atom types that I linked in this repo elsewhere for that).

Amethysts ECS is slightly faster than mine on average (mine is faster in some cases, spec's is faster in other, due to different optimizations). ENTT was pretty well faster than mine in general as well (why do you think I'm around ;-) ), which was the first C++ ECS library I've ever ran across that was. :-)

But back to the concurrency/parallelism, I'd definitely look at how amethyst does it, the rust model models it very well and I even ended up fixing a few of my bugs that rust's borrow checker caught. They are in the midst of a slight redesign of some of the ECS stuff based on their experiences of how it's worked so far.

skypjack commented 5 years ago

Wow. Thank you very much for taking your time to write such a detailed reply!
Really, really, really interesting.

I like your approach for input handling, I've never seen something like that before.
Personally, I set up a model where entities announce their interest in specific inputs by means of components to which intents are attached. An intent system looks for matches and triggers these opaque objects, the intents. Someone else, somewhere else, if any, will do something as a consequence.
It's a sort of publisher/subscribers made with components.

I'll look at the implementation offered by amethyst to see if it's any way close to what I've in mind.
Out of curiosity, how did you construct the dependency tree? Just exploring it every time something was added to find the hook point or was it more clever than this?

OvermindDL1 commented 5 years ago

I like your approach for input handling, I've never seen something like that before. Personally, I set up a model where entities announce their interest in specific inputs by means of components to which intents are attached. An intent system looks for matches and triggers these opaque objects, the intents. Someone else, somewhere else, if any, will do something as a consequence. It's a sort of publisher/subscribers made with components.

That's actually similar to mine. I had a system for components so, say, an InputImpulse component could be attached to something, and it was bound to a path (with a non-serialized cached Node entry, well, technically it was serialized just overwritten to be null upon deserialization, I tried to design everything possible to be simple memcpy's for saving out for speed reasons, and consequently saving was super fast), and the system would just listen for events on the given node, aggregate them into a list inside the system, then drain it to perform the physics impulses or dumping inputs into the GUI system or whatever. It was pretty simple and not really the best design, but it worked and was fast enough since inputs didn't happen thousands of times per tick. It was a great action-map system by using node events with transformers and all though. And I cannot state how useful it was for all inputs to be floats, whether a button, absolute axis, relative axis, whatever. A simple transformer could bring anything and everything into the right setup for whatever needed to process it.

I'll look at the implementation offered by amethyst to see if it's any way close to what I've in mind.

Their forums have a lot of discussion though a lot of it is on their Discord.

Out of curiosity, how did you construct the dependency tree? Just exploring it every time something was added to find the hook point or was it more clever than this?

It was pretty trivial and not super efficient, but as it was only done when a system was added to the context (registry in ENTT terms) and then an update was performed it wasn't really that important since it only took a few hundred milliseconds at worst. All it did was this:

Iterate all systems grabbing their aspects (an Aspect was basically a listen endpoint for a set of entities based on component types, so something like Aspect.of<PositionTable, VelocityTable>().exclude<DisabledTable>().build(ctx) linked to or created a set in the context aspect tree of what to map against and thus what should be updated, any of Aspect of the same information had the same list in memory of entities at the same location), and grabbing their concurrency settings (free to run concurrently with others, don't run with XX system, always run after YY system, don't run concurrently with anything, etc...).
Once all aspects were touched then a bi-bitset was created of the same size of the count of all table types referenced (bi being it used 2 bits per table, 0 meant unused, 1 meant read-only, 2 meant read-write, and 3 meant exclude).
The systems were iterated again getting their associated bitsets based on their aspects to get a set of those bitsets.
The bitsets were sorted based on the number of the bi-bits set, then a tree was built based on that order for what was allowed to run together based on the system concurrency setting, it would assert if any systems were required to run before each other (circular failure) and it would fall back to non-concurrent if anything unexpected popped up.

It was not at all efficient and I meant to go over it again. Honestly it should be modeled like, say, a package dependency resolver library, of which is a well researched topic now. It really wasn't that complicated though. If you want it to be 'perfect' then of course it is an NP-Complete problem, but I didn't want perfect, I just found the first set that 'fit' well enough. Probably modeling it as a graph and using Dijkstra's to solve it would be better and more simple actually.

skypjack commented 5 years ago

I'm thinking of removing the empty components from each and get rid of less. Actually, it doesn't make much sense to have fake instances of empty components.

BUT

I want also to add an optional list of components to each, so as to be able to get also empty components if required or even a shorter list of components when one isn't interested in all of them. As an example:

registry.view<A, B, C>().each<A, B>([](auto &&...) { /* ... */ });

It will be faster for obvious reasons and more than once I needed something like this.

What about the two points above?

indianakernick commented 5 years ago

@skypjack My only (petty) concern is that lines could get a bit long. If you're calling view and each on the same line, you'd be repeating types and have to wrap a really long line. Using indicies instead of types could help with the verbosity. If you're using as_view then you probably want to use types instead of indicies so that you don't have to look as far to see what the indicies coorespond to.

skypjack commented 5 years ago

@Kerndog73 good point. However, consider that this is rare, most of the times you'll just use each without a list of components. My concern is instead that the list of parameters for each can be misleading for a newcomer, because it's not immediately obvious if you look at the list of template parameters of a view or a group. That's why I added less aside each initially.

skypjack / entt

discussion - empty components should not be part of the each callback #243