Ukendio / jecs

A fast, portable Entity Component System for Luau
https://ukendio.github.io/jecs/
MIT License
146 stars 26 forks source link

[BUG] Changing an entity during OnRemove silently gets undone when the hook finishes #118

Closed chess123mate closed 1 month ago

chess123mate commented 2 months ago

Reproduction

const world = new World()
const a = world.component()
const b = world.component()
const e = world.entity()
world.set(a, OnRemove, (e: Entity) => {
    world.set(e, b, true)
    print("In remove:", world.get(e, a), world.get(e, b))
})
world.set(e, a, true)
world.remove(e, a)
print("After remove:", world.get(e, a), world.get(e, b))

Actual Behavior

The component 'b' is only attached to the entity 'e' during the OnRemove hook:

In remove: true true
After remove: nil nil

Expected Behavior

The component 'b' should still be attached to the entity 'e' after the OnRemove hook is done:

In remove: true true
After remove: nil true

or:

Ukendio commented 1 month ago

This happens because the destination of where the entity should move is calculated after. So the order of the operations should be swapped. It will break existing behaviour where people rely on getting the data on remove.

Ukendio commented 1 month ago

OnRemove hooks needs to be invoked before it is actually removed as to keep a stable reference to the entity's component data at the moment of deletion.

Possible workaround is implementing a command buffer to defer/enqueue operations, so this isn't a bug but we can make better documentation for this in the future.

chess123mate commented 1 month ago

OnRemove hooks needs to be invoked before it is actually removed as to keep a stable reference to the entity's component data at the moment of deletion.

I don't understand, despite looking over world_remove in init.lua (though I don't understand how the archetypes/columns/etc work). I don't see how finishing the remove operation and then calling the OnRemove hook could cause problems? (I observe that OnSet is called after the assignment as well, so it would be consistent if they both worked the same way.)

so this isn't a bug

I imagine you don't mean that silently undoing operations is intended behaviour, right? (i.e. even if the solution is "implement a command buffer", shouldn't this issue still be open?)

Possible workaround is implementing a command buffer to defer/enqueue operations

I've seen this in another ECS implementation to avoid issues when iterating over entities. I don't recall seeing any documentation that explains whether something might break if you remove and/or add components while iterating via world.query? For both that case and these hooks, I imagine performance would be better (and expected behaviour more intuitive) if a command buffer can be avoided.

Ukendio commented 1 month ago

I don't understand, despite looking over world_remove in init.lua (though I don't understand how the archetypes/columns/etc work). I don't see how finishing the remove operation and then calling the OnRemove hook could cause problems?

When you move the entity, and invoke the hook for the removed component, that means you lose the reference to the entity at the moment of deletion, what happens is that you get the data to it after it has been removed which is useless especially if you are looking to cleanup model references.

I imagine you don't mean that silently undoing operations is intended behaviour, right?

Right, I don't mean to say that it should undo any operations, but generally speaking you do not want to make any structural changes inside of hooks. That is not the intended feature to work with that directly. Hooks are a means of listening to constructing/deconstructing the object. If you need to intercept and make side effects to the structure of the entity then you are encouraged to do that from a system. You can use (Previous, state) pairs to keep state for your data.

I don't recall seeing any documentation that explains whether something might break if you remove and/or add components while iterating via world.query

There isn't such documentation because generally people are good at ensuring structural changes within a query does not re-match the invariant. E.g. adding components that are guaranteed not to rematch the query by using :without will completely circumvent the issue

chess123mate commented 1 month ago

what happens is that you get the data to it after it has been removed which is useless especially if you are looking to cleanup model references.

Though, it wouldn't be useless if #133 were implemented.

but generally speaking you do not want to make any structural changes inside of hooks.

Is this because it could mess with any active queries? (Note: for the original situation I found this bug in, I wasn't in a query, but rather was in response to user input.)

[EDIT: Following section in italics because I may have answered it below]

I recall that in normal Lua, it's not supported to add entries to a table that you're iterating over, but removal is acceptable. Do the same rules apply here for entity components during a world.query? Further, by E.g. adding components that..., are you saying that if you had a world.query(Model) loop, and then you added a different component to an entity (let's call it Outline), that it might mess up the iteration unless you changed the query to say world.query(Model).without(Outline)? But I got the impression that data can sometimes move around when adding or removing entities. If true, this would mean that adding/removing a component on any entity in the list (from world.query) would be unsafe. Is this impression correct?

[EDIT]

I did some testing with a list of entities; I tried changing 1 or more of their components while in a world.query loop and discovered:

Does this seem correct to you?

There isn't such documentation because

(If you mean "documentation isn't needed", I'd argue differently: I've spent hours today studying the code and experimenting to figure the above out.)

Ukendio commented 1 month ago

Is this because it could mess with any active queries? (Note: for the original situation I found this bug in, I wasn't in a query, but rather was in response to user input.)

What I said isn't entirely true but the principle is thar it makes your code less uniform as it changes the control flow of things. That said, is it bad? Not necessarily.

Does this seem correct to you?

Pretty much, if you want to add or remove a component safely from an entity within an archetype, make sure it doesn't rematch the invariant. E.g. if you want to add a component, try to make the query only look for entities without that component currently. And the inverse works as well, if you want to remove a component, make sure the query includes it. Then you will pretty much never run into an issue of iterating an entity twice (or more). Also I am not sure on the implementation that you have suggested with regards to checking the entities array size, I am curious and would appreciate an example of this revision.

(If you mean "documentation isn't needed", I'd argue differently: I've spent hours today studying the code and experimenting to figure the above out.)

The lack of documentation is absolutely an issue and I apologise if it has put you on a goose chase to find these bugs. What I meant is that people have rarely run into an issue before because they are used to making invariants that protect their queries from rematching a moved entity, therefore there has never been a reported bug on this. I absolutely agree with you that this is worth documenting, I am just not sure on how exactly at the moment.

chess123mate commented 1 month ago

I am curious and would appreciate an example of this revision.

-- In world_query, after:
        if length == 0 then
            return EmptyQuery
        end
-- add:
        local count = {}
        for i, archetype in compatible_archetypes do
            count[archetype] = #archetype.entities
        end

-- Then replace all of:
                i = #entities
-- with:
                i = count[archetype]

You could also make count a list that parallels compatible_archetypes so long as when you remove from compatible_archetypes you also remove from count. Then the code would be:

        local count = table.create(length)
        for i, archetype in compatible_archetypes do
            count[i] = #archetype.entities
        end

-- with `i = ` line being:
                i = count[lastArchetype]
-- and additional code whenever `length -= 1` occurs
Ukendio commented 1 month ago

I do not exactly know how that resolves an issue, can you also attach a repro that would cause a bug in the HEAD commit that I can test with? And then I will apply your solution, maybe I will see it then.

chess123mate commented 4 weeks ago
const world = new World()
const A = world.component()
const B = world.component()
const shouldSee = new Map<Entity, boolean>()
// most use archetype A_B, but i === 0 will have just A
for (const i of $range(0, 3)) {
    const e = world.entity()
    shouldSee.set(e, true)
    world.set(e, A, i)
    if (i !== 0) {
        world.add(e, B)
    }
}
for (const [e, a] of world.query(A)) {
    print(e, a)
    if (shouldSee.get(e)) {
        shouldSee.delete(e)
    } else {
        warn("repeated iteration")
    }
    if (world.has(e, B)) world.remove(e, B)
    else world.add(e, B)
}
for (const [e] of shouldSee) {
    warn("failed to iterate over", e)
}

Running this incorrectly outputs repeated iteration.

For the latest version, the fix I indicated requires 2 more changes: