amethyst / specs

Specs - Parallel ECS
https://amethyst.github.io/specs/
Apache License 2.0

Idea: CoWStorage and DeltaStorage #490

Open eaglgenes101 opened 6 years ago

eaglgenes101 commented 6 years ago

Problem

Some operations would like to work on past or potential future versions of a component storage. A fast-paced multiplayer game wants to extrapolate component states from the known past to the present while the packets carrying the actual current state are still in transit. Or a discrete event simulation system wants to avoid component-order artefacts by not modifying components until the steady state of the components of the entities involved in an interaction can be solved for. Or an AI system wants to be able to simulate possible outcomes of its actions.

The normal fast way to do things like this is to mutate the associated structures in place to the states they had, have, or will have, as needed by the systems operating on them, but this is tricky to get right, and can easily break if implementation details such as the order of systems change. It conflates the different states that the storages had at different times into one, and allows multiple unrelated systems that do not work together to mutate the same structure without any separation of access. In other words, this is bad design.

Solution

First, a CoWStorage trait is exposed. Like FlaggedStorage, it tracks additions, modifications, and removals, but with the modification and removal events, it retains the original versions of the components, allowing the reconstruction of past versions of the storage.

Second, a DeltaStorage trait is exposed. A DeltaStorage can be constructed on top of a CoWStorage, and initially looks the same as its underlying storage. Alternatively, there might be an option to construct one on top of one storage with a diff applied so that it looks like a different storage, but this is much more computationally expensive and has mostly niche uses, so the first form will usually be preferred.

When the original storage is modified, it generates events which carry enough information for the DeltaStorage to reconstruct the inverse changes and compensate for them, thus keeping the DeltaStorage externally constant. However, this does require some synchronization: if the original storage changes before the DeltaStorage can account for the change, DeltaStorage accessors may witness a data race and silent modification.
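A minimal, specs-independent sketch of that compensating view (the Change enum, the DeltaView type, and the HashMap standing in for the live storage are illustrative names, not existing specs APIs): the view stashes the pre-change value the first time the base storage reports a change to a slot, and otherwise reads straight through to the live data.

use std::collections::HashMap;

type Index = u32;

// Change events a CoW-style base storage would emit. Modified/Removed carry
// the value from *before* the change, so the inverse can be reconstructed.
enum Change<T> {
    Inserted(Index),
    Modified(Index, T),
    Removed(Index, T),
}

// An externally constant view over a live storage (modelled here as a plain
// HashMap). Slots that change underneath the view get "pinned" to the value
// they had when the view was taken.
struct DeltaView<T> {
    pinned: HashMap<Index, Option<T>>,
}

impl<T> DeltaView<T> {
    fn new() -> Self {
        DeltaView { pinned: HashMap::new() }
    }

    // Compensate for a change in the base storage so readers of this view
    // keep seeing the old state.
    fn absorb(&mut self, change: Change<T>) {
        match change {
            // The slot did not exist when the view was taken: hide it.
            Change::Inserted(id) => {
                self.pinned.entry(id).or_insert(None);
            }
            // Remember the pre-change value the first time a slot is touched.
            Change::Modified(id, old) | Change::Removed(id, old) => {
                self.pinned.entry(id).or_insert(Some(old));
            }
        }
    }

    // Read through to the live storage unless the slot has been pinned.
    fn get<'a>(&'a self, id: Index, live: &'a HashMap<Index, T>) -> Option<&'a T> {
        match self.pinned.get(&id) {
            Some(pinned) => pinned.as_ref(),
            None => live.get(&id),
        }
    }
}

This is essentially the hash-table-of-changes flavour discussed below: cheap to read through, but it has to absorb every event the base storage emits.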

To avoid such a mess and minimize contention, the dispatcher can perform a few tricks. First, it can try to schedule the storage accesses so that the modification of the underlying storage does not occur simultaneously with accesses of the DeltaStorage. Second, if this is impossible (say, due to a component unwittingly requesting write access to both the DeltaStorage and the original at the same time) or the dispatcher finds that doing so would cause the code to not utilize multithreading efficiently, it can perform redirect-on-write for the modifying system as well as the accessing ones, and then commit the changes later when the DeltaStorage is uncontended.

Different DeltaStorage implementations can have very different performance implications. A DeltaStorage based on a persistent tree, for example, will suffer from logarithmic update times on write, but is virtually unaffected by writes to the underlying storage, provided that the underlying storage is based on the same kind of persistent tree. A DeltaStorage based on a hash table of changes, by contrast, can be updated in constant time but has to wait if the underlying storage is updated. There is no one-size-fits-all solution here, so there should be multiple implementations, each with their own distinct strengths.

torkleyy commented 6 years ago

To avoid such a mess and minimize contention, the dispatcher can perform a few tricks. First, it can try to schedule the storage accesses so that the modification of the underlying storage does not occur simultaneously with accesses of the DeltaStorage. Second, if this is impossible (say, due to a component unwittingly requesting write access to both the DeltaStorage and the original at the same time) or the dispatcher finds that doing so would cause the code to not utilize multithreading efficiently, it can perform redirect-on-write for the modifying system as well as the accessing ones, and then commit the changes later when the DeltaStorage is uncontended.

I'm not sure what you mean here. Special-casing the dispatcher doesn't sound very good (or possible).

Other than that, I think that such a thing might be useful. I personally never had the need for it though. cc @OvermindDL1

OvermindDL1 commented 6 years ago

The games I made never needed twitch responses, so I didn't need anything too fancy in terms of interpolation. Rather, the event system is what could delay events from happening until a point at which all connected systems 'agreed' that it should run. User input would even work that way, but it would still 'do' things locally, just with a kind of fake object that wouldn't replicate but would be marked as being replaced by a soon-coming server command (either made 'whole' or the server would deny it and it would vanish). It worked pretty well for me and the more strategic-style games I did, but I'm unsure how well it would work for something like an FPS; it might work well, it might not, I never really tested since it's not my style of game... >.>

Aceeri commented 6 years ago

Are we mostly talking that the event of Modified/Removed will contain the old component's data when the system reads the events?

eaglgenes101 commented 6 years ago

That, and also a storage wrapper around a base storage that consumes these events to maintain external immutability even while the underlying base storage changes.

OvermindDL1 commented 6 years ago

Are we mostly talking that the event of Modified/Removed will contain the old component's data when the system reads the events?

Hmm, I never did that in my engine but that would definitely be a help in a few areas looking back... If it can be done efficiently I'd be all for that!

Aceeri commented 6 years ago

@eaglgenes101 I don't really follow the whole wrapper/consumer around the base storage. Is there any reason for that? I was just thinking of having:

enum CopyTracked<T: Clone> {
    Modified(Index, T),
    Removed(Index, T),
    Inserted(Index), // This is never called if the component already existed, so doesn't need `T`.
}

struct CopyStorage<C: Component + Clone, U: UnprotectedStorage<C>> {
    storage: U,
    channel: EventChannel<CopyTracked<C>>,
    _phantom: PhantomData<C>,
}

impl<C, U> UnprotectedStorage<C> for CopyStorage<C, U>
where
    U: UnprotectedStorage<C>,
    C: Component + Clone,
{
    fn insert(&mut self, id: Index, component: C) {
        self.storage.insert(id, component);
        self.channel.single_write(CopyTracked::Inserted(id));
    }
    fn remove(&mut self, id: Index) -> C {
        let old = self.storage.remove(id);
        self.channel.single_write(CopyTracked::Removed(id, old.clone()));
        old
    }
    fn get_mut(&mut self, id: Index) -> &mut C {
        let old = self.storage.get_mut(id);
        self.channel.single_write(CopyTracked::Modified(id, old.clone()));
        old
    }
    ...
}

And then writing an event containing a duplicate of the old component data before you modify the base structure.
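On the read side, a downstream consumer would drain those events through a shrev reader. A rough sketch (the collect_old_values function is a made-up name, and it assumes the CopyTracked enum above, with specs' Index being a u32): it keeps only the first pre-change value seen per slot, which is exactly what a wrapper presenting last frame's state needs.

use std::collections::HashMap;
use shrev::{EventChannel, ReaderId};

// Drain pending change events, remembering only the *first* old value per
// slot -- that is the value the slot had before this frame's writes.
// Assumes the `CopyTracked<T>` enum sketched above (Index = u32 in specs).
fn collect_old_values<T: Clone + Send + Sync + 'static>(
    channel: &EventChannel<CopyTracked<T>>,
    reader: &mut ReaderId<CopyTracked<T>>,
    old_values: &mut HashMap<u32, Option<T>>,
) {
    for event in channel.read(reader) {
        match event {
            CopyTracked::Modified(id, old) | CopyTracked::Removed(id, old) => {
                old_values.entry(*id).or_insert_with(|| Some(old.clone()));
            }
            // A newly inserted slot had no value before; record that too.
            CopyTracked::Inserted(id) => {
                old_values.entry(*id).or_insert(None);
            }
        }
    }
}

Maintaining such a map alongside the live storage is essentially the "storage wrapper around a base storage that consumes these events" from the earlier comment.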

zvxryb commented 5 years ago

I believe there's a number of other use cases which pose similar problems:

  1. Game state updates; you can delta-code the current game state against the last gamestate a client is known to have before compressing it, provided you maintain the necessary history.
  2. Lag compensation in hitreg, which is slightly different than the client-side extrapolation example given above. On the server side, you may have to "rewind" the game state to validate a "hit" from a player with 30 ms latency, for example.
  3. Asynchronous "quicksave" functionality

I'm wondering if maybe the CoW and Delta storage might be unnecessarily specific. All of the use cases given above and listed here are effectively situations where you need a few frames of game history; you might consider having a fixed number of full gamestate/storage instances to support these features, and rotating through them so you can avoid a lot of unnecessary reallocations and state tracking.

Not sure how this would be done idiomatically in Rust/specs. I'm still learning and just happened to notice this quirk when trying to implement Verlet integration in a simple test application.
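A rough sketch of that rotating-snapshot idea, independent of specs (History and its methods are made-up names; S stands for whatever cloneable snapshot of the world or of a storage you keep): the slots are allocated once up front and overwritten in place via clone_from, so their old allocations get reused rather than reallocated.

// Fixed-size ring of world/storage snapshots. `push` overwrites the oldest
// slot in place via clone_from, so that slot's old allocations are reused.
struct History<S> {
    slots: Vec<S>,
    next: usize, // index of the slot that will be overwritten next
}

impl<S: Clone> History<S> {
    // Pre-fill every slot with copies of the initial state.
    fn new(frames: usize, initial: &S) -> Self {
        assert!(frames > 0);
        History { slots: vec![initial.clone(); frames], next: 0 }
    }

    // Record the current state, rotating through the fixed set of slots.
    fn push(&mut self, current: &S) {
        self.slots[self.next].clone_from(current);
        self.next = (self.next + 1) % self.slots.len();
    }

    // State from `age` frames ago (0 = the most recently pushed snapshot).
    fn rewind(&self, age: usize) -> &S {
        let len = self.slots.len();
        assert!(age < len);
        &self.slots[(self.next + len - 1 - age) % len]
    }
}

For the delta-coding, lag-compensation, or quicksave cases above, rewind(n) then hands back the state from n frames ago without any per-frame diff tracking.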

zakarumych commented 5 years ago

For hitreg, the server can just extrapolate the game state back in time by 30 ms to validate a hit, instead of holding the state of the relevant objects for a few frames.

cedric-h commented 5 years ago

Even just a Remove event that lets you access the component data one last time would be great for things like removing bodies from a physics world when their entity/physics component is removed, or creating a death particle component that uses the data in the component to spawn entities accordingly when the entity is destroyed. Because a feature like this doesn't exist, you have to store copies of the storages in your systems and synchronize them with changes in the actual components, which creates a lot of overhead for what are conceptually very simple tasks such as these.

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Feel free to comment if this is still relevant.

azriel91 commented 5 years ago

unstale -- came up in discord (beginning of thread)

ah I can't reopen, not enough permissions

WaDelma commented 5 years ago

Discord thread cannot be looked at without registering? :/

zvxryb commented 5 years ago

I spent a few days thinking about this and experimenting a bit. I took a bunch of notes during the process on use cases, justification, different alternatives and potential issues.

I also made a version of specs where World and its components all implement Clone. In case anyone is curious what that would look like, the branches are here: The shred branch and the specs branch

And finally I created a small test application which does a particle simulation and uses a rotating group of worlds to maintain history for drawing trails from their trajectories.

Hopefully some of this is useful or provides some ideas.

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Feel free to comment if this is still relevant.

OvermindDL1 commented 5 years ago

I don't think this is stale; it should probably be tagged to prevent that until it is finally resolved?

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Feel free to comment if this is still relevant.

Rua commented 4 years ago

@zvxryb I've been thinking of a kind of physics simulation where you read from one world state and write to another, rather than updating everything in-place. Is that what your version aims to do?

zvxryb commented 4 years ago

I'd thought about it; I really like the idea of read-only past state and write-only current state, but I was really just testing to see what was the bare minimum to support history within the framework provided by specs. If you want to go that route, I think Systems will be an issue; they only take a single set of resources (i.e. provided by a single World). So you'd have to forego using systems entirely unless you want to make a bunch of changes to the way they work.

In my test fork all I did was enable cloning the full World state, so that you could clone it and then modify the data in place using existing mechanisms.

But to clarify, that fork was really just for evaluating that one idea and isn't any kind of ongoing effort.

aplatypus commented 10 months ago

Good day all ... I'm a true _nub_ here, so it was interesting for me to read the discussion and consider the idea/concept itself. My interest is around things like modelling, rules or meta-data and their uses for good or ill. Or somesuch.

My first reaction was that CoW (copy-on-write) works fine for my file-system and even in languages/algorithms. The difference there, with respect to what's under discussion here, is that essentially we are discussing an optimisation to speed up code execution/outcomes by, bluntly put, "predicting a future" or "futures", ad hoc. That sounded just like the Intel chip bugs on speculative execution / fetch situation(s) not so long ago.

On my ZFS volume no one predicts what will be (re-)written to induce the CoW event. Of course there are caches and optimisations available to deal with data flow volume, ensure persistence and immediacy/lag, etc. None of the CoW file system stuff deals with prediction. One might mark a block as "potentially volatile" when it is read as "for-modification" (sometimes done on mainframe middleware, for example). That seems to be more of a form of insurance than "predictive" or speculative.

Coming back to this "CoWStorage and DeltaStorage" proposition ... I conceive of the "DeltaStorage" as something akin to a spreadsheet "what-if ..." scenario (more or less). Or perhaps I misunderstand the proposal(??).

Such that one can undo back to where you wish to be, just like your favourite editor, word processor or spreadsheet. One is also empowered to redo and given choice(s) at key junctions to choose a redo branch/path. Under the current issue, that would need to be something a "DeltaStorage" organises, methinks.

The tricky part here is that the only place I know for sure where an undo-redo tree is actually used in anger is the Eiffel development IDE editor. This was such a nifty idea that surely others have refined the mechanics. The Eiffel editor, however, would be a great place to look first, should one be interested in that as an approach ;-)

Wearing my "state machine" hat ... a gut response is to think about when a state becomes "no longer viable".

In this case, we have a learning opportunity. Eiffel is big on "programming by contract".

That is somewhat helpful for an "undo/redo" scenario. It makes quite a bit of logic more straightforward. I suspect having a contract for pre-condition(s) and post-condition(s), as implemented by Eiffel, for the DeltaStorage proposal can lead to a sensible (less hard??) way to manage core DeltaStorage content and/or state.

A short explanation is that a "possibility" (or potential state) is "no-longer-viable" when either its pre-conditions or post-conditions fail for an entity maintained by the **DeltaStorage**.

... so far so good ... I sense there is more design (and research) involved to understand if that approach has legs. It is definitely something that CAN be discussed. Just saying, there are minimal viable solutions we can deal with first to get something serviceable and usable while perfection becomes real ;-)