swarm-game / swarm

Resource gathering + programming game
Other
820 stars 51 forks source link

Parallel semantics for stepping robots #1925

Open byorgey opened 2 weeks ago

byorgey commented 2 weeks ago

1920 really got me thinking. The main reason concurrency is tricky, as @kostmo says there, is because robots take turns executing, and each robot can see modifications to the world by the previous robot. Thus, currently, if we want to run CESK machines in parallel we would have to work very hard to make sure we respect such dependencies (or else give up on determinism, which I definitely don't want to do).

However, perhaps this is an opportunity to rethink our semantics. I'm kind of thinking out loud here, so I'm not yet sure how strongly I want to argue for this, but this issue is a proposal just for the sake of discussion.

The main idea: parallel semantics

The main idea is to adopt a parallel semantics in which, every tick,

  1. All robots look at the same state of the world from the previous tick, do some computation, and decide on at most one tangible command they want to execute.
  2. All updates to the world then happen simultaneously (with appropriate conflict detection).

i.e. think of the way cellular automata are typically defined: every cell decides what its state is going to be in the next tick based on the state of neighboring cells on the current tick, and then all the cells update at once.

Benefits

I think adopting this semantics could have a lot of benefits:

Downsides

Practical implementation details

Practically, it might work as follows:

  1. Pass the current GameState to all robots and run them for one tick, each one resulting in an updated robot and a set of planned world updates (probably at most one world update, but in theory I suppose there could be more than one). There will be lots of kinds of world updates but they will be things like "place entity E at location (X,Y)", and so on. Each world update also records the robot ID of the robot that caused it.
  2. Collect all the world updates and check for any that conflict (this should be possible to implement efficiently --- it does not require quadratically looking at every pair of updates --- since e.g. we can put the updates in a map by location and check for any that are at the same location, etc.).
    • Note that I don't think robot moves should ever conflict: if robot 1 moves and robot 2 gives robot 1 an item at the same time, that is OK and will not conflict.
    • Really the only thing I can think of that can cause conflicts at the moment is picking up or putting down entities.
  3. Perhaps the trickiest part is that if there is a conflict, we have to go back and update the machines of the robots which issued the conflicting updates (to turn their current machine state from a success into an exception, to put back an entity that the robot thought it was going to place, etc.). But I don't think this would be too bad.
  4. Once we have dealt with any conflicts, we take the resulting set of world updates and apply them to the world all at once.

Impacts

Even though this would technically be a breaking change, in most cases I don't think it would make a difference. Differences would only be observable in the case of conflicts, which are rare. So I expect the vast majority of our test suite would continue to work. Perhaps one or two things where we specifically measure tick timings etc. might need some adjustment.

Having written all this, I think I'm still pretty positive about this idea. Eager to hear others' thoughts.

kostmo commented 2 weeks ago

System robots would need special consideration, since multiple commands can run in one tick with instant.

And in #1736 we achieved the ability for a robot's actions in one tick to be fully responded to by the system before that robot's next tick. There should be some way to preserve that ability.

byorgey commented 2 weeks ago

Very good points, I had not thought of these! But I think they are ultimately solvable:

ltouro commented 2 weeks ago

We would need some new kind of "conflict exceptions" that get thrown when robots try to take conflicting actions. For example, currently, if two robots try to pick up the same rock at the same time, one of them will succeed and the other one will get a "there is nothing here to grab" exception. With a parallel semantics, both robots would get a new "conflicting grab" exception (the "nothing to grab" exception could still happen if a robot executed a grab in an empty cell).

Why not fallback to sequential processing for the conflicting actions? Most of the work would be done at that time and you would get less exceptions for a bit of perf cost. Maybe even derive conflicting sets of actions and process different sets concurrently (if this is indeed need)

byorgey commented 2 weeks ago

Why not fallback to sequential processing for the conflicting actions?

That would certainly be possible. My main argument against it would simply be that it is less theoretically elegant, and it introduces extra complication for little benefit. If we fall back to sequential processing for conflicting actions, then once again to predict what will happen you need to know the secret order in which the conflicting robots will be executed. And if we think ahead to networked play, how will you even decide how to order the robots (since the ID numbers may not be globally unique across different clients)? You cannot just say "mine first, then theirs" since all the participating clients would have to agree on a sequential order to use in the case of conflicts. With a more symmetric, purely parallel approach, on the other hand, there is nothing to agree on, and the robots are truly unordered.

xsebek commented 2 weeks ago

Nice write-up @byorgey! Splitting out the world updates from CESK evaluation sounds great.

Like @kostmo I also immediately thought about instant, but maybe it's not that hard. We could apply a set of primitive world updates and check if they are in conflict with other sets of updates.

I think we could even retrict the instant command if it simplifies this feature. Because of hypothetical computation we should not lose any power.

ltouro commented 2 weeks ago

If we fall back to sequential processing for conflicting actions, then once again to predict what will happen you need to know the secret order in which the conflicting robots will be executed.

I was thinking making this order intentionally random. It would introduce some uncertainty into the environment that might be a fun/challenging new feature in the game itself. I was not considering networked play though.

byorgey commented 2 weeks ago

Like @kostmo I also immediately thought about instant, but maybe it's not that hard. We could apply a set of primitive world updates and check if they are in conflict with other sets of updates.

Yes, exactly, we are going to be dealing with sets of world updates anyway, so it makes sense for one robot to be able to generate an entire set. However, I just realized one tricky thing about this: let's say a system robot executes instant (move; grab; move). What if the grab fails because it conflicts? We actually need to have an ordered list of world updates generated by the system robot, so that if one conflicts we can invalidate all the ones that came after it. And in that case I guess it's not enough to just replace the robot's CESK machine with an error state: we have to actually wind it back to the state right when it executed the conflicting command, and then keep running it from there with an exception (since it might e.g. have a try block around it?).

I don't know, this sounds rather complex. Maybe there is a simpler approach... perhaps system robots never conflict? e.g. if a system robot grabs a rock at the same time as a user robot, then the system robot gets the rock and the user robot gets a "no rock to grab" error. And if two system robots grab a rock at the same time, they both get a rock.

byorgey commented 2 weeks ago

I was thinking making this order intentionally random. It would introduce some uncertainty into the environment that might be a fun/challenging new feature in the game itself.

Ah, I see. That's definitely better than having the order depend on robot ID. Then at least it would be "predictably unpredictable". However, the game is currently completely deterministic, in the sense that executing the exact same commands from the exact same starting state (i.e. same world with same seed) will always yield exactly the same results. I would be sad to lose that property---partly on aesthetic grounds, but partly also practical grounds, e.g. determinism makes possible (or at least easier) potential features like networked play and playback of recorded game sessions.

kostmo commented 2 weeks ago

We should probably develop a benchmark up front to estimate the best-case performance improvement of parallelism. One concern I have is that of "context switching". In some games, the parallelism is distributed "vertically", with one dedicated processor to physics, another to rendering, etc. which minimizes context switching. In our case, "horizontal" parallelism may entail evaluating a batch of robots for a single tick with mapConcurrently. This runs the risk of context switching outweighing the gains of parallelism.

Update: parMap may be pretty lightweight in terms of context switching

byorgey commented 2 weeks ago

Just to record some points from our discussion today: