mrjoes / sockjs-tornado

WebSocket emulation - Python server
MIT License

Distributed game server using tornado #50

Open Rustem opened 11 years ago

Rustem commented 11 years ago

Hi, @mrjoes. I am working on a game available at servicestargame.com. The main challenge is finding the right way to scale it. Could you please read the following question and recommend something: http://stackoverflow.com/questions/18802820/distributed-game-server-using-tornado

Thank you in advance.

mrjoes commented 11 years ago

A few comments, questions, and maybe hints.

  1. I see that you separate the frontend servers (sockjs) from the game servers (backend). Is this correct?
  2. Is there a guarantee that nothing will be lost if a game server crashes?
  3. What happens in your case if the game router crashes?

Anyway, in your case, the most complex task will be state synchronization. Basically, when a client reconnects, you need to send the game state back to them along with their personal state.

Let's assume it is a card game and the state of a whole game is quite small. When a client reconnects, all you have to do is send the cards in hand and the history of all moves. The client will replay the history and continue working as if nothing happened.

However, if it is hard or impossible to send the complete history (it is too big, or it is an MMO with a shared world), the server will have to send the aggregated game state in a single snapshot.

Now, let's assume you have solved the synchronization problem.

Let's check a few scenarios.
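A minimal sketch of the two resync strategies described above, assuming a toy card game. The `Game` class, its fields, and the `MAX_HISTORY` threshold are hypothetical names chosen for illustration; none of them come from sockjs-tornado.

```python
from dataclasses import dataclass, field

MAX_HISTORY = 200  # hypothetical threshold for "history is too big to replay"


@dataclass
class Game:
    hands: dict = field(default_factory=dict)    # player id -> cards in hand
    history: list = field(default_factory=list)  # every move, in order
    table: list = field(default_factory=list)    # aggregated shared state

    def apply(self, player, card):
        """Record a move and update the aggregated state."""
        self.hands[player].remove(card)
        self.table.append(card)
        self.history.append((player, card))

    def sync_message(self, player):
        """Build what a reconnecting client needs to catch up."""
        personal = {"hand": list(self.hands[player])}
        if len(self.history) <= MAX_HISTORY:
            # Small game: the client replays the moves itself.
            return {"type": "replay", "history": list(self.history), **personal}
        # Large game: send one aggregated snapshot instead.
        return {"type": "snapshot", "table": list(self.table), **personal}
```

The server would send the result of `sync_message(player)` as the first message after a reconnect. Which branch is appropriate depends on how expensive it is for the client to replay the history.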

  1. sockjs server died: Not a big deal. The client will detect that the connection died and will try to reconnect. The load balancer will see that one of the servers is dead and will route the connection to another server. That server will forward requests to the proper game server as if nothing happened. The game server will send the current game state to the client.
  2. game server died: Well, that's unfortunate. If you have a game router that monitors the servers and keeps a list of the games each server is responsible for, it can reassign those games to other servers and ask clients to reconnect. The reason for reconnecting is that it is impossible to predict whether clients are still in sync with the game state: it is not known whether some clients received the last actions or not. After reconnecting, all clients will receive the same state and continue playing.
  3. load balancer died: DNS round-robin might help in some cases.
  4. game router died: The only way to guard against this is to have some sort of redundancy (a backup master) for the game router and to make sure its state can be restored by the backup game router.
  5. memcached died: This might break everything. In the best case, state can be queued in server memory until memcached is back. In the worst case, the whole application will be down. Redis supports replication, so it is possible to keep a Redis replica as a hot standby and quickly switch to it.

In any case, the most complex items on the list are 2 and 4.
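A rough sketch of scenario 2, assuming game servers send periodic heartbeats to the router. `ServerMonitor`, `HEARTBEAT_TIMEOUT`, and the method names are hypothetical; time is passed in explicitly so the logic is easy to follow.

```python
HEARTBEAT_TIMEOUT = 5.0  # seconds; hypothetical value


class ServerMonitor:
    def __init__(self):
        self.last_seen = {}  # server name -> time of last heartbeat
        self.games = {}      # server name -> set of game ids it hosts

    def heartbeat(self, server, now):
        """Register a server or refresh its liveness timestamp."""
        self.last_seen[server] = now
        self.games.setdefault(server, set())

    def assign(self, server, game_id):
        self.games[server].add(game_id)

    def reap(self, now):
        """Move games off servers that stopped sending heartbeats.

        Returns {game_id: new_server}. Clients of those games should be
        asked to reconnect so that everyone resyncs from the same state.
        """
        dead = [s for s, t in self.last_seen.items()
                if now - t > HEARTBEAT_TIMEOUT]
        orphaned = []
        for server in dead:
            del self.last_seen[server]
            orphaned.extend(sorted(self.games.pop(server)))
        moved = {}
        for game_id in orphaned:
            if not self.games:
                break  # no surviving servers; this sketch gives up here
            # Pick the surviving server with the fewest games.
            target = min(self.games, key=lambda s: (len(self.games[s]), s))
            self.games[target].add(game_id)
            moved[game_id] = target
        return moved
```

This covers only detection and reassignment. Restoring each game's state on the new server and protecting the router itself (scenario 4) are separate problems.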

Rustem commented 11 years ago

First of all, thank you so much for your answer. Each game server can serve multiple games and the users connected to them. In more detail, each game is a gameFlow object defined by a set of connected sockets (SockJSConnection), the whole game state (players, meaning the necessary information about them such as profile details, statistics, and socket information; the last event; the map), and a game loop backend (the particular game logic, e.g. one loop of a card game or poker) that handles all game events triggered by players (sockets) in real time. Yes, there is an almost 100% guarantee that nothing will be lost (except that "it is impossible to predict if clients are still in sync with the game state"), because we store each move of the game and in case of failure can restart from the state at the point of failure.

What I have described in the diagram is just one possible solution. If you can advise something better, you are welcome! The role of the game router:

  1. launch new game on a loop with the smallest current load
  2. connect and reconnect players to the games

All other interaction must be handled by the comet servers directly. I am afraid that the game router may become a bottleneck.

You are right, a redundant game router is required in case of failure (round-robin is best suited for this). By the way, in order to proxy (forward) a user's request from the router to a comet server, I should use the sockjs protocol, shouldn't I? Do you have any examples?
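To make the two router roles concrete, here is a small sketch assuming the router tracks load as the number of running games per loop. `GameRouter` and its methods are hypothetical names, not an existing API.

```python
class GameRouter:
    def __init__(self, loops):
        self.load = {loop: 0 for loop in loops}  # loop name -> running games
        self.game_loop = {}                      # game id -> loop name
        self.player_game = {}                    # player id -> game id

    def launch_game(self, game_id):
        """Role 1: start a game on the loop with the smallest current load."""
        loop = min(self.load, key=lambda name: (self.load[name], name))
        self.load[loop] += 1
        self.game_loop[game_id] = loop
        return loop

    def connect(self, player_id, game_id):
        """Role 2: tell a player which loop hosts their game."""
        self.player_game[player_id] = game_id
        return self.game_loop[game_id]

    def reconnect(self, player_id):
        """Role 2: send a returning player back to the same game."""
        return self.game_loop[self.player_game[player_id]]
```

Since the router is only consulted when a game starts or a player (re)connects, all in-game traffic can bypass it.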

mrjoes commented 11 years ago

Your architecture looks fine, just take care of the weak points I mentioned in my previous comment.

While you can forward user requests using the sockjs protocol, it is not as efficient as using a persistent transport, like ZeroMQ or even raw TCP sockets with some sort of protocol on top (msgpack, etc.).

So, try to make the frontend sockjs-tornado servers as stateless as possible, so that they just forward requests to the appropriate servers.
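As an illustration of "raw TCP sockets with some sort of protocol", here is a sketch of length-prefixed framing that a stateless frontend could use to forward messages to a game server. JSON from the standard library stands in for msgpack here so the example has no dependencies, and the `sid`/`data` envelope is an assumption made for this sketch.

```python
import json
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian payload length


def encode_frame(session_id, payload):
    """Wrap one client message so the backend knows who sent it."""
    body = json.dumps({"sid": session_id, "data": payload}).encode("utf-8")
    return HEADER.pack(len(body)) + body


class FrameDecoder:
    """Reassemble frames from a TCP byte stream that arrives in chunks."""

    def __init__(self):
        self.buffer = b""

    def feed(self, chunk):
        self.buffer += chunk
        messages = []
        while len(self.buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self.buffer)
            end = HEADER.size + length
            if len(self.buffer) < end:
                break  # wait for the rest of this frame
            messages.append(json.loads(self.buffer[HEADER.size:end]))
            self.buffer = self.buffer[end:]
        return messages
```

The frontend would call `encode_frame` on each message it receives from a sockjs connection and write the result to a persistent socket. It keeps no game state, only the mapping from session to backend connection.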

Rustem commented 11 years ago

OK, thank you. I will reply as soon as I solve those issues.