boskee / Minecraft

Simple Minecraft-inspired program using Python and Pyglet
MIT License

Use a transactional memory and computing server #40

Open BertrandBordage opened 11 years ago

BertrandBordage commented 11 years ago

This is IMHO an extremely important feature, even though it can be postponed a bit.

The problems:

  1. Saves are huge (30 MB for a 200² map!).
  2. Saving and restoring data is awfully slow.
  3. The same process handles terrain generation, game logic and rendering.
  4. The three previous points are ruining performance.
  5. We will one day need to build a server.

How the original Minecraft solved these problems:

  1. Only relevant data is saved; in our game we save nearly everything, including caches and textures (which leads to incompatibilities when the code is updated).
  2. Because of 1., they don't have this problem; we do, because pickling (what we currently do when we save) is an extremely expensive computation.
  3. This was never really solved; that's why chunk updates are sometimes extremely slow (you know, when there are holes everywhere), even on the best computer in the world.
  4. Visual rendering has priority: chunks are updated only when the CPU has time to do it. Nowadays rendering is smooth, but it wasn't a year ago because of this problem (if I remember correctly).
  5. I don't know exactly how they solved this. But the fact that, until recently, it was impossible to play a network game without starting a standalone server tends to prove that they basically had to write terrain generation and game logic twice: once for the game, once for the server.

The definitive solution to these problems is to separate the game into two parts: the pyglet part (rendering, handling key events, etc.) and the rest.

So we would have one process dedicated to rendering, and one or several processes dedicated to the heavy world computation (terrain generation, physics, AI, etc.). The rendering process would make requests to the memory and computing server to know, for example, which sectors to display given the current player position. All these processes would start automatically with `python main.py`.
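A minimal sketch of that split, using the standard library's multiprocessing module. The request names and the sector math are made up for illustration; this is not the project's actual design:

```python
# Hypothetical sketch: one rendering process and one world-computation
# process talking through multiprocessing queues.
from multiprocessing import Process, Queue

def world_worker(requests: Queue, responses: Queue):
    """Handles heavy world work (terrain generation, physics, AI, ...)."""
    while True:
        req = requests.get()
        if req is None:          # shutdown sentinel
            break
        kind, payload = req
        if kind == "sectors_near":
            # Placeholder: which sector contains the player position?
            x, y, z = payload
            responses.put(("sectors", [(int(x) // 16, int(z) // 16)]))

if __name__ == "__main__":
    requests, responses = Queue(), Queue()
    worker = Process(target=world_worker, args=(requests, responses))
    worker.start()
    # The rendering process asks which sectors to display for a position.
    requests.put(("sectors_near", (35.0, 70.0, -18.0)))
    print(responses.get())       # ('sectors', [(2, -2)])
    requests.put(None)
    worker.join()
```

The rendering process never blocks on terrain generation: it queues a request and keeps drawing until the answer arrives.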

Consequences:

  1. Excellent game stability.
  2. The possibility to build very complex systems, such as the basic physics engine that was always lacking in the original Minecraft. Considering that most gaming machines now have at least 4 cores, 2-3 processes dedicated to heavy tasks sounds reasonable to me.
  3. An easy way to communicate with the game, which means it will be easy to turn it into a multiplayer game.

How to achieve this? As a web developer, I immediately think of a web server coupled with an asynchronous task queue, for example using Flask and Celery. This may seem inappropriate, and it may be too heavy or slow to work. I don't know what data storage we should use, but even SQLite would be faster than pickling everything.

I also read on Wikipedia that Durus was designed for exactly this kind of purpose.

What do you think? Totally nuts?

boskee commented 11 years ago

Have a look at this:

http://www.bravoserver.org/

BertrandBordage commented 11 years ago

:D Thanks! It uses Twisted! I prefer Flask, but it tends to prove that this could work!

Nebual commented 11 years ago

I've been wanting to fix 1 and 2 by writing a custom save process for the terrain, but as soon as I do, changes to the terrain engine will also require changing the saver, so I figured it would speed up development to leave it as is until it's more stable.

Separating the game into a Client and Server is a separate matter, and very important.

Edit: Split my points about saving into #42

BertrandBordage commented 11 years ago

I still think these are not separate matters. For the moment, worlds are tiny. When they become a little bigger (not much, say 10 times the current scale), even with optimizations, pickling won't be viable. What storage system will you use? Would you still keep the entire world in RAM, fully loaded at startup and fully saved at exit? In a world with 600 sectors (ridiculously small compared to any 30-minute Minecraft session), Python already takes 1.2 GB! And you would still have to modify your saving system every time the code changes.

However, I would sincerely love to see a nice system that fixes 1. and 2. without requiring a multi-process game yet. So I want to see your solution as soon as possible :)

ghost commented 11 years ago

@BertrandBordage We don't have to store the entire world in RAM: we can split world sectors into multiple files, keep only the nearest sectors in RAM, and delete unused sectors from memory.

But there is another problem: storing each sector in a new file will create a massive file list on disk...
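The per-sector-file scheme could be sketched roughly like this. The directory name, file layout, and eviction radius are all invented for illustration, not the project's actual save format:

```python
# Hypothetical sketch: one pickle file per sector, keeping only sectors
# near the player in RAM and evicting the rest to disk.
import os
import pickle

SECTOR_DIR = "world_sectors"     # assumed layout: one file per sector

def sector_path(sx, sz):
    return os.path.join(SECTOR_DIR, f"{sx}_{sz}.sec")

class SectorCache:
    def __init__(self):
        self.loaded = {}         # (sx, sz) -> dict of block positions

    def get(self, sx, sz):
        if (sx, sz) not in self.loaded:
            path = sector_path(sx, sz)
            if os.path.exists(path):
                with open(path, "rb") as f:
                    self.loaded[(sx, sz)] = pickle.load(f)
            else:
                self.loaded[(sx, sz)] = {}   # would call terrain generation
        return self.loaded[(sx, sz)]

    def evict_far(self, px, pz):
        """Save and drop sectors far from the player position (px, pz)."""
        os.makedirs(SECTOR_DIR, exist_ok=True)
        for (sx, sz) in list(self.loaded):
            if abs(sx - px // 16) > 1 or abs(sz - pz // 16) > 1:
                with open(sector_path(sx, sz), "wb") as f:
                    pickle.dump(self.loaded.pop((sx, sz)), f)
```

This does keep the file count growing with explored terrain, which is exactly the "massive file list" concern above.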

BertrandBordage commented 11 years ago

But then you have to write a complex system to decide which file to pick, open it, parse its format, etc. That represents a lot of code and a lot of execution time. What I have in mind is using a spatial database to handle all of this. A simple lookup would be enough to get a sector. And because databases are transactional, there is no chance of corrupted data.

ghost commented 11 years ago

Yes, a database is the better solution.

Nebual commented 11 years ago

I think a proper database implementation is better suited to pulling out small chunks of specific data. I don't think it would scale as well for "load information on the following 30,000 blocks" (I think that's our current draw distance). And Bertrand, said database implementation would certainly be larger than anything we'd write for the purpose :P

ghost commented 11 years ago

So what type of database should we choose: a small SQL database (SQLite), a "normal" SQL database (for instance MySQL), or some NoSQL solution? In my opinion, we should use either a normal SQL database (for functionality) or NoSQL (for speed and scalability?). I think we should pick the best solution now, so we won't have to change databases later (but SQLite is also a good option for its lightness and relatively easy incorporation into our project).

ronmurphy commented 11 years ago

Coming into this from .NET: MS pushes SQL Server a lot, and it is OK. However, I have heard good things about MariaDB. See here: https://mariadb.org/

BertrandBordage commented 11 years ago

@Nebual We need to run some tests first, of course. I will give it a try this week, and we will see. I don't think the code will become much larger, either. We'll see :)

@avelanarius I will try with an ORM (SQLAlchemy, for example), so we don't really have to choose. The best open-source SQL server (in terms of performance) is PostgreSQL, but testing with SQLite will of course be the easier option.

To be frank, I have almost no experience with spatial databases, so I can't guarantee that SQL will be the perfect solution. But if it works, development will be much easier… No need for a save system and no need for complex lookups written in Python.

Nebual commented 11 years ago

Okay, I just restructured our current saving code so it's all within savingsystem.py, and I separated saving blocks from other data (player information), since it's the blocks that would need SQL. https://github.com/Nebual/minecraft/tree/saving

We'll still need plenty of code for interfacing with SQL, just as we would for interfacing with flat files. Using SQL doesn't simplify things, though it may be faster, so we should test.

BertrandBordage commented 11 years ago

I just ran a test using Python's sqlite3. Inserting one million 3D coordinates took about 3 seconds (on a classic HDD, not an SSD). Fetching the whole million takes about a second. Fetching 30,000 takes 22.5 ms.

This looks convincing; I thought SQLite was dozens of times slower than that :D
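For reference, a scaled-down version of a benchmark like this one. The schema and row count are my own choices (100,000 rows instead of a million, to keep it quick), and timings will of course vary by machine and disk:

```python
# Rough SQLite timing sketch: bulk-insert block coordinates, then time
# an indexed fetch of 30,000 rows. Schema is invented for illustration.
import random
import sqlite3
import time

conn = sqlite3.connect(":memory:")   # use a file path to measure disk speed
conn.execute("CREATE TABLE blocks (x INTEGER, y INTEGER, z INTEGER, kind TEXT)")
conn.execute("CREATE INDEX idx_xyz ON blocks (x, y, z)")

rows = [(random.randint(0, 200), random.randint(0, 128),
         random.randint(0, 200), "stone") for _ in range(100_000)]

t0 = time.perf_counter()
conn.executemany("INSERT INTO blocks VALUES (?, ?, ?, ?)", rows)
conn.commit()
t1 = time.perf_counter()

fetched = conn.execute("SELECT x, y, z, kind FROM blocks LIMIT 30000").fetchall()
t2 = time.perf_counter()
print(f"insert 100k: {t1 - t0:.3f}s   fetch 30k: {(t2 - t1) * 1000:.1f}ms")
```

`executemany` inside a single transaction is what makes the bulk insert fast; one `INSERT` per `commit` would be orders of magnitude slower.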

ghost commented 11 years ago

SQLite might be slower with bigger (multi-GB) databases.

BertrandBordage commented 11 years ago

@avelanarius Hmm… With indexes, shouldn't this be fast enough?

Anyway, if we store both blocks and sectors in a database, we could fetch only a few nearby sectors (the 200 closest, for example) that are not yet shown (which limits it to 20 to 50 at most), then fetch the exposed blocks contained in those sectors. In a single SQL query, we could get all the blocks to add to the batch. Considering that we get something like one hundred new blocks per frame, that makes an 80 µs request/response per frame, which leads to 4.8 ms per second of database lookups for sector updates. Using SQLite. With PostgreSQL, it is at least 2 to 10 times faster… I like open source :')
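The single-query idea could look roughly like this in SQLite. The schema, column names, and the crude distance test are all hypothetical:

```python
# Illustrative sketch: one query joining sectors and blocks to get the
# exposed blocks of nearby, not-yet-shown sectors in a single round trip.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sectors (id INTEGER PRIMARY KEY, sx INTEGER, sz INTEGER,
                      shown INTEGER);
CREATE TABLE blocks  (sector_id INTEGER, x INTEGER, y INTEGER, z INTEGER,
                      exposed INTEGER, kind TEXT);
""")
conn.execute("INSERT INTO sectors VALUES (1, 0, 0, 0)")
conn.executemany("INSERT INTO blocks VALUES (?, ?, ?, ?, ?, ?)",
                 [(1, 0, 64, 0, 1, "grass"),    # exposed -> fetched
                  (1, 0, 63, 0, 0, "dirt")])    # buried  -> skipped

px, pz = 5, 5   # player position; distance check kept deliberately crude
to_batch = conn.execute("""
    SELECT b.x, b.y, b.z, b.kind
    FROM blocks b JOIN sectors s ON b.sector_id = s.id
    WHERE s.shown = 0 AND b.exposed = 1
      AND abs(s.sx * 16 - ?) < 160 AND abs(s.sz * 16 - ?) < 160
""", (px, pz)).fetchall()
print(to_batch)   # [(0, 64, 0, 'grass')]
```

Everything the renderer needs to add to the batch comes back in one fetch, instead of one lookup per block.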

BertrandBordage commented 11 years ago

Flask or any http server isn't the right solution, for two reasons:

I suggest we use something more low-level, like sockets. A request/response takes about 0.3 ms, and the client and server are not forced to wait for each other.
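A bare-bones illustration of the socket approach, with a throwaway JSON message format invented for this sketch (a real protocol would need framing for messages larger than one `recv`):

```python
# Minimal TCP request/response demo: the "world server" answers a single
# "which sector is this position in?" request from the rendering client.
import json
import socket
import threading

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("127.0.0.1", 0))           # port 0: let the OS pick a free port
srv.listen(1)
port = srv.getsockname()[1]

def handle_one():
    # Server side: read one JSON request, answer, close.
    conn, _ = srv.accept()
    request = json.loads(conn.recv(4096).decode())
    reply = {"sector": [request["x"] // 16, request["z"] // 16]}
    conn.sendall(json.dumps(reply).encode())
    conn.close()

threading.Thread(target=handle_one, daemon=True).start()

# Client side: one round trip over a plain TCP socket.
cli = socket.create_connection(("127.0.0.1", port))
cli.sendall(json.dumps({"x": 40, "z": -20}).encode())
answer = json.loads(cli.recv(4096).decode())
cli.close()
srv.close()
print(answer)   # {'sector': [2, -2]}
```

Unlike an HTTP stack, there is no per-request header parsing or routing overhead, which is where the sub-millisecond round trip comes from.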

Jimx- commented 11 years ago

I prefer sockets too (though Flask is more convenient).

r58Playz commented 5 years ago

> (on a classic HDD, not on a SSD)

The new saving system takes about half a second using a SSD.