licoffe / POE-Stash-indexer-NG

High performance Path of Exile stash indexer
MIT License
43 stars 12 forks source link

Disclaimer

This indexer is for you if you have:

Introduction

High performance multithreaded indexer for the Path of Exile Stash API coded in C++11.

JSON files are pre-downloaded ahead through a first thread. In a second thread, downloaded files are parsed and data inserted into the database. Insertion leverages the LOAD DATA INFILE MySQL statement for increased insertion speed.

How it works

Two threads are used:

As files are downloaded through the first thread, their name is added to a queue. This queue is read by the second thread which progressively parses the corresponding files.

The second thread iterates over stashes, inserting items one by one. Item properties (properties, affixes, requirements and sockets) are however stored for a delayed insertion. When a certain amount of JSON files have been parsed, item properties are written to text files and inserted through 4 different threads (properties, affixes, requirements and sockets).

Libraries used

JSON files are downloaded with gzip compression using libCurl.

The RapidJSON library is used to parse JSON files.

Storage is done through MySQL (see schema.sql) and the interface with MySQL is done through the Connector/C++ library, the official MySQL C++ communication API.

Performances

Hardware

Hardware Intel Xeon W3520 (4 cores / 8 threads at 2.66 GHz+) with 16 Go RAM and SSD storage.

MySQL settings

innodb_buffer_pool_size = 12G
innodb_log_file_size = 2047M
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
innodb_file_per_table=1
innodb_stats_on_metadata = OFF
innodb_buffer_pool_instances = 8
query_cache_type = 0
innodb_lock_wait_timeout = 50
innodb_log_buffer_size = 8M

Results

The indexer memory consumption is very small, about 300 MB (2%).

The plot below shows the evolution of item insertion speed across time.

alt

Installing required libraries

The first thing you need is to have a running MySQL installation correctly tuned for your hardware.

To run the indexer, you will need to install the required dependencies:

MacOS

On MacOS, download Mysql-c++-connector. Extract the archive and copy the content of the lib folder to /usr/local/lib, and the content of the include folder to /usr/local/include. Next, download RapidJSON and copy the content of include to /usr/local/include.

Ubuntu

On Ubuntu, install the following packages using apt-get install libcurl4-gnutls-dev libcurlpp-dev libmysqlcppconn-dev. You may have to edit your MySQL configuration file (my.cnf) to authorize the LOAD DATA INFILE statement. Next, download RapidJSON and copy the content of include to /usr/local/include.

Compilation and setup

When all dependencies have been installed, open up a terminal and change directory to the indexer one and type make to generate the binary. If all dependencies have been found, the code should now be compiled. Import the schema.sql DB scheme in your MySQL installation. The last step is to edit the config.cfg file to put the right credentials in.

When compiled, just run the indexer using ./indexer. The program also allows to start from a specific change_id using ./indexer change_id.