Class | Description | Functions | Attributes |
---|---|---|---|
wows_api | ... | ... | ... |
abstract_db | ... | ... | ... |
prediction_model | ... | ... | ... |
web_connector | ... | ... | ... |
This Python script requests statistical data from the World of Warships API and stores it in a local MySQL database. The API requires an application_id to authenticate with the API server; register the application_id on Wargaming.net and store it in a local configuration file named "config.json". The IP address of the machine running this script (obtained through the ipgetter package) must also be added to your application in the Wargaming.net developer room.
Each type of API request has its own limitations and its own JSON format (refer to the Wargaming.net API reference); check the ones that apply to your needs.
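As a rough illustration of how the application_id from "config.json" ends up in a request, the sketch below builds an account/info URL for a batch of players. The helper names `load_config` and `build_account_info_url` are hypothetical and not part of this project; the `account_id` query parameter and the comma-separated id list are assumptions based on the Wargaming.net API reference and should be checked there.

```python
import json
from urllib.parse import urlencode


def load_config(path="config.json"):
    """Read the local configuration file that holds the application_id."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def build_account_info_url(cfg, account_ids):
    """Return the account/info URL for a batch of account ids (hypothetical helper)."""
    api = cfg["wows_api"]
    query = urlencode({
        "application_id": api["application_id"],
        # Assumption: several ids can be sent as one comma-separated value.
        "account_id": ",".join(str(i) for i in account_ids),
    })
    return api["account_url"] + "?" + query
```

With a loaded configuration, `build_account_info_url(load_config(), [1008331251])` yields a URL that can be passed to any HTTP client.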
Since the API returns JSON data, MongoDB (which stores documents as BSON) is a natural choice for storage. The newest and the historical stats of a player differ slightly in structure, so to keep the data consistent we store them separately.
The script connects to a relational database (MySQL, AWS RDS, etc.) to store the extracted data. The players' id list is stored in a separate table, wows_idlist, which is essential for efficient API requests because the complete id list is not officially provided and account numbers are sparsely distributed over a large range (WOWS account number range). Some statistics, such as the number of battles, are stored in wows_stats, and you can customize your own database as well.
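Because the valid ids are sparse, one way to discover them is to walk the account range in fixed-size batches and keep only the ids the API recognizes. The sketch below is an assumption about how such a scan could look, not the script's actual code: `id_batches` and `existing_ids` are hypothetical names, the batch size corresponds to ID_STEP in "config.json", and the response shape (each requested id mapped to a record or to null) is assumed.

```python
def id_batches(lo, hi, step):
    """Yield consecutive lists of candidate account ids in [lo, hi)."""
    for start in range(lo, hi, step):
        yield list(range(start, min(start + step, hi)))


def existing_ids(response_data):
    """Keep the ids whose record is not null (assumed response shape)."""
    return sorted(int(k) for k, v in response_data.items() if v is not None)
```

The ids returned by `existing_ids` are the ones worth writing to the id list table, so later runs can skip the empty parts of the range.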
The players' statistical data can then be retrieved through SQL and analyzed for your own purpose.
We replaced MySQL with MongoDB because of performance limitations.
{
"_id":1008331251,
"daily_stats":{
ObjectId('000000201701011008331251'),
...
},
"account_id": 1008331251,
"nickname": "zmlzeze",
"last_battle_time": 1500140223,
"leveling_tier": 15,
"created_at": 1435322987,
"leveling_points": 8612323,
"updated_at": 1500053592,
"private": null,
"hidden_profile": false,
"logout_at": 1500053581,
"karma": null,
"statistics": {
"distance": 117155,
"battles": 3143,
"pvp": {
...
}
},
"stats_updated_at": 1500140964
}
{
"_id":ObjectId('000000201701011008331251'),
"capture_points": 399,
"account_id": 1008331251,
"max_xp": 4913,
"wins": 1742,
"planes_killed": 5550,
"battles": 2882,
"damage_dealt": 213130514,
"battle_type": "pvp",
"date": "20170101",
"xp": 3923528,
"frags": 3612,
"survived_battles": 1356,
"dropped_capture_points": 3629
}
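The `_id` of the daily stats document above looks like the date and the account id concatenated and left-padded with zeros to 24 characters. Assuming that reading is right, a hypothetical helper such as `daily_stats_id` could build the key as follows; a 24-character string of digits is also valid hexadecimal, so it can be passed to `bson.ObjectId` when PyMongo is installed.

```python
def daily_stats_id(date, account_id):
    """Build the 24-character id string for one player's stats on one day.

    Assumption: the id is date (YYYYMMDD) + account id, zero-padded on the left.
    """
    key = f"{date}{account_id}"
    if not key.isdigit() or len(key) > 24:
        raise ValueError(f"cannot build a 24-digit id from {key!r}")
    return key.zfill(24)
```

Since the key is deterministic, a given player's record for a given day can be looked up or overwritten without a separate query on `account_id` and `date`.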
The database provides stats for both modeling and the web application, so performance is crucial. The NA server has about 1.6 million players, of whom about 30% have played at least 100 battles (we consider these valid players). Since each player's stats are updated daily, the volume of historical stats keeps growing over time. By our estimate, the newest stats for 1.6 million players take up to 2 GB of memory, while one year of historical stats for valid players takes about 50 GB on disk.
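The figures above imply rough per-document sizes, which can be handy when planning storage. The back-of-the-envelope calculation below is only a sketch: it assumes decimal gigabytes, a 365-day year, and one historical record per valid player per day, none of which is stated in the estimate itself.

```python
GB = 10**9  # assumption: decimal gigabytes


def estimate_sizes(total_players=1_600_000, valid_percent=30,
                   newest_total_bytes=2 * GB, history_total_bytes=50 * GB,
                   days=365):
    """Derive per-document sizes from the totals quoted in the text."""
    valid_players = total_players * valid_percent // 100
    daily_records = valid_players * days  # assumes one record per player per day
    return {
        "valid_players": valid_players,
        "daily_records_per_year": daily_records,
        "bytes_per_newest_doc": newest_total_bytes / total_players,
        "bytes_per_daily_record": history_total_bytes / daily_records,
    }
```

Under these assumptions there are 480,000 valid players, a newest-stats document averages about 1,250 bytes, and a daily record averages roughly 285 bytes.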
When retrieving players' data from the database, we use a pandas Panel to construct a 3D DataFrame indexed by player ID and day:
ID\day | 1 | 2 | 3 | ... |
---|---|---|---|---|
10001 | [t,w,l,d] | [t,w,l,d] | [t,w,l,d] | ... |
10002 | [t,w,l,d] | [t,w,l,d] | [t,w,l,d] | ... |
10003 | [t,w,l,d] | [t,w,l,d] | [t,w,l,d] | ... |
... | ... | ... | ... | ... |
Here [t,w,l,d] is the vector of one day's stats: [battles,wins,losses,draws].
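The same player-by-day-by-stat layout can be built without Panel, which was deprecated in pandas 0.20 and is not available in current releases. The sketch below uses plain nested lists so it runs without dependencies; `build_tensor` is a hypothetical helper, the `losses` and `draws` fields are assumed to be present in each record, and filling missing days with zeros is an assumption rather than the project's documented behavior.

```python
def build_tensor(records, player_ids, days):
    """Return tensor[i][j] = [t, w, l, d] for player_ids[i] on days[j]."""
    index = {(r["account_id"], r["date"]): r for r in records}
    tensor = []
    for pid in player_ids:
        row = []
        for day in days:
            r = index.get((pid, day))
            if r is None:
                # Assumption: a missing record means no battles that day.
                row.append([0, 0, 0, 0])
            else:
                row.append([r["battles"], r["wins"], r["losses"], r["draws"]])
        tensor.append(row)
    return tensor
```

The result has shape (players, days, 4) and can be handed to `numpy.asarray` if an array is needed for modeling.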
We use an LSTM model without attention to predict players' performance from previous days' stats. The prediction is made within a given time window, and the objective is to minimize the distance between the ground-truth and predicted stats vectors.
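To make the training setup concrete, the sketch below shows one way to cut a player's daily series into (history, target) pairs and to measure the distance between a predicted and a true stats vector. It does not include the LSTM itself; the window length, the one-day-ahead target, and the squared Euclidean distance are all assumptions, and `make_windows` and `squared_distance` are hypothetical names.

```python
def make_windows(series, window):
    """Split one player's daily series into (history, next-day target) pairs."""
    return [
        (series[i:i + window], series[i + window])
        for i in range(len(series) - window)
    ]


def squared_distance(truth, predicted):
    """Squared Euclidean distance between two [t, w, l, d] vectors."""
    return sum((a - b) ** 2 for a, b in zip(truth, predicted))
```

A model trained on the history windows would be scored by summing `squared_distance` over all targets, so a lower total means predictions closer to the recorded stats.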
{
"wows_api": {
"application_id": "XXX",
"player_url": "https://api.worldofwarships.com/wows/account/list/",
"account_url": "https://api.worldofwarships.com/wows/account/info/",
"stats_by_date_url": "https://api.worldofwarships.com/wows/account/statsbydate/",
"DB_TYPE": "mongo",
"DATE_FORMAT": "%Y-%m-%d",
"NA_ACCOUNT_LIMIT_LO": 1000000000,
"NA_ACCOUNT_LIMIT_HI": 2000000000,
"ID_STEP": 100,
"SIZE_PER_WRITE": 10000,
"URL_REQ_DELAY": 0,
"URL_REQ_TIMEOUT": 45,
"URL_REQ_TRYNUM": 3
},
"mysql": {
"dbname": "XXX",
"usr": "XXX",
"pw": "XXX",
"hostname": "XX.XX.XX.XX",
"port": 123
},
"mongo": {
"dbname": "XXX",
"collection": "XXX",
"usr": "XXX",
"pw": "XXX",
"hostname": "XX.XX.XX.XX",
"port": 123
},
"AWS_RDS": {
"dbname": "XXX",
"usr": "XXX",
"pw": "XXX",
"hostname": "XX.XX.XX.XX",
"port": 123
}
}
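The URL_REQ_* settings in the configuration above suggest a retry loop around each API request. The sketch below is one possible reading, not the script's actual code: it assumes URL_REQ_TRYNUM is the maximum number of attempts, URL_REQ_TIMEOUT is the per-request timeout in seconds, and URL_REQ_DELAY is a pause in seconds between attempts. The names `http_get_json` and `fetch_with_retry` are hypothetical.

```python
import json
import time
from urllib.request import urlopen


def http_get_json(url, timeout):
    """Fetch a URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def fetch_with_retry(fetch, url, api_cfg, sleep=time.sleep):
    """Call fetch(url, timeout) up to URL_REQ_TRYNUM times."""
    last_error = None
    for _ in range(api_cfg["URL_REQ_TRYNUM"]):
        try:
            return fetch(url, api_cfg["URL_REQ_TIMEOUT"])
        except OSError as exc:  # URLError and socket timeouts derive from OSError
            last_error = exc
            sleep(api_cfg["URL_REQ_DELAY"])
    raise RuntimeError(f"request failed: {url}") from last_error
```

In use, `fetch_with_retry(http_get_json, url, cfg["wows_api"])` returns the decoded response or raises once every attempt has failed.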
We use the Flask framework to build the front-end web application on top of a Python back end.