Sopel97 / chess_pos_db

Database software for chess position statistics. Designed to provide high performance and handle billions of games.
MIT License

Format allowing elo range queries and average elo output #11

Closed. Sopel97 closed this issue 4 years ago

Sopel97 commented 4 years ago

Consider a new format that allows elo range queries.

Sopel97 commented 4 years ago

Revision 1. Use a constant elo granularity, possibly stored in a file if we later want to allow setting it when the database is created.

Entry:

On querying, the total elo is computed and divided by the total count to get an average. Total elo should always fit in a 64-bit int.

Add a field "average_elo" to the query to specify whether the average elo should be returned.
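
A minimal sketch of the averaging step, assuming a per-position accumulator with a running elo total and a game count. `EloAccumulator` and its members are hypothetical names for illustration only; this is not the entry layout mentioned above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical accumulator, for illustration only.
struct EloAccumulator {
    // Sum of elo over all games that reached the position.
    // Even 2^32 games at elo 4000 sum to about 1.7e13, far below 2^64.
    std::uint64_t total_elo = 0;
    // Number of games that reached the position.
    std::uint64_t count = 0;

    void add_game(std::uint16_t elo) {
        total_elo += elo;
        ++count;
    }

    // Partial results (e.g. from several files) combine by plain addition.
    void merge(const EloAccumulator& other) {
        total_elo += other.total_elo;
        count += other.count;
    }

    // Average elo, or `fallback` when no games were recorded.
    double average_or(double fallback) const {
        if (count == 0) return fallback;
        return static_cast<double>(total_elo) / static_cast<double>(count);
    }
};
```

Keeping the total and the count separately (rather than a stored average) means results from different ranges can be merged before dividing.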

Sopel97 commented 4 years ago

For now just a format that allows average elo.

Sopel97 commented 4 years ago

Instead of collecting elo and average elo, which is meaningless, collect the elo difference (white elo - black elo) and expose it as "average_elo_difference". In the future maybe also support the previous one.

Sopel97 commented 4 years ago

For games without elos specified, assume elo_difference == 0.
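
A small sketch of that rule. The `kNoElo` sentinel and the function name are hypothetical, and treating a game with only one rating present as difference 0 is an assumption; the comment above only covers games without elos.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sentinel for a missing rating tag.
constexpr std::int32_t kNoElo = -1;

// Elo difference from white's point of view (white elo - black elo).
// Assumption: if either rating is missing, the difference counts as 0.
inline std::int32_t elo_difference(std::int32_t white_elo, std::int32_t black_elo) {
    if (white_elo == kNoElo || black_elo == kNoElo) return 0;
    return white_elo - black_elo;
}
```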

Sopel97 commented 4 years ago

Consider dividing the elo difference by 10 (with proper rounding) and storing the total in 4 bytes. Assuming an average elo difference of ±400 for a position, it would have to be encountered about 50 million times to overflow.

This gives us 4 bytes of padding that we could use for something else.
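
A rough check of the overflow estimate, assuming a signed 32-bit total and rounding to nearest with ties away from zero (the rounding mode is an assumption; the comment only says to round properly). Function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Divide an elo difference by 10, rounding to nearest, ties away from zero.
inline std::int32_t scale_elo_diff(std::int32_t diff) {
    return diff >= 0 ? (diff + 5) / 10 : -((-diff + 5) / 10);
}

// How many games with the given average absolute elo difference a signed
// 32-bit total of scaled differences can absorb before overflowing.
inline std::int32_t games_until_overflow(std::int32_t avg_abs_diff) {
    const std::int32_t scaled = scale_elo_diff(avg_abs_diff);
    assert(scaled > 0);
    return std::numeric_limits<std::int32_t>::max() / scaled;
}
```

With a difference of 400 the scaled value is 40, so the limit is 2147483647 / 40 = 53687091 games, which matches the "about 50 million" figure.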

Sopel97 commented 4 years ago

If nothing better comes up:

This effectively limits the maximum number of games stored to 2^32, but that is a reasonable and high limit. It requires indirection when reading game data, like in db_alpha.

Sopel97 commented 4 years ago

Store the elo diff losslessly, using one additional byte taken from the hash. Store 5 bytes of elo diff and 3 bytes of the hash in one uint64, with the elo diff in the most significant bits. See struct D here: https://godbolt.org/z/xnnAAh

Edit: https://godbolt.org/z/SX5noP
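
For readers without access to the links, here is a minimal sketch of the packing idea: a signed 40-bit elo diff total in the upper 5 bytes and 24 bits of hash in the lower 3 bytes of one uint64. This is not the code from the linked snippets; `PackedEloDiffAndHash` and all constant names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kHashBits = 24;  // lower 3 bytes: part of the hash
constexpr std::uint64_t kHashMask = (std::uint64_t{1} << kHashBits) - 1;
constexpr std::uint64_t kEloSignBit = std::uint64_t{1} << 39;  // sign bit of the 40-bit field
constexpr std::int64_t kEloDiffMax = (std::int64_t{1} << 39) - 1;
constexpr std::int64_t kEloDiffMin = -(std::int64_t{1} << 39);

// Hypothetical packed field: bits 63..24 hold the signed elo diff total,
// bits 23..0 hold 24 bits of the position hash.
class PackedEloDiffAndHash {
public:
    PackedEloDiffAndHash(std::int64_t elo_diff, std::uint32_t hash_part)
        : m_packed((static_cast<std::uint64_t>(elo_diff) << kHashBits)
                   | (hash_part & kHashMask)) {
        assert(elo_diff >= kEloDiffMin && elo_diff <= kEloDiffMax);
    }

    // Recover the signed 40-bit value without shifting a negative number.
    std::int64_t elo_diff() const {
        const std::uint64_t raw = m_packed >> kHashBits;  // 40-bit two's complement
        return static_cast<std::int64_t>(raw ^ kEloSignBit)
               - static_cast<std::int64_t>(kEloSignBit);
    }

    std::uint32_t hash_part() const {
        return static_cast<std::uint32_t>(m_packed & kHashMask);
    }

    // Accumulate one game's elo difference, keeping the hash bits intact.
    void add_elo_diff(std::int64_t delta) {
        *this = PackedEloDiffAndHash(elo_diff() + delta, hash_part());
    }

private:
    std::uint64_t m_packed;
};
```

A 40-bit signed total holds roughly ±5.5e11, so even 2^32 games at a difference of 100 each would fit without any scaling.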