lichess-org / lila-openingexplorer

Opening explorer for lichess.org that can handle all the variants and trillions of unique positions
http://lichess.org/analysis#explorer
GNU Affero General Public License v3.0

Slow index times #205

Closed by Camsbury 1 year ago

Camsbury commented 1 year ago

I'm trying to index games locally, and everything is working properly. However, the throughput is only about 800-1400 KiB/s, which means it will take a very long time before I can see all of the games in the database. Is there a setting I can tune here? I tried changing the batch size, but it didn't really help. Thanks!

Camsbury commented 1 year ago

For reference, I'm just using the games from https://database.lichess.org/.

niklasf commented 1 year ago

Yeah, sorry, that's the expected order of magnitude. There just wasn't much motivation to optimize for indexing speed, as long as it can comfortably index games faster than they are played. It sounds a bit better when you consider that the decompressed stream is about 7x as much data, but it's still not great.
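To put those numbers in perspective, here is a rough back-of-the-envelope sketch in POSIX shell. It only uses the figures quoted in this thread (800-1400 KiB/s, and the roughly 7x expansion on decompression). The function names and the 30 GiB file size are hypothetical placeholders, not something taken from the project; substitute the size of the dump you actually downloaded.

```shell
# Rough indexing-time estimate from the figures quoted in this thread.
# Both arguments are placeholders: use the size of the compressed dump
# you downloaded and the throughput you observe locally.

# estimate_hours <compressed_size_GiB> <throughput_KiB_per_s>
# Prints whole hours (integer division, rounded down).
estimate_hours() {
  size_gib=$1
  rate_kib_s=$2
  size_kib=$(( size_gib * 1024 * 1024 ))
  echo $(( size_kib / rate_kib_s / 3600 ))
}

# decompressed_rate <compressed_KiB_per_s>
# Applies the roughly 7x expansion factor mentioned above.
decompressed_rate() {
  echo $(( $1 * 7 ))
}

estimate_hours 30 800     # hypothetical 30 GiB file, low end of the range
estimate_hours 30 1400    # same file, high end of the range
decompressed_rate 800     # effective KiB/s of decompressed PGN
```

Because the arithmetic is integer-only, treat the output as an order-of-magnitude guide rather than a precise prediction.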

Still, some flags on the server could help a bit:

EXPLORER_LOG=lila_openingexplorer=info cargo run --release -- ...

1. Add --db-compaction-readahead if not using SSDs.
2. Add --db-cache <bytes> with a substantial portion of RAM, on the order of 50% of all available RAM.
3. Set --db-rate-limit <bytes per second> to a very high value, if you do not care about leaving bandwidth for querying the database while indexing.
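The three suggestions above could be assembled along these lines. This is only a sketch under stated assumptions: the helper names half_ram_bytes and tuning_flags are made up for illustration, reading /proc/meminfo is Linux-specific, and the 1 GiB/s rate limit is just one arbitrary reading of "a very high value". The sketch prints the flags rather than running the server, since the rest of the command line is elided above.

```shell
# half_ram_bytes [meminfo_path]
# Prints 50% of MemTotal in bytes. Linux-only assumption: /proc/meminfo
# reports MemTotal in kB (units of 1024 bytes).
half_ram_bytes() {
  meminfo=${1:-/proc/meminfo}
  while read -r key value _unit; do
    if [ "$key" = "MemTotal:" ]; then
      echo $(( value * 1024 / 2 ))
      return 0
    fi
  done < "$meminfo"
  return 1
}

# tuning_flags <cache_bytes> <rate_limit_bytes_per_s> [hdd]
# Prints the flags from the list above; pass "hdd" as the third
# argument when the database is not on an SSD.
tuning_flags() {
  flags="--db-cache $1 --db-rate-limit $2"
  if [ "${3:-}" = "hdd" ]; then
    flags="$flags --db-compaction-readahead"
  fi
  echo "$flags"
}

# Example: print the flags for this machine.
# 1073741824 bytes/s (1 GiB/s) is an arbitrary "very high" rate limit.
if [ -r /proc/meminfo ]; then
  tuning_flags "$(half_ram_bytes)" 1073741824 hdd
fi
```

The printed flags can then be appended to the cargo run invocation shown above, alongside whatever arguments you already pass.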

Restarting the server with new options will interrupt indexing. When you start the indexer again, there's no danger of duplicates, and skipping games that were already indexed is faster than indexing from scratch.

Camsbury commented 1 year ago

Thanks very much!