valeriansaliou / sonic

🦔 Fast, lightweight & schema-less search backend. An alternative to Elasticsearch that runs on a few MBs of RAM.
https://crates.io/crates/sonic-server
Mozilla Public License 2.0
20.13k stars 578 forks

Terminal crashes when pushing to sonic #289

Closed: philiure closed this issue 2 years ago

philiure commented 2 years ago

I am pushing a dataset of 12M documents to Sonic, but the terminal crashes with memory issues about 2% of the way through the push. I am running the Rust server in release mode and don't understand why the terminal keeps crashing. Any advice on solving these memory issues is welcome. The dictionary object containing the documents is 0.67 GB in size.

Activity Monitor shows the terminal using 70+ GB of memory when it crashes.

ghost commented 2 years ago

How are you pushing the dataset? The memory is not necessarily being used by Sonic; your ingest program can use memory too. I didn't have any problems pushing 25M documents (~2 GB) using node-sonic-channel.
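One way to keep the ingest program's own memory flat is to feed documents to the client lazily, in small batches, rather than building every request up front. The following is a minimal sketch under assumptions: the `(object id, text)` tuple shape and the `in_batches` helper are hypothetical, not part of Sonic or any Sonic client.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

# Hypothetical document shape: (object id, text to index).
Doc = Tuple[str, str]


def in_batches(docs: Iterable[Doc], size: int) -> Iterator[List[Doc]]:
    """Yield lists of at most `size` documents without materialising the whole input."""
    if size < 1:
        raise ValueError("size must be >= 1")
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch
```

Each batch can then be pushed and dropped before the next one is read, so only `size` documents are held by the loop at any time.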

philiure commented 2 years ago

Found the problem! It was the server's log output in the terminal that crashed it. Running the server with sonic > /dev/null fixed it and improved my push speed by 50%!
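If the server is started from a script rather than a shell, the same redirection can be sketched with Python's standard `subprocess` module. The `start_quiet` helper is a hypothetical name, and the commented usage assumes a `sonic` binary on `PATH` that finds its configuration the same way it does in your setup.

```python
import subprocess
from typing import Sequence


def start_quiet(cmd: Sequence[str]) -> subprocess.Popen:
    """Start `cmd` with stdout discarded, like `cmd > /dev/null`.

    stderr is left attached, matching the shell redirection above.
    """
    return subprocess.Popen(cmd, stdout=subprocess.DEVNULL)


# Usage (assumption: `sonic` is on PATH):
#   server = start_quiet(["sonic"])
#   server.wait()
```

Discarding the output only stops the terminal from buffering it. Lowering the server's log level would avoid producing the output in the first place.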

philiure commented 2 years ago

The push seems to have slowed down significantly: at the start it was 2.5 ms/doc, and now it seems to be 50 ms/doc. How long did it take you to push 25M documents, @Sly-Little-Fox?
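To put those two rates in perspective, here is a small back-of-the-envelope calculation. It assumes the per-document latency stays constant for the whole 12M-document push, which the slowdown itself suggests is not the case, so treat the results as rough bounds. `hours_to_push` is an illustrative helper name.

```python
def hours_to_push(doc_count: int, ms_per_doc: float) -> float:
    """Total push time in hours, assuming a constant per-document latency."""
    total_ms = doc_count * ms_per_doc
    return total_ms / 1000 / 3600


if __name__ == "__main__":
    for rate in (2.5, 50.0):
        print(f"{rate} ms/doc -> {hours_to_push(12_000_000, rate):.2f} h")
```

At 2.5 ms/doc the full push would take about 8.3 hours. At 50 ms/doc it would take about 167 hours, roughly a week.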

ghost commented 2 years ago

> The push seems to have slowed down significantly: at the start it was 2.5 ms/doc, and now it seems to be 50 ms/doc. How long did it take you to push 25M documents, @Sly-Little-Fox?

~4 hours with the default config (with log level "error" though; "debug" can actually slow things down). I used tmpfs for storage and one core (Ampere A1, Oracle Cloud). I have also observed that using many threads for ingestion sometimes makes Sonic slow down significantly, even though it doesn't use even half of my threads (8 threads, 4 cores). I don't know why it gets stuck (it's not I/O, since storage is on tmpfs).

valeriansaliou commented 2 years ago

Note that write operations are not lock-free. If something becomes a bottleneck, e.g. SSD I/O or the CPU core running the RocksDB threads (try increasing parallelism?), things will start slowing down, because Sonic Channel and the other threads rely on these main DB threads during write operations.
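Since writes contend on a lock, one client-side option is to stop fanning pushes out over many threads and funnel them through a single writer fed by a bounded queue. This is only a sketch under assumptions: `push` is a hypothetical callable wrapping whatever Sonic client you use, `ingest_serially` is an illustrative name, and whether this actually helps throughput is untested here.

```python
import queue
import threading
from typing import Callable, Iterable, List, Tuple

Doc = Tuple[str, str]  # hypothetical shape: (object id, text)
_STOP = object()       # sentinel telling the writer thread to exit


def ingest_serially(
    docs: Iterable[Doc],
    push: Callable[[str, str], None],  # hypothetical client wrapper
    max_pending: int = 1000,
) -> int:
    """Push documents from one writer thread; return how many were pushed.

    The bounded queue makes the producer block instead of piling up memory.
    """
    pending: queue.Queue = queue.Queue(maxsize=max_pending)
    errors: List[Exception] = []
    pushed = 0

    def writer() -> None:
        nonlocal pushed
        while True:
            item = pending.get()
            if item is _STOP:
                return
            if errors:
                continue  # keep draining so the producer never blocks forever
            try:
                push(*item)
                pushed += 1
            except Exception as exc:
                errors.append(exc)

    thread = threading.Thread(target=writer)
    thread.start()
    try:
        for doc in docs:
            if errors:
                break  # stop feeding once a push has failed
            pending.put(doc)
    finally:
        pending.put(_STOP)
        thread.join()
    if errors:
        raise errors[0]
    return pushed
```

The first failed push is re-raised in the caller after the writer thread has exited, so a broken connection does not go unnoticed.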

philiure commented 2 years ago

Alright, I'll see if I can improve it. Thanks a lot!