ordinals / ord

👁‍🗨 Rare and exotic sats
https://ordinals.com
Creative Commons Zero v1.0 Universal

What are the best practices for ord as a service? #3967

Open ArthurQiuys opened 1 month ago

ArthurQiuys commented 1 month ago

In case of errors, database files often get corrupted and need to be re-indexed.

raphjaph commented 1 month ago

We use systemd; you can have a look at our service files in the deploy directory. Just make sure to shut ord down properly before attempting to move or rename the index.redb file.

emilcondrea commented 1 month ago

I can confirm that it crashes and that recovery takes a very long time. The weird thing is that even if it's at the tip, doing nothing but querying the bitcoin node, it corrupts the db if it crashes.

I am wondering if it's related to how the index is opened even when there is nothing to do.

raphjaph commented 1 month ago

How does it crash? How are you shutting it down? Or does it just randomly crash?

emilcondrea commented 1 month ago

It's not the shutdown scenario. If I recall correctly, it crashed because of OOM, which definitely means memory settings need to be tweaked on my end, but the point I wanted to raise is that the indexer should not corrupt the database if it's just idling and querying the bitcoin node.

ArthurQiuys commented 1 month ago

What is the appropriate amount of memory to set? I want to know how many concurrent queries an ord instance with 32 GB of memory can handle.

raphjaph commented 1 month ago

> What is the appropriate amount of memory to set? I want to know how many concurrent queries an ord instance with 32 GB of memory can handle.

Only the initial indexing takes a lot of memory. Once the index is built, it can handle quite a lot of requests. We have a server that handles 5 million requests per day and normally sits at less than 5% CPU usage and about 70% RAM usage, with 128 GB of RAM.

raphjaph commented 1 month ago

> It's not the shutdown scenario. If I recall correctly, it crashed because of OOM, which definitely means memory settings need to be tweaked on my end, but the point I wanted to raise is that the indexer should not corrupt the database if it's just idling and querying the bitcoin node.

To build the full index with --index-sats you need at least 64 GB of memory. Once the initial indexing is done, corruption is very unlikely, since it can only happen while flushing the cache to disk, and in idle mode (following the chain tip) that takes less than a second.

ArthurQiuys commented 1 month ago

> > What is the appropriate amount of memory to set? I want to know how many concurrent queries an ord instance with 32 GB of memory can handle.
>
> Only the initial indexing takes a lot of memory. Once the index is built, it can handle quite a lot of requests. We have a server that handles 5 million requests per day and normally sits at less than 5% CPU usage and about 70% RAM usage, with 128 GB of RAM.

Can systemd automatically restart ord after unexpected events such as an OOM kill? Currently, the db cannot be recovered after the unexpected exits we have encountered and can only be re-indexed.

victorkirov commented 1 month ago

Regarding @emilcondrea's point about the index being corrupted: this has happened to us a few times. The issue is that the index updater requires a write transaction to run the update function:

pub(crate) fn update_index(&mut self, mut wtx: WriteTransaction) -> Result

This is required even if the Ord index is already at tip. Since the updater runs quite often, if there is a critical crash of any kind, the index will be corrupted due to that write transaction being open.

Instead, the updater could get the indexed tip height from the index and the current tip height from the btc node (plus the hashes, to check for a reorg) before it opens the write transaction. If the index is already at the tip and no reorg has occurred, the loop can simply continue without ever opening the write transaction and risking index corruption.
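
A minimal sketch of that check, to make the idea concrete. It is not ord's actual code: `Tip`, `Action`, and `plan_update` are hypothetical names, and the sketch assumes the indexed tip and the node's hashes can be obtained through read-only calls before any write transaction exists.

```rust
/// Hypothetical chain tip: block height plus block hash (hex string).
#[derive(Clone, Debug, PartialEq, Eq)]
struct Tip {
    height: u32,
    hash: String,
}

/// What the updater loop should do on this iteration (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum Action {
    /// Index is at the tip and the hashes match: no write transaction needed.
    Idle,
    /// The node is ahead: open a write transaction and index `from..=to`.
    IndexBlocks { from: u32, to: u32 },
    /// The node disagrees about the block at the indexed height.
    HandleReorg { height: u32 },
}

/// Decide what to do using only data that can be read without a write
/// transaction.
///
/// * `indexed`: last block committed to the index, `None` if the index is empty.
/// * `node_tip`: the node's current best block.
/// * `node_hash_at_indexed`: the node's block hash at `indexed.height`,
///   `None` if the node has no block at that height.
fn plan_update(
    indexed: Option<&Tip>,
    node_tip: &Tip,
    node_hash_at_indexed: Option<&str>,
) -> Action {
    // Empty index: everything up to the node's tip still has to be indexed.
    let Some(indexed) = indexed else {
        return Action::IndexBlocks { from: 0, to: node_tip.height };
    };

    // A different (or missing) block at our height means the chain changed
    // underneath the index.
    if node_hash_at_indexed != Some(indexed.hash.as_str()) {
        return Action::HandleReorg { height: indexed.height };
    }

    if node_tip.height > indexed.height {
        Action::IndexBlocks { from: indexed.height + 1, to: node_tip.height }
    } else {
        // Same block on both sides: skip this iteration entirely.
        Action::Idle
    }
}
```

With something like this, the updater loop would only open the write transaction when `plan_update` returns something other than `Action::Idle`, so a crash while idling at the tip would happen with no write transaction open.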