Closed Amxx closed 4 years ago
A good way to solve this in graph-node would be to delete all but the most recent REORG_THRESHOLD/ANCESTOR_COUNT blocks in the database when ingesting blocks. This would establish a fixed-size window of blocks that are kept around. I'd even say this should be the default mode.
Cleaning up old blocks is fairly easy. A CLI flag + env var + cleanup query that is executed frequently enough is all it should take.
I wrote a quick temporary cleanup script that can be used in cron jobs:
#!/bin/bash
# For each network, delete blocks that are more than 75 blocks behind its head
psql graph >/dev/null <<EOF
DELETE FROM ethereum_blocks
USING ethereum_networks AS network
WHERE network_name = network.name
AND number < network.head_block_number - 75;
EOF
# Reclaim disk space
psql graph <<EOF
VACUUM (FULL, VERBOSE, ANALYZE) ethereum_blocks;
EOF
# Confirm the number of remaining blocks per network
psql graph <<EOF
SELECT network_name, count(*)
FROM ethereum_blocks
GROUP BY network_name;
EOF
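To run the script on a schedule, a crontab entry along these lines could work. The path and interval here are illustrative, not part of graph-node; given the VACUUM FULL step, running it rarely (e.g. nightly) is safer than every few minutes:

```shell
# Hypothetical crontab entry: run the cleanup script nightly at 03:00.
# /usr/local/bin/cleanup-blocks.sh is an assumed install path for the script above.
0 3 * * * /usr/local/bin/cleanup-blocks.sh >> /var/log/graph-block-cleanup.log 2>&1
```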
The difficult bit is that we need to be careful not to delete blocks that a currently syncing subgraph still needs (i.e., in the window between a subgraph learning that there is a new block and fetching it from the DB).
One simple way to avoid that is to prune only blocks that come before the minimum of the current heads of all syncing subgraphs, or before the chain head if there are no syncing subgraphs. In either case, we'd define 'before' with some reasonable backoff (1 block behind the slowest syncing subgraph, or ANCESTOR_COUNT blocks behind the chain head if no subgraphs are syncing). If this is done continually (say, every 100 blocks as we ingest them), we should never have a large number of blocks in the database.
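The cutoff rule described above can be sketched as a small shell function. This is just an illustration of the arithmetic, not graph-node's implementation; ANCESTOR_COUNT and the block numbers are made-up values:

```shell
#!/bin/bash
# Sketch of the prune-cutoff rule; ANCESTOR_COUNT here is an
# illustrative value, not graph-node's actual configuration.
ANCESTOR_COUNT=50

# prune_cutoff CHAIN_HEAD [SYNCING_SUBGRAPH_HEAD...]
# Prints the block number below which blocks may be safely deleted.
prune_cutoff() {
  local chain_head=$1; shift
  if [ "$#" -eq 0 ]; then
    # No syncing subgraphs: keep ANCESTOR_COUNT blocks behind the chain head.
    echo $((chain_head - ANCESTOR_COUNT))
    return
  fi
  # Otherwise, back off one block behind the slowest syncing subgraph.
  local min=$1
  for h in "$@"; do
    if [ "$h" -lt "$min" ]; then min=$h; fi
  done
  echo $((min - 1))
}

prune_cutoff 1000000                 # -> 999950
prune_cutoff 1000000 999900 999950   # -> 999899
```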
One warning about VACUUM FULL: it should not be used on a production system, as it takes an exclusive lock and will block all access to the table. It also needs extra disk space while it runs, since it copies the live contents of the table to a new file; in the worst case you'll temporarily need 2x the disk space.
After just 10 days running my own graph-node, the VM ran out of disk. The cause is the ethereum_blocks table in Postgres, which has grown to over 8 GB from ~850,000 blocks. My graph-node is watching 5 blockchains, 4 of which have ~4 seconds between blocks. The fifth one is mainnet.
This is clearly not sustainable. We have dedicated (archive) nodes that have large disks, but we expected graph-node to be lightweight (which is essential for decentralization).
I understand these blocks are used for handling reorgs, so maybe keep a rolling window that drops an old block when a new one is received (a few hours' worth should be enough). The simplest way might be a cron job that runs every hour and drops all blocks older than ~6 hours.