enviodev / hyperindex

📖 Blazing-fast multi-chain indexer
https://envio.dev

Indexing error: out of memory #262

Open pavlovdog opened 3 days ago

pavlovdog commented 3 days ago

Describe the bug

The self-hosted indexer works properly for a short period (5-10 minutes), then the logs stop and memory consumption starts to grow. When it hits the limit, it fails with a JavaScript heap out of memory error. During that time, the /metrics and /healthz response times also seem to grow (discord message link).

2024-10-09 11:19:17.347 
2024-10-09 11:19:17.347 ----- Native stack trace -----
2024-10-09 11:19:17.347 FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2024-10-09 11:19:17.347 
2024-10-09 11:19:17.347 <--- JS stacktrace --->
2024-10-09 11:19:17.347 
2024-10-09 11:19:17.347 
2024-10-09 11:19:17.347 [17:0x7fb7b5b576b0]   369477 ms: Mark-Compact 4032.6 (4134.6) -> 4018.5 (4136.3) MB, 4261.67 / 0.00 ms  (average mu = 0.053, current mu = 0.016) allocation failure; scavenge might not succeed
2024-10-09 11:19:17.347 [17:0x7fb7b5b576b0]   365147 ms: Mark-Compact 4031.0 (4132.8) -> 4016.8 (4134.6) MB, 3964.42 / 0.00 ms  (average mu = 0.089, current mu = 0.018) allocation failure; scavenge might not succeed
2024-10-09 11:19:17.347 
2024-10-09 11:19:17.347 <--- Last few GCs --->
2024-10-09 11:19:17.347 

Here is the list of env options I'm using:

NODE_OPTIONS: "--max-old-space-size=4096"
TUI_OFF: "true"
LOG_LEVEL: "trace"
LOG_STRATEGY: "ecs-console"
ENVIO_API_TOKEN: "..."

ENVIO_PG_HOST: "..."
ENVIO_PG_PORT: "..."
ENVIO_PG_USER: "..."
ENVIO_POSTGRES_PASSWORD: "..."
ENVIO_PG_SSL_MODE: "..."
ENVIO_PG_DATABASE: "envio-3"

UNORDERED_MULTICHAIN_MODE: "true"

Additional context

This seems to happen only on self-hosted environments; I can't see the error in the hosted service logs. It is worth double-checking, though, since I have no access to the restart counter and there is no log search. Feel free to reach out (https://t.me/p0tekhin) if you have any questions.

JonoPrest commented 1 day ago

I've just taken a look at the indexer you linked and it is huge! 😅

I think the first thing to try, if you have a lot of dynamic contract registrations, is lowering ENVIO_MAX_PARTITION_CONCURRENCY. It defaults to 10, and this value * the number of chains could mean that many requests are in flight and resolving at the same time, so their results can't be garbage collected sequentially.
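To make the arithmetic concrete, here is a rough sketch of the "concurrency * number of chains" estimate. The helper `maxInFlightQueries` is hypothetical and not part of HyperIndex, and the chain count of 5 is made up for illustration; only the default of 10 comes from the comment above.

```typescript
// Hypothetical helper for illustration only; it is not part of HyperIndex.
// Estimates an upper bound on partition queries that may be in flight at once,
// assuming every chain independently runs up to `partitionConcurrency` queries.
function maxInFlightQueries(partitionConcurrency: number, chainCount: number): number {
  if (partitionConcurrency < 1 || Math.floor(partitionConcurrency) !== partitionConcurrency) {
    throw new RangeError("partitionConcurrency must be a positive integer");
  }
  if (chainCount < 1 || Math.floor(chainCount) !== chainCount) {
    throw new RangeError("chainCount must be a positive integer");
  }
  return partitionConcurrency * chainCount;
}

// Default concurrency of 10 (per the comment above); 5 chains is an assumed example.
const withDefault = maxInFlightQueries(10, 5);
// The same assumed 5 chains with the concurrency lowered to 2.
const lowered = maxInFlightQueries(2, 5);

console.log("default: " + withDefault + " in flight, lowered: " + lowered + " in flight");
```

Each in-flight query can hold a large response in memory until it is processed, so a lower bound here should reduce how much data is resident at once.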

Secondly, although IMO it shouldn't have much effect, you can set MAX_QUEUE_SIZE. It defaults to 100,000, but this is divided across chains, so your per-chain queues shouldn't be too big.

To be clear, MAX_QUEUE_SIZE is simply the queue size threshold at which the indexer stops making requests. There's currently no way to limit the number of events returned by a single HyperSync query (which can be many thousands at a time).
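A minimal sketch of why a threshold like this does not strictly cap the queue, under stated assumptions: the function names, the even (floored) split across chains, the chain count of 4, and the batch sizes are all hypothetical. This is not HyperIndex's actual implementation, only the behaviour described above modelled in a few lines.

```typescript
// Hypothetical model; names and the exact division rule are assumptions.
// Per-chain threshold: the global MAX_QUEUE_SIZE split evenly across chains.
function perChainThreshold(maxQueueSize: number, chainCount: number): number {
  return Math.floor(maxQueueSize / chainCount);
}

// The threshold only gates whether a NEW request is made.
function shouldFetch(queueLength: number, threshold: number): boolean {
  return queueLength < threshold;
}

// Simulates one chain's queue with no events being processed in the meantime.
// Each entry in `batchSizes` is the number of events one query returns.
function simulateQueue(threshold: number, batchSizes: number[]): number {
  let queueLength = 0;
  for (const batch of batchSizes) {
    if (!shouldFetch(queueLength, threshold)) {
      break; // threshold reached: stop making requests
    }
    queueLength += batch; // the whole response is enqueued, whatever its size
  }
  return queueLength;
}

// Default MAX_QUEUE_SIZE of 100,000 (from the comment above); 4 chains is assumed.
const threshold = perChainThreshold(100000, 4);
// Assumed responses of 10,000 events each.
const finalLength = simulateQueue(threshold, [10000, 10000, 10000, 10000]);

console.log("threshold: " + threshold + ", final queue length: " + finalLength);
```

In this model the third response is still requested because the queue is below the threshold at that point, so the queue ends up above it. With several queries in flight at once, the overshoot could be larger.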