graphprotocol / graph-node

Graph Node indexes data from blockchains such as Ethereum and serves it over GraphQL
https://thegraph.com
Apache License 2.0

[Bug] Graph Node consumes 6 GB of memory during local dev in docker compose #4644

Closed paymog closed 1 year ago

paymog commented 1 year ago

Bug report

I'm running graph node in a docker compose setup as part of a larger infrastructure I'm building. When I deploy subgraphs to the graph node, its memory usage keeps growing and the container eventually gets killed when it reaches ~6GB.

Here's my full config file:

[store]
[store.primary]
connection = "postgresql://graph-node:let-me-in@postgres:5432/client__dev"
pool_size = 10

[deployment]
[[deployment.rule]]
shard = "primary"
indexers = [ "default" ]

[chains]
ingestor = "default"
[chains.mainnet]
shard = "primary"
provider = [
    { label = "mainnet-default", transport = "rpc", url = "http://host.docker.internal:3000/mainnet/default", features = [] },
    { label = "mainnet-archive", transport = "rpc", url = "http://host.docker.internal:3000/mainnet/archive", features = ["archive"] },
    { label = "mainnet-archive-traces", transport = "rpc", url = "http://host.docker.internal:3000/mainnet/archiveAndTraces", features = ["traces", "archive"] },
    { label = "streamingfast-firehose-mainnet", details = { type = "firehose", url = "https://mainnet.eth.streamingfast.io", features = [ "filters" ], token = "<token>" }},
]
[chains.avalanche]
shard = "primary"
provider = [
    { label = "avalanche-default", transport = "rpc", url = "http://host.docker.internal:3000/avalanche/default", features = [] },
]

Here are the env vars I set when I start up the graph node container:

    environment:
      GRAPH_NODE_CONFIG: /etc/graphnode.toml
      ipfs: "ipfs:5001"
      GRAPH_LOG: debug
      node_role: "combined-node"

I'm running graphprotocol/graph-node:v0.31.0-rc.0 in my docker compose setup, and I see the same issue with older images such as 0.27.0.

One thing I've noticed: if I deploy ~5 eth blocks subgraphs against mainnet and let them catch up, every new block after that causes graph node to consume ~100-200MB more memory. This happens even though the graph node never receives any queries, so I'm fairly sure the query cache isn't what's eating up the memory.

Are there environment variables I can use to tune how much memory this graph node consumes?

Relevant log output

No response

IPFS hash

No response

Subgraph name or link to explorer

No response

Some information to help us out

OS information

macOS

azf20 commented 1 year ago

Are you using an M1 Mac? And which version of Docker Desktop are you running?

paymog commented 1 year ago

Yup, using an M1. Running Docker Desktop 4.12.

I decided to build and host a multiarch image of graph node myself, and when I switched to that image, memory usage dropped to 200MB. It was definitely the emulation that was causing the crazy memory usage.
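
For anyone hitting the same thing, here is a rough sketch of what the graph node service entry might look like once it points at a natively built image. The service name and the image name `my-registry.example/graph-node:multiarch` are placeholders (assumptions, not a published image), and the `platform` key is only there to make explicit that the arm64 variant should run on an Apple Silicon host. The environment block is the same one from the original report.

    services:
      graph-node:
        # Placeholder: a self-built multi-arch image, not an official tag
        image: my-registry.example/graph-node:multiarch
        # Run the native arm64 variant instead of emulating amd64
        platform: linux/arm64
        environment:
          GRAPH_NODE_CONFIG: /etc/graphnode.toml
          ipfs: "ipfs:5001"
          GRAPH_LOG: debug
          node_role: "combined-node"

If the image only ships an amd64 variant, Docker Desktop on an M1 falls back to emulation, which is the situation that produced the high memory usage described above.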