osmosis-labs / osmosis

The AMM Laboratory
https://app.osmosis.zone
Apache License 2.0

Node often lags behind #1237

Closed · ko0f closed this issue 2 years ago

ko0f commented 2 years ago

Running the latest v7.1.0 through Docker, the node is synced and connected to 60 peers. The P2P port is open in Docker and forwarded by the router.

I don't see any errors in the log, just a stream of state update messages, but quite often the node's block height lags behind, usually by 2-5 minutes.

My feeling is that block propagation is very slow (I can see this during full sync).

Any idea what could cause this, and possible ways to resolve it?
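For reference, a minimal sketch of the kind of Docker setup described here; the image name, tag, volume path, and start command are illustrative assumptions, not the exact ones used:

```sh
# Illustrative only: publish the Tendermint P2P port (26656) so peers can
# dial in, and keep the node home on the host SSD via a volume mount.
docker run -d --name osmosis \
  -p 26656:26656 \
  -v /host/osmosis-home:/root/.osmosisd \
  osmolabs/osmosis:7.1.0 \
  osmosisd start
```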

alexanderbez commented 2 years ago

Is there a reason you're running it through Docker and not natively on the host?

ValarDragon commented 2 years ago

Hrmm, your node being constantly 2-5 minutes behind is behavior I've never seen before.

If it's periodic, can you paste your snapshot and pruning settings? Also, what's your CPU utilization like, and how many cores does the instance have access to?
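A quick way to answer the utilization questions from the host (a sketch; mpstat comes from the sysstat package):

```sh
nproc                      # cores visible to the host (and the container)
docker stats --no-stream   # per-container CPU and memory usage right now
mpstat 5 3                 # whole-system CPU utilization, three 5s samples
```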

ko0f commented 2 years ago

> Is there a reason you're running it through Docker and not natively on the host?

There are other things running on that server, so Docker is good for decoupling.

> If it's periodic, can you paste your snapshot and pruning settings? Also, what's your CPU utilization like, and how many cores does the instance have access to?

It's a physical server with an i9-12900K (24 cores) and 128GB of DDR4-4000 RAM. The node is installed on a Corsair MP400 SSD using a PCIe 3.0 interface, so storage shouldn't be a performance bottleneck.

For some periods it's synced, and at other times it keeps lagging behind.

Snippet from app.toml:

```toml
# default: only the last 100,000 states (approximately 1 week worth of state) are kept; pruning at 100 block intervals
# nothing: all historic states will be saved, nothing will be deleted (i.e. archiving node)
# everything: all saved states will be deleted, storing only the current state; pruning at 10 block intervals.
# custom: allow pruning options to be manually specified through 'pruning-keep-recent', 'pruning-keep-every', and 'pruning-interval'
pruning = "default"

# These are applied if and only if the pruning strategy is custom.
# pruning-keep-recent = N means keep all of the last N states
pruning-keep-recent = "0"
# pruning-keep-every = N means keep every Nth state, in addition to keep-recent
pruning-keep-every = "0"
# pruning-interval = N means we delete old states from disk every Nth block.
pruning-interval = "0"

...

###############################################################################
###                        State Sync Configuration                         ###
###############################################################################

# State sync snapshots allow other nodes to rapidly join the network without replaying historical
# blocks, instead downloading and applying a snapshot of the application state at a given height.
[state-sync]

# snapshot-interval specifies the block interval at which local state sync snapshots are
# taken (0 to disable). Must be a multiple of pruning-keep-every.
snapshot-interval = 1500

# snapshot-keep-recent specifies the number of recent snapshots to keep and serve (0 to keep all).
snapshot-keep-recent = 2
```
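One way to double-check which of these settings the running node actually picked up (a sketch assuming the default `~/.osmosisd` home; inside Docker, prefix with `docker exec`):

```sh
grep -E '^(pruning|snapshot-interval|snapshot-keep-recent)' \
  ~/.osmosisd/config/app.toml
```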

And from config.toml:

```toml
#######################################################
###         State Sync Configuration Options        ###
#######################################################
[statesync]
# State sync rapidly bootstraps a new node by discovering, fetching, and restoring a state machine
# snapshot from peers instead of fetching and replaying historical blocks. Requires some peers in
# the network to take and serve state machine snapshots. State sync is not attempted if the node
# has any local state (LastBlockHeight > 0). The node will have a truncated block history,
# starting from the height of the snapshot.
enable = false

# RPC servers (comma-separated) for light client verification of the synced state machine and
# retrieval of state data for node bootstrapping. Also needs a trusted height and corresponding
# header hash obtained from a trusted source, and a period during which validators can be trusted.
#
# For Cosmos SDK-based chains, trust_period should usually be about 2/3 of the unbonding time (~2
# weeks) during which they can be financially punished (slashed) for misbehavior.
rpc_servers = ""
trust_height = 0
trust_hash = ""
trust_period = "112h0m0s"

# Time to spend discovering snapshots before initiating a restore.
discovery_time = "15s"

# Temporary directory for state sync snapshot chunks, defaults to the OS tempdir (typically /tmp).
# Will create a new, randomly named directory within, and remove it when done.
temp_dir = ""

# The timeout duration before re-requesting a chunk, possibly from a different
# peer (default: 1 minute).
chunk_request_timeout = "10s"

# The number of concurrent chunk fetchers to run (default: 1).
chunk_fetchers = "4"
```
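A quick way to quantify the lag from the node's own RPC (assumes the default RPC port 26657 is reachable and `jq` is installed):

```sh
curl -s localhost:26657/status \
  | jq '.result.sync_info | {latest_block_height, latest_block_time, catching_up}'
# Comparing latest_block_time against the current time gives the lag directly.
```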
ko0f commented 2 years ago

What can affect the sync speed in Osmosis config?

ko0f commented 2 years ago

Right now, for instance, it's connected to 56 peers but is 43 minutes behind, syncing at a rate of about 1 block per 6 seconds (roughly the chain's own block time, so the gap barely closes).

p0mvn commented 2 years ago

Your snapshot and pruning settings look good.

> There are other things running on that server, so Docker is good for decoupling.

What are the other things running on that server? Have you tried using the same Docker image on a different host that doesn't have other things running beside it? I'm wondering if there is some kind of interference or starvation. Also, how is the RAM usage looking?

It would also be helpful to know how this node is utilized. Are there any special query patterns or query scripts running beside it? Is this a relayer?
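A few host-side commands that could help answer these questions (a sketch; the container name `osmosis` is an assumption):

```sh
free -h                            # overall RAM and swap usage
docker stats --no-stream osmosis   # the container's share of CPU and memory
pidstat -r 5 3                     # per-process memory activity (sysstat)
```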

faddat commented 2 years ago

@ko0f I just want to mention that your configuration and NVMe disk all check out.

I do not know why this is happening.

faddat commented 2 years ago

> What can affect the sync speed in Osmosis config?

Would you mind walking us through exactly each step you're using to get this set up, in painful detail?

There's rather a lot that can affect sync speed, but everything you've mentioned sounds good to me.

Is there any chance you're running an exotic filesystem like btrfs? BTRFS is much slower, but I haven't seen this problem before.

Ah, one place I've seen it: if you are making lots of queries against the node, that can absolutely cause it to fall behind the tip. Is that perhaps the pattern here?
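The filesystem question is quick to check (a sketch; adjust the path to wherever the node's data volume actually lives):

```sh
df -T /host/osmosis-home   # filesystem type backing the data directory
lsblk -f                   # filesystems per block device
```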

ko0f commented 2 years ago

> What are the other things running on that server? Have you tried using the same Docker image on a different host that doesn't have other things running beside it? I'm wondering if there is some kind of interference or starvation. Also, how is the RAM usage looking?
>
> It would also be helpful to know how this node is utilized. Are there any special query patterns or query scripts running beside it? Is this a relayer?

So you don't see an issue with the configuration, nor can you think of a reason why it syncs slowly; in your opinion it's something else running on the machine that's causing this.

I doubt that: CPU utilization is low, there's plenty of spare RAM, and the disk isn't stressed. Next time it happens I'll try shutting off the other stuff and see if it helps.

ko0f commented 2 years ago

> Is there any chance you're running an exotic filesystem like btrfs?

Running on Ubuntu with an ext4 filesystem.

> If you are making lots of queries against the node, that can absolutely cause it to fall behind the tip. Is that perhaps the pattern here?

Nope, not querying much.

> Would you mind walking us through exactly each step you're using to get this set up, in painful detail?

If you don't mind, I'll keep this as a last resort :)

ValarDragon commented 2 years ago

Any chance you have Prometheus running for the node, so we could see things like how long block processing is taking? cc @p0mvn @bro_n_bro, do we have docs for how to do this?

The way I'd progress with debugging personally is:

p0mvn commented 2 years ago

Here's our osmosis-monitor, which has instructions on how to set up Prometheus and Grafana with our custom dashboard. Please let us know if you run into issues setting it up, @ko0f.
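For reference, Tendermint's own metrics (including block-processing timings) are toggled in the `[instrumentation]` section of config.toml; a sketch of enabling them and spot-checking the endpoint, assuming the default `:26660` listen address:

```sh
# Flip prometheus = false to true in config.toml, then restart the node.
sed -i 's/^prometheus = false/prometheus = true/' ~/.osmosisd/config/config.toml

# After the restart, block-interval metrics should be scrapeable:
curl -s localhost:26660/metrics | grep block_interval
```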

ko0f commented 2 years ago

It seems that all blocks are processed slowly.

The lag is currently 3 minutes (it was 5 minutes behind 2 hours ago), so it's closing, but only slowly due to the low blocks/sec rate.

ko0f commented 2 years ago

Now at an ongoing 7 blocks/s, with a 1-hour lag.

Any ideas what to check?

ValarDragon commented 2 years ago

So the main bottleneck of Cosmos chains is disk IO ops. Your system has 30% of memory in swap, which is pretty concerning. Is there a way for us to see what the RAM usage over time has been, or whether reads are going to disk? This is pretty surprising, as v7.1.0 should have mitigated the RAM leak for the most part. (An upcoming release that's blocked on PR review should mitigate it further.)
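A sketch of how to watch that over time from the host (vmstat's `si`/`so` columns show swap-in/out activity; iostat is from the sysstat package):

```sh
vmstat 5      # watch si/so (swap traffic) and b (processes blocked on IO)
iostat -x 5   # per-device utilization and read/write latency
```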

ko0f commented 2 years ago

@ValarDragon I don't think the swap memory is being used by Osmosis; it's pretty much static, and there's plenty of free RAM (44GB free).

p0mvn commented 2 years ago

Hi @ko0f, I'm going to close this issue since it has been stale for several months now. Please let us know if you're still experiencing any problems.