Requesting map/data-var/estimate-fee timeout

bestmike007 commented 3 months ago

Describe the bug

We're seeing timeouts when requesting map/data-var/estimate-fee after upgrading from 2.5.0.0.3 to 2.5.0.0.4.

Steps To Reproduce

Run a stacks-node
Run this script: https://gist.github.com/bestmike007/d77b582bea2fda1d4b2af272b8e42fba

On a 2.5.0.0.4 node, it fails with a timeout after a few minutes, while on 2.5.0.0.3 it runs smoothly and the CPU usage is more stable.
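For readers without access to the gist, the kind of measurement described above can be sketched roughly as follows. This is not the gist's code: the helper names `http_get` and `seconds_until_first_timeout` are hypothetical, and `/v2/info` on `localhost:20443` is only a stand-in for the map, data-var, and fee-estimation requests the real script makes.

```python
import socket
import time
import urllib.error
import urllib.request
from typing import Callable, Optional


def http_get(url: str, timeout_s: float) -> bytes:
    """Fetch a URL and surface every kind of timeout as TimeoutError."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.read()
    except urllib.error.URLError as exc:
        # Connect-phase timeouts arrive wrapped in URLError; unwrap them.
        if isinstance(exc.reason, (TimeoutError, socket.timeout)):
            raise TimeoutError(str(exc.reason)) from exc
        raise


def seconds_until_first_timeout(
    request: Callable[[], object],
    max_duration_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[float]:
    """Call `request` in a loop and return the elapsed seconds at the first
    timeout, or None if no timeout happens within `max_duration_s`."""
    start = clock()
    while clock() - start < max_duration_s:
        try:
            request()
        except (TimeoutError, socket.timeout):
            return clock() - start
    return None


if __name__ == "__main__":
    # Assumed target: a local node's RPC port; swap in the endpoints under test.
    url = "http://localhost:20443/v2/info"
    elapsed = seconds_until_first_timeout(lambda: http_get(url, 10.0), 600.0)
    print("no timeout" if elapsed is None else f"timeout after {elapsed:.0f}s")
```

Running the same loop against a 2.5.0.0.3 node and a 2.5.0.0.4 node gives a comparable time-to-first-timeout for each version.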

Expected behavior

It should be as stable as it was on 2.5.0.0.3.

Environment (please complete the following information):

OS: Debian + Docker
Docker Image: blockstack/stacks-core:2.5.0.0.4

Additional context

Not seeing any error logs.

jcnelson commented 3 months ago

Hey! We tried reproducing this today and were unable to get a 2.5.0.0.4 node to run any slower. It's probably because our deployments are different than yours. Can you tell us more about how this node was set up to run when you tested it? For example, did it have a public IP? Was it running in a particular cloud provider? Stuff like that.

bestmike007 commented 3 months ago

I was running the test on a DigitalOcean droplet with 8vCPU Premium Intel, 16G memory, and 480G local nvme SSD. It does have a public IP with ports 20443 and 20444 open (but there were no other request logs when the test was running).

It has a stacks node api version 7.11.1 running, configured as the event observer.

Both stacks node and api are running with docker.

I'm observing timeouts after upgrading both the stacks-node and the api (previously 7.11.0-beta.1) on the same droplet. This is the first node I tried to upgrade and other nodes work fine.

Let me do this on a fresh new node with a recent snapshot of yours without api sidecar, to see if it is still reproducible.

bestmike007 commented 3 months ago

Here's what I did:

Spin up a new cloud vm: Debian 12 x64, 4vCPU Intel, 16GB memory, and 320GB nvme SSD
Restore from the snapshot: curl https://archive.hiro.so/mainnet/stacks-blockchain/mainnet-stacks-blockchain-2.5.0.0.4-latest.tar.gz | tar -zxv
Start stacks node with docker-compose and wait for it to catch up
Run the script: https://gist.github.com/bestmike007/d77b582bea2fda1d4b2af272b8e42fba

services:
  stacks-core:
    restart: always
    image: blockstack/stacks-core:2.5.0.0.4
    container_name: stacks_node
    command: stacks-node start --config /srv/Stacks.toml
    network_mode: host
    environment:
      NOP_BLOCKSTACK_DEBUG: 0
      XBLOCKSTACK_DEBUG: 0
      RUST_BACKTRACE: 0
    volumes:
      - ./Stacks.toml:/srv/Stacks.toml:ro
      - ./mainnet:/srv/stacks-node/mainnet

[node]
working_dir = "/srv/stacks-node"
rpc_bind = "0.0.0.0:20443"
p2p_bind = "0.0.0.0:20444"
bootstrap_node = "02196f005965cebe6ddc3901b7b1cc1aa7a88f305bb8c5893456b8f9a605923893@seed.mainnet.hiro.so:20444,02539449ad94e6e6392d8c1deb2b4e61f80ae2a18964349bc14336d8b903c46a8c@cet.stacksnodes.org:20444,02ececc8ce79b8adf813f13a0255f8ae58d4357309ba0cedd523d9f1a306fcfb79@sgt.stacksnodes.org:20444,0303144ba518fe7a0fb56a8a7d488f950307a4330f146e1e1458fc63fb33defe96@est.stacksnodes.org:20444"
wait_time_for_microblocks = 10000

[burnchain]
chain = "bitcoin"
mode = "mainnet"
peer_host = "bitcoin.mainnet.stacks.org"
username = "stacks"
password = "foundation"
rpc_port = 8332
peer_port = 8333
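The "wait for it to catch up" step can be checked by comparing the node's `/v2/info` response against a node that is known to be synced. The sketch below is only one way to do that: `fetch_info` and `is_caught_up` are hypothetical helpers, the `max_lag` tolerance is arbitrary, and using `https://api.mainnet.hiro.so` as the reference is an assumption rather than what was done in this test.

```python
import json
import urllib.request


def fetch_info(base_url: str, timeout_s: float = 10.0) -> dict:
    """GET <base_url>/v2/info and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/v2/info", timeout=timeout_s) as resp:
        return json.load(resp)


def is_caught_up(local: dict, reference: dict, max_lag: int = 1) -> bool:
    """True when the local burnchain and Stacks tips are both within
    `max_lag` blocks of the reference node's tips."""
    for key in ("burn_block_height", "stacks_tip_height"):
        if reference[key] - local[key] > max_lag:
            return False
    return True


if __name__ == "__main__":
    local = fetch_info("http://localhost:20443")
    reference = fetch_info("https://api.mainnet.hiro.so")  # assumed reference node
    print("caught up:", is_caught_up(local, reference))
```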

First run failed with a timeout after 80s.

Second run failed after 471s.

Third run failed after 493s.

In the meantime, I was running the exact same script on an old node with version 2.5.0.0.3, and it is still running without any issue.

bestmike007 commented 3 months ago

Update: the script on 2.5.0.0.3 is still running without timeouts.

@jcnelson This is a new cloud vm in isolated vpc, so lmk if you need access to it, otherwise I'll tear it down.

bestmike007 commented 2 months ago

Looks like it's fixed in 2.5.0.0.5. The default antientropy_retry config seems to be related somehow: https://github.com/stacks-network/stacks-core/compare/2.5.0.0.4...2.5.0.0.5

bestmike007 commented 2 months ago

The script has been running for hours, no more failures. I'm closing this issue.

stacks-network / stacks-core

Requesting map/data-var/estimate-fee timeout #4931