Closed by zone117x 1 year ago
I found that this error is triggered on this line: https://github.com/stacks-network/stacks-blockchain/blob/7d960260b525d270e97bbafc8f09bd77d883b37b/src/net/rpc.rs#L420
That get_ancestor_sort_id call returns db_error::NotFoundError:
https://github.com/stacks-network/stacks-blockchain/blob/7d960260b525d270e97bbafc8f09bd77d883b37b/src/chainstate/burn/db/sortdb.rs#L3488-L3490
I'm not sure what the fix is here. Perhaps there's a mismatch between the reward start height and the sortition ID?
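Just to illustrate the shape a fix could take (stand-in types below, not the repo's actual db_error or handler code): treat the not-found ancestor as a transient, retriable condition instead of bubbling it up as an HTTP 500.

```rust
// Minimal, self-contained sketch with stand-in types (not the repo's):
// a not-found ancestor becomes "try again" rather than a hard 500.
#[derive(Debug)]
enum DbError {
    NotFound,
    Other(String),
}

#[derive(Debug)]
enum PoxQueryOutcome {
    Ready(u64),
    TryAgain,
    Fail(String),
}

// Stand-in for the real get_ancestor_sort_id lookup in sortdb.rs.
fn get_ancestor_sort_id(height: u64) -> Result<Option<u64>, DbError> {
    if height == 0 {
        Err(DbError::NotFound) // simulate the mid-sortition race
    } else {
        Ok(Some(height))
    }
}

fn handle_pox_query(height: u64) -> PoxQueryOutcome {
    match get_ancestor_sort_id(height) {
        Ok(Some(sort_id)) => PoxQueryOutcome::Ready(sort_id),
        // Ancestor isn't visible yet (e.g. a sortition is being processed):
        // report a retriable condition instead of failing the request.
        Ok(None) | Err(DbError::NotFound) => PoxQueryOutcome::TryAgain,
        Err(DbError::Other(e)) => PoxQueryOutcome::Fail(e),
    }
}

fn main() {
    println!("{:?}", handle_pox_query(0)); // TryAgain
    println!("{:?}", handle_pox_query(42)); // Ready(42)
}
```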
Also of note: I think the is_pox_active: cur_cycle_pox_active value from the above line (and its parent object current_cycle) is the only part of this function that doesn't respect the tip parameter. Although, I'm not sure whether a non-default tip param is explicitly supported in other parts of this /v2/pox handler.
This can be reproduced by hitting /v2/pox aggressively while a sortition is being processed.
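For anyone trying to reproduce: a hammer loop like this sketch (assuming a local node RPC on 127.0.0.1:20443 and the reqwest crate's blocking API) surfaces the 500s within a few block times on a short-block mocknet.

```rust
// Repro sketch: hammer GET /v2/pox and print any 500 responses.
// Assumes reqwest = { version = "0.11", features = ["blocking"] }.
use std::{thread, time::Duration};

fn main() -> Result<(), reqwest::Error> {
    let client = reqwest::blocking::Client::new();
    loop {
        let resp = client.get("http://127.0.0.1:20443/v2/pox").send()?;
        if resp.status().as_u16() == 500 {
            println!("HTTP 500: {}", resp.text()?);
        }
        thread::sleep(Duration::from_millis(50));
    }
}
```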
Related: don't use the tip argument at all, since the sortition DB won't honor it. Also, see if we can get a "read transaction" for querying all the state we need in one query.
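If it helps, here's roughly how I'd picture the "read transaction" idea, sketched with rusqlite directly; the snapshots table and column names are made up for illustration, not the actual sortdb schema.

```rust
// Sketch: read all needed PoX state under one transaction so the reads see
// a consistent snapshot even while a sortition is being processed.
// Table/column names here are hypothetical, not the real sortdb schema.
use rusqlite::{Connection, Result};

fn query_pox_state(conn: &mut Connection) -> Result<(i64, bool)> {
    let tx = conn.transaction()?;
    let tip_height: i64 =
        tx.query_row("SELECT MAX(block_height) FROM snapshots", [], |r| r.get(0))?;
    let pox_active: bool = tx.query_row(
        "SELECT pox_active FROM snapshots WHERE block_height = ?1",
        [tip_height],
        |r| r.get(0),
    )?;
    tx.commit()?;
    Ok((tip_height, pox_active))
}
```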
@jcnelson accidental close?
yup, fat-fingered
I think this is fixed in next, if you wanna try it again.
Unfortunately it doesn't appear to be fixed. Running into the same problem (occurs at the same frequency as originally reported).
Pulling in the temp work-around from https://github.com/stacks-network/stacks-blockchain/pull/3281 still works in preventing the 500 responses (it just ignores errors from that sortdb.is_pox_active(burnchain, &burnchain_tip) line).
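For reference, the workaround boils down to this pattern; the stand-in below is my paraphrase, with a hypothetical false fallback (the real patch may choose differently).

```rust
// Stand-in showing the workaround's shape: a failed is_pox_active lookup
// falls back to a default value instead of becoming an HTTP 500.
fn is_pox_active() -> Result<bool, String> {
    Err("NotFoundError".into()) // simulate the failing sortdb lookup
}

fn main() {
    // In the real handler this would be
    // sortdb.is_pox_active(burnchain, &burnchain_tip).
    let cur_cycle_pox_active = is_pox_active().unwrap_or(false);
    println!("is_pox_active: {}", cur_cycle_pox_active);
}
```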
Something is up with that is_pox_active function; I've gone down a few rabbit holes trying to debug why, but no luck.
Does this happen if v1_unlock_height is the same as epoch 2.1's start_height?
> Does this happen if v1_unlock_height is the same as epoch 2.1's start_height?
I'm actually unable to get the node to mine the epoch 2.1 start block when setting burnchain.pox_2_activation to the same value as the [[burnchain.epochs]] epoch 2.1 start_height.
I get the following error message repeating:
ERRO [1667850564.225276] [testnet/stacks-node/src/neon_node.rs:1774] [miner-block-http://0.0.0.0:20443] Relayer: Failure fetching recipient set: ChainstateError(ClarityError(Interpreter(Unchecked(NoSuchContract("ST000000000000000000002AMW42H.pox-2")))))
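For what it's worth, you can probe whether the pox-2 boot contract is deployed yet at a given moment via the node's /v2/contracts/interface RPC endpoint (the address and contract name come from the error above); a quick sketch, with the same reqwest assumption as earlier:

```rust
// Probe for the pox-2 boot contract via GET /v2/contracts/interface.
// Assumes reqwest = { version = "0.11", features = ["blocking"] } and a
// local node RPC on 127.0.0.1:20443.
fn pox2_deployed(client: &reqwest::blocking::Client) -> Result<bool, reqwest::Error> {
    let url = "http://127.0.0.1:20443/v2/contracts/interface/\
               ST000000000000000000002AMW42H/pox-2";
    // A 200 means the contract exists; a 404 means it isn't deployed yet.
    Ok(client.get(url).send()?.status().is_success())
}

fn main() -> Result<(), reqwest::Error> {
    let client = reqwest::blocking::Client::new();
    println!("pox-2 deployed: {}", pox2_deployed(&client)?);
    Ok(())
}
```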
Here's the config I'm using:
[node]
name = "krypton-node"
rpc_bind = "0.0.0.0:20443"
p2p_bind = "0.0.0.0:20444"
working_dir = "/chainstate/stacks-blockchain-data"
miner = true
use_test_genesis_chainstate = true
pox_sync_sample_secs = 1
wait_time_for_blocks = 0
wait_time_for_microblocks = 50
microblock_frequency = 1000
[miner]
first_attempt_time_ms = 30000
subsequent_attempt_time_ms = 5000
[connection_options]
disable_block_download = true
disable_inbound_handshakes = true
disable_inbound_walks = true
public_ip_address = "1.1.1.1:1234"
[burnchain]
chain = "bitcoin"
mode = "krypton"
poll_time_secs = 1
peer_host = "localhost"
peer_port = 18444
rpc_port = 18443
rpc_ssl = false
username = "btc"
password = "btc"
timeout = 30
pox_2_activation = 105 # <---- mining error when set to the same value as epoch 2.1 start height
[[burnchain.epochs]]
epoch_name = "1.0"
start_height = 0
[[burnchain.epochs]]
epoch_name = "2.0"
start_height = 103
[[burnchain.epochs]]
epoch_name = "2.05"
start_height = 104
[[burnchain.epochs]]
epoch_name = "2.1"
start_height = 105
I also get the same error when bumping it one block higher. In the above config, I need to set pox_2_activation to 107 (2 higher than the epoch 2.1 start height) in order for the miner to progress. It also works if I set both the epoch 2.1 start height and pox_2_activation to 107.
Is pox_2_activation rounding down to the height of the previous pox cycle?
@zone117x I'm not sure what the discrepancy is, but I saw it work with:
[node]
name = "krypton-node"
rpc_bind = "0.0.0.0:20443"
p2p_bind = "0.0.0.0:20444"
# working_dir = "$DATA_DIR"
seed = "9e446f6b0c6a96cf2190e54bcd5a8569c3e386f091605499464389b8d4e0bfc201"
local_peer_seed = "9e446f6b0c6a96cf2190e54bcd5a8569c3e386f091605499464389b8d4e0bfc201"
miner = true
use_test_genesis_chainstate = true
pox_sync_sample_secs = 5
wait_time_for_blocks = 0
wait_time_for_microblocks = 1000
microblock_frequency = 5000
# mine_microblocks = true
# max_microblocks = 10
#
[[events_observer]]
endpoint = "localhost:50303"
retry_count = 255
events_keys = ["*"]
[miner]
first_attempt_time_ms = 30000
subsequent_attempt_time_ms = 5000
block_reward_recipient = "ST9V4MGBMGHX0WR3MG6JA53HNKMJ2HF9VWRQNTKR"
[connection_options]
# inv_sync_interval = 10
# download_interval = 10
# walk_interval = 10
disable_block_download = true
disable_inbound_handshakes = true
disable_inbound_walks = true
public_ip_address = "1.1.1.1:1234"
[burnchain]
chain = "bitcoin"
mode = "krypton"
poll_time_secs = 1
pox_2_activation = 105
### bitcoind-regtest connection info
peer_host = "127.0.0.1"
peer_port = 18444
rpc_port = 18443
rpc_ssl = false
username = "krypton"
password = "krypton"
timeout = 30
[[burnchain.epochs]]
epoch_name = "1.0"
start_height = 0
[[burnchain.epochs]]
epoch_name = "2.0"
start_height = 103
[[burnchain.epochs]]
epoch_name = "2.05"
start_height = 104
[[burnchain.epochs]]
epoch_name = "2.1"
start_height = 105
The problem is that some block height configs (seemingly arbitrarily) don't work. My guess is it's something related to pox_2_activation needing to be on a PoX cycle boundary relative to one of the start_height configs.
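To sanity-check that guess, something like this sketch maps candidate heights onto reward cycles; the first burn height and cycle length are hypothetical stand-ins, not values pulled from the configs above.

```rust
// Hypothetical parameters: reward cycles are `cycle_len` burn blocks long,
// starting at `first_burn_height`. Real networks read these from the
// burnchain config; the numbers in main() are illustrative only.
fn reward_cycle(height: u64, first_burn_height: u64, cycle_len: u64) -> u64 {
    (height - first_burn_height) / cycle_len
}

fn is_cycle_boundary(height: u64, first_burn_height: u64, cycle_len: u64) -> bool {
    (height - first_burn_height) % cycle_len == 0
}

fn main() {
    let (first, len) = (0u64, 5u64);
    for h in [105u64, 106, 107] {
        println!(
            "height {}: cycle {}, on boundary: {}",
            h,
            reward_cycle(h, first, len),
            is_cycle_boundary(h, first, len)
        );
    }
}
```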
Also, this thread is getting a bit off topic (although the latest issue we're discussing might be related to the initial one, idk). The focus of this issue is to fix the HTTP 500 responses returned from GET /v2/pox.
Should I open a new issue for "Miner halts with error for certain block height configs"?
cc @jcnelson?
> Should I open a new issue for "Miner halts with error for certain block height configs"?
Yes, let's do that.
The /v2/pox HTTP 500 error was fixed by @jcnelson in https://github.com/stacks-network/stacks-blockchain/pull/3399. I'm closing this issue and opening a new one for the "Miner halts with error for certain block height configs" issue discussed above.
Running the latest next branch in mocknet mode, the GET /v2/pox endpoint often returns an HTTP 500 error with the message "Failed to query peer info". On my machine, there's around a 2-second window of time where the 500 errors are returned. It's easier to reproduce this issue with a short mocknet block time, so perhaps it's some race condition that happens around block assembly / broadcast time?
Here's a snippet of logs during a 500 request, with STACKS_LOG_DEBUG=1: stacks-node-500.log.txt
It looks like these are the relevant lines:
Using config.toml to bootstrap to epoch 2.1: