Without the --net host option, Docker Besu cannot find peers

hardice501 commented 1 year ago

Description

We are testing the Besu 21.10.9 Docker image with the IBFT 2.0 consensus protocol on a private, zero-gas-price network with 4 validators.

Running all 4 validators on one machine works without problems.

But when we run one Docker container on each of 4 computers with different IP addresses, the bootnode cannot find any peers.

Acceptance Criteria

Steps to Reproduce (Bug)

1. Using ibftConfigFile.json, generate the genesis file and key files on the first computer.
2. Set up one validator (which is also the bootnode) on the first computer without the --net host option.
3. Copy the genesis file and key file to the second computer.
4. Run Docker on the second computer without the --net host option.
5. Run a ping test from inside the Docker container.

Expected behavior: Peers connect successfully. The ping test succeeds.

Actual behavior: No connection between peers. The ping test succeeds.

Frequency: Always, except when the --net host option is added. The ping test always succeeds, even without the --net host option.

Logs (if a bug)

FullSyncTargetManager | No sync target, waiting for peers: 0
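
A successful ping only shows ICMP reachability between the hosts; it says nothing about whether the P2P port is reachable. The peer count can also be read directly through the net_peerCount JSON-RPC method (the NET API is enabled in the command shown in this report). A minimal sketch, assuming curl is installed and the HTTP JSON-RPC endpoint is published on 127.0.0.1:8545; the helper names are illustrative and not part of Besu:

```shell
#!/bin/sh
# Illustrative helpers (not part of Besu).

# JSON-RPC request body for the net_peerCount method.
peer_count_request() {
  printf '%s' '{"jsonrpc":"2.0","method":"net_peerCount","params":[],"id":1}'
}

# net_peerCount returns a hex quantity such as "0x3"; convert it to decimal.
hex_to_dec() {
  printf '%d\n' "$1"
}

# Usage against a running node (assumed endpoint):
#   curl -s -X POST -H 'Content-Type: application/json' \
#     --data "$(peer_count_request)" http://127.0.0.1:8545
# Then pass the "result" field of the response to hex_to_dec, for example:
hex_to_dec 0x0   # prints 0, which corresponds to "waiting for peers: 0"
```

A result that stays at 0x0 on every node while ping succeeds points at the P2P port rather than at general host connectivity.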

Versions (Add all that apply)

Software version: 21.10.9 (Docker image)
Java version: just used Docker
OS Name & Version: DISTRIB_ID=Ubuntu DISTRIB_RELEASE=22.04 DISTRIB_CODENAME=jammy DISTRIB_DESCRIPTION="Ubuntu 22.04.2 LTS" PRETTY_NAME="Ubuntu 22.04.2 LTS" NAME="Ubuntu" VERSION_ID="22.04" VERSION="22.04.2 LTS (Jammy Jellyfish)" VERSION_CODENAME=jammy ID=ubuntu ID_LIKE=debian HOME_URL="https://www.ubuntu.com/" SUPPORT_URL="https://help.ubuntu.com/" BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/" PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
Kernel Version: Linux besu-try-testnet-node-1 5.4.0-139-generic #156-Ubuntu SMP Fri Jan 20 17:27:18 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Virtual Machine software & version: [vmware -v]
Docker Version: Client: Docker Engine - Community Version: 24.0.2 API version: 1.43 Go version: go1.20.4 Git commit: cb74dfc Built: Thu May 25 21:51:00 2023 OS/Arch: linux/amd64 Context: default

Additional Information (Add any of the following or anything else that may be relevant)

Besu setup info genesis file:

{
  "config" : {
    "chainId" : 1337,
    "berlinBlock" : 0,
    "ibft2" : {
      "blockperiodseconds" : 2,
      "epochlength" : 30000,
      "requesttimeoutseconds" : 4
    },
    "contractSizeLimit" : 2147483647
  },
  "nonce" : "0x0",
  "timestamp" : "0x58ee40ba",
  "gasLimit" : "0x1fffffffffffff",
  "difficulty" : "0x1",
  "mixHash" : "0x63746963616c2062797a616e74696e65206661756c7420746f6c6572616e6365",
  "coinbase" : "0x0000000000000000000000000000000000000000",
  "alloc" : {
    "fe3b557e8fb62b89f4916b721be55ceb828dbd73" : {
      "privateKey" : "8f2a55949038a9610f50fb23b5883af3b4ecb3c3bb792cbcefbd1542c692be63",
      "comment" : "private key and this comment are ignored.  In a real chain, the private key should NOT be stored",
      "balance" : "0xad78ebc5ac6200000"
    },
    "627306090abaB3A6e1400e9345bC60c78a8BEf57" : {
      "privateKey" : "c87509a1c067bbde78beb793e6fa76530b6382a4c0241e5e4a9ec0a0f44dc0d3",
      "comment" : "private key and this comment are ignored.  In a real chain, the private key should NOT be stored",
      "balance" : "90000000000000000000000"
    },
    "f17f52151EbEF6C7334FAD080c5704D77216b732" : {
      "privateKey" : "ae6ae8e5ccbfb04590405997ee2d52d2b330726137b875053c36d94e974d162f",
      "comment" : "private key and this comment are ignored.  In a real chain, the private key should NOT be stored",
      "balance" : "90000000000000000000000"
    }
  },
  "extraData" : "0xf87ea00000000000000000000000000000000000000000000000000000000000000000f854945ad06e82e508711d9b64632a6702edf60dbb840f94fe605c77ab9f2ba80c102e2ee734a803df6c7da994269f2e8e1924be1f27744f56094ad8c9d6884ef29493d19121054b420c566b87a8ce92fcb19fd20528808400000000c0"
}

Config options (for nodes other than boot_node, also include the option --bootnodes=${BOOT_NODE_ENODE}):

docker create --name boot_node -p 8545:8545 -p 8546:8546 -p 30303:30303 hyperledger/besu:21.10.9  --genesis-file=/genesis.json --rpc-http-enabled --rpc-http-api=ETH,NET,IBFT --rpc-http-cors-origins="all" --rpc-http-host=0.0.0.0 --rpc-ws-enabled --rpc-ws-host=0.0.0.0 --rpc-ws-apis=ADMIN,ETH,MINER,WEB3,NET,PRIV,EEA --host-allowlist="*" --min-gas-price=0
docker cp genesis.json boot_node:/genesis.json
docker cp Node-1/data/key boot_node:/opt/besu/key
docker start boot_node

System info - memory, CPU

hardice501 commented 1 year ago

Update: in my case, adding the option -p 30303:30303/udp makes it work on Linux.
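
Docker's -p flag publishes TCP only unless /udp is appended, while Ethereum peer discovery runs over UDP, so port 30303 has to be published for both protocols. A minimal sketch of the adjusted boot_node command follows; P2P_PORT and p2p_publish_args are illustrative names rather than Docker or Besu settings, and the command is only printed, not run:

```shell
#!/bin/sh
# Sketch: publish the P2P port for both TCP (peer connections) and
# UDP (peer discovery). P2P_PORT is an illustrative variable.
P2P_PORT=30303

p2p_publish_args() {
  # Docker publishes TCP only unless "/udp" is appended, so emit both.
  echo "-p ${P2P_PORT}:${P2P_PORT}/tcp -p ${P2P_PORT}:${P2P_PORT}/udp"
}

# Print the command instead of running it; remove "echo" to execute.
# Append the remaining Besu options from the original command as needed.
echo docker create --name boot_node -p 8545:8545 -p 8546:8546 \
  $(p2p_publish_args) hyperledger/besu:21.10.9 --genesis-file=/genesis.json
```

The unquoted command substitution is deliberate so that the shell splits the helper's output into separate arguments.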

But on a Mac (M1), a virtual environment is needed to run the Docker CLI.

Paid-license tools (Docker Desktop, OrbStack) work on M1.

The others are free-license tools:

Programs based on Lima (Lima, Colima, Rancher Desktop): the UDP port is not enabled.
Podman: a P2P connection is possible only with boot_node. (If a MAC address is not specified, Podman sends the URL for the virtualized IP rather than the host's IP.)

On boot_node:

> admin.peers
{
  firstEnode: {
    localhost: [virtual ip]:30303,
    remotehost: [virtual ip]:{random port},
  },
  secondEnode: {
    localhost: [virtual ip]:30303,
    remotehost: [virtual ip]:{random port},
  }
}

So the peers cannot connect to each other (for example, the first and second nodes cannot establish a P2P connection).

Multipass + Docker CLI: peers can be found (but two physical IPs must be assigned).

non-fungible-nelson commented 1 year ago

Thanks for the update - Docker on Mac has some limitations with Localhost, recommend testing/dev with --net.

I also definitely recommend updating to the latest version (23.4.4) and trying again. I'm not sure whether this version of Besu (21.10.9) had specific M1 support. A newer version might play nicer with Docker.

hardice501 commented 1 year ago

Thanks for the reply. A nonce error always occurs when I run a load test on Besu versions later than 21.10.9. Do you know anything about this?

non-fungible-nelson commented 1 year ago

We have tweaked nonce behaviors a lot to prevent Denial of Service attacks on Mainnet. Newer versions of Besu should have fixes for this and we even have a new transaction pool type that can handle "future nonce" transactions.

This flag has more details: https://besu.hyperledger.org/stable/public-networks/reference/cli/options#tx-pool-limit-by-account-percentage

hardice501 commented 1 year ago

I tried that (for example, tx-pool-limit-by-account-percentage=0.9) and also tried https://github.com/hyperledger/besu/pull/5290, but the nonce error still always occurs (when the send rate is 1000 TPS). How can I solve it?

non-fungible-nelson commented 1 year ago

Have you tried setting the limit to 1 (to allow all future nonce transactions)?

tx-pool-limit-by-account-percentage=1

If you are in a private network with known senders, this should stop the nonce gap error you're seeing.

If this doesn't work, @fab-10 might have some insight on how to avoid this nonce issue.

fab-10 commented 1 year ago

@hardice501 what are your block time and block gas limit? Since your TPS is quite high, you need some more tuning:

If using the new layered txpool, increase --Xlayered-tx-pool-max-future-by-sender to a value > TPS * block-time * 2.
If using the default txpool, keep --tx-pool-limit-by-account-percentage as close to 1 as possible and set --tx-pool-max-size > TPS * block-time * 2.

Let me know if these options solve the nonce issue, otherwise we can debug further.
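
Reading the sizing rule above as the product TPS * block-time * 2, with the block time in whole seconds, the threshold is simple arithmetic. A minimal sketch follows; min_pool_setting is a hypothetical helper, not a Besu command, and it returns the smallest integer strictly greater than the product:

```shell
#!/bin/sh
# Hypothetical helper (not part of Besu): smallest integer strictly greater
# than TPS * block-time * 2.
# Arguments: $1 = target TPS, $2 = block period in whole seconds.
min_pool_setting() {
  tps=$1
  block_time=$2
  echo $(( tps * block_time * 2 + 1 ))
}

# Figures mentioned in this thread: 300 TPS with a 1 s block period,
# and 1000 TPS with a 2 s block period.
min_pool_setting 300 1    # prints 601
min_pool_setting 1000 2   # prints 4001
```

The result is a lower bound for whichever option applies to the pool in use, not a recommended value.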

hardice501 commented 1 year ago

What does TPSblock-time2 mean?

My Besu options are below.

besu --node-private-key-file=/Users/songsanghyeon/work/Constructor-Besu-IBFT/Node-1/key --rpc-http-port=8545 --rpc-ws-port=8546 --p2p-port=30303 --genesis-file=/Users/songsanghyeon/work/Constructor-Besu-IBFT/genesis.json --data-path=/Users/songsanghyeon/work/Constructor-Besu-IBFT/Node-1/database --rpc-http-max-active-connections=1000 --rpc-http-enabled --rpc-http-apis=ETH,NET,QBFT,ADMIN,PRIV,EEA,MINER,WEB3,TXPOOL,DEBUG,TRACE --rpc-http-cors-origins=all --rpc-ws-enabled --rpc-ws-host=0.0.0.0 --rpc-ws-apis=ETH,NET,QBFT,ADMIN,PRIV,EEA,MINER,WEB3,TXPOOL,DEBUG --min-gas-price=0 --rpc-ws-max-frame-size=104857600 --Xlayered-tx-pool-layer-max-capacity=50000000000 --Xlayered-tx-pool-max-prioritized=160000 --Xlayered-tx-pool-max-future-by-sender=160000 --host-allowlist="*"

Caliper options and genesis options: blockperiodseconds: 1. Tool used: Caliper benchmark (WebSocket).

simpleArgs: &simple-args
  initialMoney: 10000
  moneyToTransfer: 100
  numberOfAccounts: &number-of-accounts 10

test:
  name: simple
  description: >-
    This is an example benchmark for Caliper, to test the backend DLT's
    performance with simple account opening & querying transactions.
  workers:
    number: 3
  rounds:
    - label: transfer
      description: Test description for transfering money between accounts.
      txNumber: 10000
      rateControl:
        type: fixed-rate
        opts:
          tps: 300
      workload:
        module: benchmarks/scenario/simple/transfer.js
        arguments:
          << : *simple-args
          money: 100

Java heap size (output of java -XX:+PrintFlagsFinal -version 2>&1 | grep -i -E 'heapsize|metaspacesize|version'):

size_t ErgoHeapSizeLimit = 0 {product} {default}
size_t HeapSizePerGCThread = 43620760 {product} {default}
size_t InitialHeapSize = 10737418240 {product} {command line}
size_t LargePageHeapSizeThreshold = 134217728 {product} {default}
size_t MaxHeapSize = 17179869184 {product} {ergonomic}
size_t MaxMetaspaceSize = 18446744073709551615 {product} {default}
size_t MetaspaceSize = 22020096 {product} {default}
size_t MinHeapSize = 10737418240 {product} {command line}
uintx NonNMethodCodeHeapSize = 5839564 {pd product} {ergonomic}
uintx NonProfiledCodeHeapSize = 122909338 {pd product} {ergonomic}
uintx ProfiledCodeHeapSize = 122909338 {pd product} {ergonomic}
size_t SoftMaxHeapSize = 17179869184 {manageable} {ergonomic}

I also already tried the default txpool options:

besu --node-private-key-file=/Users/songsanghyeon/work/Constructor-Besu-IBFT/Node-1/key --rpc-http-port=8545 --rpc-ws-port=8546 --p2p-port=30303 --genesis-file=/Users/songsanghyeon/work/Constructor-Besu-IBFT/genesis.json --data-path=/Users/songsanghyeon/work/Constructor-Besu-IBFT/Node-1/database --rpc-http-max-active-connections=1000 --rpc-http-enabled --rpc-http-apis=ETH,NET,QBFT,ADMIN,PRIV,EEA,MINER,WEB3,TXPOOL,DEBUG,TRACE --rpc-http-cors-origins=all --rpc-ws-enabled --rpc-ws-host=0.0.0.0 --rpc-ws-apis=ETH,NET,QBFT,ADMIN,PRIV,EEA,MINER,WEB3,TXPOOL,DEBUG --min-gas-price=0 --rpc-ws-max-frame-size=10485760 --tx-pool-limit-by-account-percentage=1 --tx-pool-max-size=16000 --host-allowlist="*" --graphql-http-enabled=true --revert-reason-enabled=true

But neither set of options works; both fail with the nonce error. Only version 21.10.9 works.

fab-10 commented 1 year ago

TPSblock-time2 is actually TPS * block-time * 2; Markdown interpreted the * characters.

In the first command you are missing the --Xlayered-tx-pool=true option, so you are still using the legacy pool. Try this command, leaving the default values, and see the output:

besu --node-private-key-file=/Users/songsanghyeon/work/Constructor-Besu-IBFT/Node-1/key --rpc-http-port=8545 --rpc-ws-port=8546 --p2p-port=30303 --genesis-file=/Users/songsanghyeon/work/Constructor-Besu-IBFT/genesis.json --data-path=/Users/songsanghyeon/work/Constructor-Besu-IBFT/Node-1/database --rpc-http-max-active-connections=1000 --rpc-http-enabled --rpc-http-apis=ETH,NET,QBFT,ADMIN,PRIV,EEA,MINER,WEB3,TXPOOL,DEBUG,TRACE --rpc-http-cors-origins=all --rpc-ws-enabled --rpc-ws-host=0.0.0.0 --rpc-ws-apis=ETH,NET,QBFT,ADMIN,PRIV,EEA,MINER,WEB3,TXPOOL,DEBUG --min-gas-price=0 --rpc-ws-max-frame-size=104857600 --host-allowlist="*" --Xlayered-tx-pool=true

hardice501 commented 1 year ago

Thank you for the comment.

But I have the same problem with these options (with --Xlayered-tx-pool=true added). My Besu version is the latest.

besu --node-private-key-file=/Users/songsanghyeon/work/Constructor-Besu-IBFT/Node-1/key --rpc-http-port=8545 --rpc-ws-port=8546 --p2p-port=30303 --genesis-file=/Users/songsanghyeon/work/Constructor-Besu-IBFT/genesis.json --data-path=/Users/songsanghyeon/work/Constructor-Besu-IBFT/Node-1/database --rpc-http-max-active-connections=1000 --rpc-http-enabled --rpc-http-apis=ETH,NET,QBFT,ADMIN,PRIV,EEA,MINER,WEB3,TXPOOL,DEBUG,TRACE --rpc-http-cors-origins=all --rpc-ws-enabled --rpc-ws-host=0.0.0.0 --rpc-ws-apis=ETH,NET,QBFT,ADMIN,PRIV,EEA,MINER,WEB3,TXPOOL,DEBUG --min-gas-price=0 --rpc-ws-max-frame-size=104857600 --Xlayered-tx-pool-layer-max-capacity=50000000000 --Xlayered-tx-pool-max-prioritized=160000 --Xlayered-tx-pool-max-future-by-sender=160000 --host-allowlist="*" --graphql-http-enabled=true --revert-reason-enabled=true --Xlayered-tx-pool=true

Also, once the "Failed tx on simple calling method transfer nonce" error has occurred (some transactions remain pending), the same error continues to occur even if I lower the TPS to 10.

How can I handle the nonce error (at high TPS) and the pending-transaction problem (after a nonce error)?

fab-10 commented 1 year ago

At this point, I think I need to try to reproduce your test locally.
