I've built and pushed an image for the bridge using the `Dockerfile` from the `nim-waku` repo:
> d run --rm -it statusteam/nim-waku:deploy-bridge-test --help
Usage:
wakubridge [OPTIONS]...
The following options are available:
--log-level Sets the log level [=LogLevel.INFO].
...
https://hub.docker.com/r/statusteam/nim-waku/tags https://ci.status.im/job/nim-waku/job/deploy-bridge-test/
And made a PR to rename the `bridge` target to `wakubridge` to work with the `Dockerfile`: https://github.com/status-im/nim-waku/pull/886
One question: does the bridge need any other protocol flags besides `--relay`?
> One question: does the bridge need any other protocol flags besides `--relay`?

No, only `--relay` is necessary for now (which should be true by default in any case, but good to be explicit :) )
I see no `--dns4-domain-name` flag, but there is a boolean `--dns-addrs`, so I'm not sure how that's supposed to work:
--dns-addrs Enable resolution of `dnsaddr`, `dns4` or `dns6`
multiaddrs [=true].
--dns-addrs-name-server DNS name server IPs to query for DNS multiaddrs
resolution. Argument may be repeated.
[=@[ValidIpAddress.init("1.1.1.1"),
ValidIpAddress.init("1.0.0.1")]].
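So presumably enabling it and pointing it at custom resolvers would look something like this (a sketch based purely on the help text above; the server IPs are arbitrary):

```sh
# dnsaddr resolution is on by default; the name-server flag may be repeated.
./build/wakubridge \
  --dns-addrs=true \
  --dns-addrs-name-server=8.8.8.8 \
  --dns-addrs-name-server=8.8.4.4
```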
I've deployed the hosts: https://github.com/status-im/infra-status/commit/ac96e859
And configured the nodes: https://github.com/status-im/infra-status/commit/a95bc519
admin@bridge-01.do-ams3.status.test:/docker/nim-waku-bridge % dc ps
Name Command State Ports
---------------------------------------------------------------------------------------------------------------------------------------------------
nim-waku-bridge /usr/bin/wakunode --log-le ... Up 0.0.0.0:30303->30303/tcp, 60000/tcp, 0.0.0.0:8008->8008/tcp, 127.0.0.1:8545->8545/tcp,
0.0.0.0:9000->9000/tcp
But it sure is spamming the logs a lot with:
NOT 2022-03-10 12:42:50.888+00:00 No peers for topic, skipping publish topics="libp2p gossipsub" tid=1 peersOnTopic=0 connectedPeers=0 topic=/waku/2/default-waku/proto
admin@bridge-01.do-ams3.status.test:~ % grep 'No peers' /var/log/docker/nim-waku-bridge/docker.log | wc -l
1441
That's quite a lot for just 4 minutes of the node running, and for a NOTICE-level message.
Next step is to connect the peers.
Thanks, Jakub. Yes, that log should disappear once we have v2 peers connected.
Unfortunately it's logged on `notice` level in an underlying library, so there is no straightforward way for us to suppress it (other than having the v2 peers connected).
Since I needed some way to connect both the fleet peers and the bridge to the fleet, I've extracted the peer-connection logic into a separate Ansible role and implemented most of it in Python, since that's faster and easier to modify:
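The gist of it, as a minimal shell sketch (the actual role lives in infra-status; it assumes a local Consul agent on `localhost:8500`, the node's JSON-RPC endpoint on `localhost:8545`, and a hypothetical `nim-waku-bridge` service name — the `node_enode` meta key and `post_waku_v2_admin_v1_peers` admin call are taken from the outputs below):

```sh
#!/usr/bin/env bash
set -euo pipefail

# List all data centers known to Consul.
for dc in $(curl -s localhost:8500/v1/catalog/datacenters | jq -r '.[]'); do
  # Look up bridge services in each DC and read their multiaddr from service meta.
  curl -s "localhost:8500/v1/catalog/service/nim-waku-bridge?dc=${dc}&tag=bridge" |
    jq -r '.[].ServiceMeta.node_enode'
done | while read -r addr; do
  # Ask the local node to dial each discovered peer via the admin API.
  curl -s -X POST localhost:8545 -H 'Content-Type: application/json' \
    -d "{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"post_waku_v2_admin_v1_peers\",\"params\":[[\"${addr}\"]]}"
done
```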
And it appears to work as expected:
2022-03-10 16:24:30,669 [INFO] Connecting to Consul: localhost:8500
2022-03-10 16:24:30,675 [INFO] Found 5 data centers.
2022-03-10 16:24:30,678 [DEBUG] Service: bridge-01.do-ams3.status.test (env:status,stage:test,nim,waku,bridge)
2022-03-10 16:24:31,060 [INFO] Found 1 services.
2022-03-10 16:24:31,060 [INFO] Calling JSON RPC: localhost:8545
2022-03-10 16:24:31,066 [INFO] SUCCESS
Change: https://github.com/status-im/infra-status/commit/b2142fd8
Except I don't see the bridge peer in the list of connected peers for the nodes:
admin@bridge-01.do-ams3.status.test:~ % /docker/nim-waku-bridge/rpc.sh get_waku_v2_debug_v1_info | jq -c .result.listenAddresses
["/ip4/0.0.0.0/tcp/9000/p2p/16Uiu2HAmLwrpAgicPqsNtGprzGVudfFAPFT9iQ972MtVDcpn4Ucx"]
admin@node-01.gc-us-central1-a.status.test:~ % /docker/nim-waku/rpc.sh get_waku_v2_admin_v1_peers | jq '.result[].multiaddr'
"/ip4/47.242.233.36/tcp/30303/p2p/16Uiu2HAm2BjXxCp1sYFJQKpLLbPbwd5juxbsYofu3TsS3auvT9Yi"
"/ip4/64.225.81.237/tcp/30303/p2p/16Uiu2HAkukebeXjTQ9QDBeNDWuGfbaSg79wkkhK4vPocLgR6QFDf"
Oh, I see what's happening: I didn't extract the enode into the Consul service definition correctly:
admin@bridge-01.do-ams3.status.test:~ % sudo jq '.services[0].meta' /etc/consul/service_nim_waku_bridge.json
{
"node_enode": "unknown"
}
Fixed in: https://github.com/status-im/infra-status/commit/74f5ff8b
But wait a second, the `get_waku_v2_debug_v1_info` call on the bridge returns a multiaddress with `0.0.0.0` as the IP.
admin@bridge-01.do-ams3.status.test:/docker/nim-waku-bridge % ./rpc.sh get_waku_v2_debug_v1_info | jq -c .result.listenAddresses
["/ip4/0.0.0.0/tcp/9000/p2p/16Uiu2HAmLwrpAgicPqsNtGprzGVudfFAPFT9iQ972MtVDcpn4Ucx"]
admin@bridge-01.do-ams3.status.test:/docker/nim-waku-bridge % grep extip docker-compose.yml
--nat=extip:134.209.133.76
And I'm clearly setting `extip` in the `--nat` flag. @jm-clius any ideas?
Fixed by just replacing the `0.0.0.0` string with the proper IP for now: https://github.com/status-im/infra-status/commit/6e200169
https://github.com/status-im/infra-status/blob/6e200169fd5005cadce3b9c8432fe3ffdc274a4e/ansible/roles/nim-waku-bridge/tasks/query.yml#L28-L32
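In shell terms the workaround amounts to something like this (a sketch; the actual substitution happens in the Ansible task linked above):

```sh
# Swap the 0.0.0.0 placeholder in the first listen address for the host's public IP.
PUBLIC_IP=134.209.133.76  # the extip value from docker-compose.yml
ADDR=$(./rpc.sh get_waku_v2_debug_v1_info | jq -r '.result.listenAddresses[0]')
echo "${ADDR/0.0.0.0/${PUBLIC_IP}}"
# -> /ip4/134.209.133.76/tcp/9000/p2p/16Uiu2HAmLwrpAgicPqsNtGprzGVudfFAPFT9iQ972MtVDcpn4Ucx
```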
Ok, now it looks like they are connecting:
admin@node-01.do-ams3.status.test:/docker/nim-waku % /docker/nim-waku/rpc.sh get_waku_v2_admin_v1_peers | jq '.result[].multiaddr'
"/dns4/node-01.gc-us-central1-a.status.test.statusim.net/tcp/30303/p2p/16Uiu2HAmGDX3iAFox93PupVYaHa88kULGqMpJ7AEHGwj3jbMtt76"
"/ip4/134.209.133.76/tcp/9000/p2p/16Uiu2HAmLwrpAgicPqsNtGprzGVudfFAPFT9iQ972MtVDcpn4Ucx"
"/dns4/node-01.ac-cn-hongkong-c.status.test.statusim.net/tcp/30303/p2p/16Uiu2HAm2BjXxCp1sYFJQKpLLbPbwd5juxbsYofu3TsS3auvT9Yi"
Also improved the connection script a bit:
And the logs look healthier now too:
DBG 2022-03-10 17:44:05.858+00:00 Incoming WakuRelay connection topics="wakurelay" tid=1
DBG 2022-03-10 17:44:05.858+00:00 starting pubsub read loop topics="libp2p pubsubpeer" tid=1 conn=16U*bMtt76:622a38e55379b808262b98d1 peer=16U*bMtt76 closed=false
DBG 2022-03-10 17:44:12.028+00:00 Incoming WakuRelay connection topics="wakurelay" tid=1
DBG 2022-03-10 17:44:12.029+00:00 starting pubsub read loop topics="libp2p pubsubpeer" tid=1 conn=16U*uvT9Yi:622a38eb5379b808262b98d2 peer=16U*uvT9Yi closed=false
@jm-clius Though I do wonder why I'm seeing debug messages when my log level is `info`:
admin@bridge-01.do-ams3.status.test:/docker/nim-waku-bridge % grep log-level docker-compose.yml
--log-level=info
Also moved bridge setup to before node setup, since otherwise it makes no sense: https://github.com/status-im/infra-status/commit/5c37c818
Ok, prod is connected too:
admin@bridge-01.do-ams3.status.prod:~ % sudo jq '.services[0].meta' /etc/consul/service_nim_waku_bridge.json
{
"node_enode": "/ip4/161.35.244.35/tcp/9000/p2p/16Uiu2HAm1JGyYjjraM95y9wK4WFjg7k79H1xAGWGU8FTXXczjcbW"
}
admin@node-01.do-ams3.status.prod:~ % /docker/nim-waku/rpc.sh get_waku_v2_admin_v1_peers | jq '.result[].multiaddr'
"/dns4/node-02.gc-us-central1-a.status.prod.statusim.net/tcp/30303/p2p/16Uiu2HAmDQugwDHM3YeUp86iGjrUvbdw3JPRgikC7YoGBsT2ymMg"
"/dns4/node-01.ac-cn-hongkong-c.status.prod.statusim.net/tcp/30303/p2p/16Uiu2HAkvEZgh3KLwhLwXg95e5ojM8XykJ4Kxi2T7hk22rnA7pJC"
"/dns4/node-02.ac-cn-hongkong-c.status.prod.statusim.net/tcp/30303/p2p/16Uiu2HAmFy8BrJhCEmCYrUfBdSNkrPw6VHExtv4rRp1DSBnCPgx8"
"/ip4/161.35.244.35/tcp/9000/p2p/16Uiu2HAm1JGyYjjraM95y9wK4WFjg7k79H1xAGWGU8FTXXczjcbW"
"/dns4/node-02.do-ams3.status.prod.statusim.net/tcp/30303/p2p/16Uiu2HAmSve7tR5YZugpskMv2dmJAsMUKmfWYEKRXNUxRaTCnsXV"
"/dns4/node-01.gc-us-central1-a.status.prod.statusim.net/tcp/30303/p2p/16Uiu2HAkwBp8T6G77kQXSNMnxgaMky1JeyML5yqoTHRM8dbeCBNb"
I guess it's neat that the bridge is the only one without a DNS name in its multiaddress, so it's easy to spot.
Now, whether this works or not is an entirely separate question.
Based on prod metrics I think it works:
admin@bridge-01.do-ams3.status.prod:/docker/nim-waku-bridge % c 0:8008/metrics | grep bridge_transfers
# HELP waku_bridge_transfers Number of messages transferred between Waku v1 and v2 networks
# TYPE waku_bridge_transfers counter
waku_bridge_transfers_total{type="v1_to_v2"} 83473.0
waku_bridge_transfers_created{type="v1_to_v2"} 1646920686.0
Probably.
Also added `nim-waku-bridge` to Prometheus scrape jobs: https://github.com/status-im/infra-hq/commit/a1f07555
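With that in place, the bridging rate can be queried; a sketch against the standard Prometheus HTTP API (the `localhost:9090` endpoint is an assumption):

```sh
# Per-second v1 -> v2 bridging rate over the last 5 minutes.
curl -s 'localhost:9090/api/v1/query' \
  --data-urlencode 'query=rate(waku_bridge_transfers_total{type="v1_to_v2"}[5m])'
```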
And we have some metrics:
Great! Thanks, Jakub.
I do find it weird how the message rate for the test and prod fleets is about the same:
But that might just mean that `eth.test` and `eth.prod` are connected and share messages.
> But that might just mean that `eth.test` and `eth.prod` are connected and share messages.
Yeah, I noticed this too and also assumed they carry the same traffic.
I think this is done.
The Waku v2 integration effort into Status requires deployment of a bridge between Waku v1 and Waku v2. `nim-waku` has a beta version of such a bridge which can be deployed. It helps to see the bridge as a stripped-down version of both a Waku v1 and a Waku v2 client. It supports many of the same configuration options as both, usually with a `-v1` or `-v2` suffix to differentiate between version-specific config. A tutorial for bridge installation is included in the nim-waku docs.

What should be deployed?
A bridge that connects to the Waku v1 prod fleet and the (soon to be deployed?) `status.prod` fleet. This will automatically start bridging messages between the two networks. The bridge should be built (`make bridge`) off the latest `nim-waku` `master`. To connect to a v1 network, the bridge can use either the `--fleet-v1` option or a direct staticnode config. To connect to a v2 network the bridge uses `--staticnode-v2` with the `multiaddr` of a peer inside the desired v2 network (see the hedged sketch below).
What other config does the bridge support?

The bridge's v1 and v2 keys can be set using `--nodekey-v1:<v1-private-key-as-hex> --nodekey-v2:<v2-private-key-as-hex>`. The bridge supports v1 RPC calls and the v2 `debug`, `relay`, `filter` and `store` RPC calls. Basic health checks using RPC should therefore be possible on a bridge.