waku-org / nwaku

Waku node and protocol.

bug/regression: 'Failed to request a message push: Lightpush request has not been published to any peers. msg_hash ...' even when LightPush is successful #2955

Closed fbarbu15 closed 1 month ago

fbarbu15 commented 1 month ago

Problem

After https://github.com/waku-org/nwaku/pull/2951 was merged, I get that error even when the lightpush is successful.

To reproduce

  1. Start a relay node (node1) and a lightpush node (node2)
  2. Lightpush a message from node2
  3. The request fails with the error above
  4. Check whether the message was delivered to node1
  5. The message is found

Additional context

#!/bin/bash
printf "\nAssuming you already have a docker network called waku\n"
# If not, something like this should create it: docker network create --driver bridge --subnet 172.18.0.0/16 --gateway 172.18.0.1 waku

cluster_id=3
pubsub_topic="/waku/2/rs/$cluster_id/0"
encoded_pubsub_topic=$(echo "$pubsub_topic" | sed 's:/:%2F:g')
node_1=wakuorg/nwaku:latest
node_2=wakuorg/nwaku:latest
ext_ip="172.18.204.9"
tcp_port="37344"
rest_port="37343"

printf "\nStarting containers\n"

container_id1=$(docker run -d -i -t -p 37343:37343 -p $tcp_port:$tcp_port -p 37345:37345 -p 37346:37346 -p 37347:37347 $node_1 --listen-address=0.0.0.0 --rest=true --rest-admin=true --websocket-support=true --log-level=TRACE --rest-relay-cache-capacity=100 --websocket-port=37345 --rest-port=37343 --tcp-port=$tcp_port --discv5-udp-port=37346 --rest-address=0.0.0.0 --nat=extip:$ext_ip --peer-exchange=true --discv5-discovery=true --cluster-id=$cluster_id --metrics-server=true --metrics-server-address=0.0.0.0 --metrics-server-port=37347 --metrics-logging=true --pubsub-topic=$pubsub_topic --lightpush=true --relay=true)
docker network connect --ip $ext_ip waku $container_id1

printf "\nSleeping 2 seconds\n"
sleep 2

response=$(curl -X GET "http://127.0.0.1:37343/debug/v1/info" -H "accept: application/json")
enrUri=$(echo "$response" | jq -r '.enrUri')

# Extract the first non-WebSocket address (head keeps only the first match)
non_ws_address=$(echo "$response" | jq -r '.listenAddresses[] | select(contains("/ws") | not)' | head -n 1)

# Check if we got an address, and construct the new address with it
if [[ $non_ws_address != "" ]]; then
    identifier=$(echo "$non_ws_address" | awk -F'/p2p/' '{print $2}')
    if [[ $identifier != "" ]]; then
        multiaddr_with_id="/ip4/${ext_ip}/tcp/${tcp_port}/p2p/${identifier}"
    else
        echo "No identifier found in the address."
        exit 1
    fi
else
    echo "No non-WebSocket address found."
    exit 1
fi

container_id2=$(docker run -d -i -t -p 25908:25908 -p 25909:25909 -p 25910:25910 -p 25911:25911 -p 25912:25912 $node_2 --listen-address=0.0.0.0 --rest=true --rest-admin=true --websocket-support=true --log-level=TRACE --rest-relay-cache-capacity=100 --websocket-port=25910 --rest-port=25908 --tcp-port=25909 --discv5-udp-port=25911 --rest-address=0.0.0.0 --nat=extip:172.18.141.214 --peer-exchange=true --discv5-discovery=true --cluster-id=$cluster_id --pubsub-topic=$pubsub_topic --lightpush=true --relay=false --discv5-bootstrap-node=$enrUri --lightpushnode=$multiaddr_with_id)

docker network connect --ip 172.18.141.214 waku $container_id2

printf "\nSleeping 10 seconds\n"
sleep 10

printf "\nSubscribe NODE 1\n"
curl -v -X POST "http://127.0.0.1:$rest_port/relay/v1/subscriptions" -H "Content-Type: application/json" -d "[\"$pubsub_topic\"]"

printf "\nSleeping 2 seconds\n"
sleep 2

printf "\nLightpush message from NODE 2\n"
curl -v -X POST "http://127.0.0.1:25908/lightpush/v1/message" -H "Content-Type: application/json" -d "{\"pubsubTopic\": \"$pubsub_topic\", \"message\": {\"payload\": \"RmlsdGVyIHdvcmtzISE=\", \"contentTopic\": \"/myapp/1/latest/proto\", \"timestamp\": "$(date +%s%N)"}}"

printf "\nSleeping 1 second\n"
sleep 1

printf "\nCheck message in NODE 1\n"
curl -v -X GET "http://127.0.0.1:$rest_port/relay/v1/messages/$encoded_pubsub_topic" -H "Content-Type: application/json"

Logs: lightpush_node.log relay_node.log
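
To tell a false error apart from a real failure, it can help to record the lightpush HTTP status and the delivery check separately. The sketch below does that with the same two REST endpoints the script uses. The helper names (`lightpush_status`, `relay_messages`, `response_has_payload`, `classify_result`) and the outcome labels are made up for illustration, and the payload check is a plain substring match rather than real JSON parsing.

```shell
#!/bin/bash
# Sketch only: helper names and outcome labels are hypothetical.
# Nothing here runs on its own; the functions are meant to be called
# from the reproduction script.

# POST a lightpush message and print only the HTTP status code.
# $1 = REST base URL of the lightpush node, $2 = JSON request body
lightpush_status() {
    curl -s -o /dev/null -w '%{http_code}' -X POST "$1/lightpush/v1/message" \
        -H "Content-Type: application/json" -d "$2"
}

# Fetch the cached relay messages from the relay node.
# $1 = REST base URL of the relay node, $2 = URL-encoded pubsub topic
relay_messages() {
    curl -s -X GET "$1/relay/v1/messages/$2" -H "Content-Type: application/json"
}

# Succeed if the response text ($1) contains the quoted base64 payload ($2).
response_has_payload() {
    case "$1" in
        *"\"$2\""*) return 0 ;;
        *) return 1 ;;
    esac
}

# Combine both observations into one label.
# $1 = HTTP status of the lightpush call, $2 = "yes" or "no" (message found on the relay node)
classify_result() {
    case "$1" in
        2??) rest_ok=yes ;;
        *)   rest_ok=no ;;
    esac
    case "$rest_ok:$2" in
        yes:yes) echo "ok" ;;           # accepted and delivered
        no:yes)  echo "false-error" ;;  # error returned, but the message arrived
        yes:no)  echo "lost" ;;         # accepted, but the message is missing
        *)       echo "failed" ;;       # error returned and nothing arrived
    esac
}
```

In the script above, `classify_result` would receive the output of `lightpush_status "http://127.0.0.1:25908" "$body"` and the result of running `response_has_payload` on `relay_messages "http://127.0.0.1:$rest_port" "$encoded_pubsub_topic"`; here `$body` stands for the JSON request built in the lightpush step. The behavior reported in this issue corresponds to the `false-error` label.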

fbarbu15 commented 1 month ago

@Ivansete-status can you please have a look at this one? Thanks

Ivansete-status commented 1 month ago

Thanks for the detailed description, @fbarbu15! The relay node complains that it doesn't have any connected peers, so the message cannot be published. In this case, I suggest connecting an additional relay node to the current relay node. With that, the test should work. Let me know if I can help with anything else.
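
A minimal sketch of that suggestion, reusing the flags from the reproduction script to start a third container as a plain relay node. The function name `start_extra_relay`, the `DOCKER` override, and the example IP address are hypothetical. `--staticnode` is assumed to be the nwaku option for dialing a peer directly by multiaddr; please confirm it against `--help` for your image before relying on it.

```shell
#!/bin/bash
# Sketch only: start an extra relay node that peers with the existing relay node.
# Set DOCKER=echo for a dry run that prints the command instead of running it.
#
# $1 = image, $2 = IP on the waku docker network, $3 = cluster id,
# $4 = pubsub topic, $5 = ENR URI of node 1, $6 = multiaddr of node 1
start_extra_relay() {
    "${DOCKER:-docker}" run -d -i -t "$1" \
        --listen-address=0.0.0.0 \
        --nat=extip:"$2" \
        --relay=true \
        --cluster-id="$3" \
        --pubsub-topic="$4" \
        --discv5-discovery=true \
        --discv5-bootstrap-node="$5" \
        --staticnode="$6"   # assumed flag: direct connection to node 1
}

# Usage with the variables from the script above (not executed here);
# 172.18.77.3 is a made-up free address in the waku subnet:
#   container_id3=$(start_extra_relay "$node_1" 172.18.77.3 "$cluster_id" "$pubsub_topic" "$enrUri" "$multiaddr_with_id")
#   docker network connect --ip 172.18.77.3 waku "$container_id3"
```

Starting this node before the lightpush step gives the relay node a peer to publish to.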

fbarbu15 commented 1 month ago

Indeed, adding a new node fixed this problem. Thanks!