waku-org / nwaku

Waku node and protocol.
Other
201 stars 53 forks source link

bug: RLNv2 rate limit imposed sooner than expected #2946

Open romanzac opened 2 months ago

romanzac commented 2 months ago

Problem

During interop test_valid_payloads_at_slow_rate execution, 20th message sent over LRN relay was rejected with error

ERROR    tests.relay.test_rln:test_rln.py:32 Payload A very long string failed: Error: 400 Client Error: Bad Request for url: http://127.0.0.1:63912/relay/v1/messages/%2Fwaku%2F2%2Frs%2F1%2F0 with response: b'Failed to publish: RLN validation failed'

Impact

Low (revised from High) occurrence, high impact.

To reproduce

  1. Please checkout https://github.com/waku-org/waku-interop-tests/pull/62/commits/6b10c1847db6b1355a4594a6fbd47e03c22dc5be
  2. cd waku-interop-tests
  3. python -m venv .venv
  4. source .venv/bin/activate
  5. pip install -r requirements.txt
  6. pre-commit install
  7. pytest tests/relay/test_rln.py -k 'test_valid_payloads_at_slow_rate'

Expected behavior

All 100 messages delivered as expected by configuration rln_relay_user_message_limit=100, rln_relay_epoch_sec=600.

Screenshots/logs

test.log node1_2024-07-31_18-06-09bb5caa43-2be1-4c92-b006-c674bb4c2814wakuorg_nwaku:latest.log node1_2024-07-31_18-06-09eba792d6-3fc8-4662-8afe-2c127d016536wakuorg_nwaku:latest.log node1_2024-07-31_18-06-2442b05617-7bc8-4300-bd23-317e2a311bfcwakuorg_nwaku:latest.log node2_2024-07-31_18-06-09eba792d6-3fc8-4662-8afe-2c127d016536wakuorg_nwaku:latest.log

nwaku version/commit hash

NWaku - wakunode2-v0.31.0-rc.1-10-ge4e01f (e4e01fabfe7265c820ce5352091e29d2fabe6506)

alrevuelta commented 2 months ago

Thanks, I managed to reproduce it with the interop tests.

The problem is here:

WRN 2024-07-31 11:59:51.671+00:00 invalid message: invalid proof topics="waku rln_relay" tid=1 file=rln_relay.nim:240 payloadLen=26

https://github.com/waku-org/nwaku/blob/v0.31.1-rc.0/waku/waku_rln_relay/rln_relay.nim#L240

I have seen everything in the past (invalid root, spam, etc) but not sure under which circumstances a proof can be invalid, assuming honest nwaku nodes. Quite interesting I can confirm it happens always at message index 20. I have even tried modifying the content of such message, but still the same problem.

Will further investigate.

alrevuelta commented 2 months ago

Seems the issue is the following:

Solution:

Changes for it to work with onchain rln aka rln-relay-dynamic=true

image
alrevuelta commented 2 months ago

If you want to have a static RLN with configurable message size, we can add a flag for this:

https://github.com/waku-org/nwaku/blob/v0.31.1-rc.0/waku/waku_rln_relay/group_manager/static/group_manager.nim#L36

(right now is hardcoded to 20)

alrevuelta commented 2 months ago

To close this issue as well:

romanzac commented 2 months ago

To close this issue as well:

  • This would have been prevented if nwaku errored when providing both a contract/eth-endpoint and using static RLN. This configuration does not make sense and should fail.

I agree with your findings. May I propose we re-qualify this issue to improvement ? In the meantime, I will change my tests to filter out unnecessary params.

romanzac commented 2 months ago

If you want to have a static RLN with configurable message size, we can add a flag for this:

https://github.com/waku-org/nwaku/blob/v0.31.1-rc.0/waku/waku_rln_relay/group_manager/static/group_manager.nim#L36

(right now is hardcoded to 20)

This proposal will self document itself. Looks good to me.

gabrielmer commented 2 months ago

So to be sure, we should:

1- return an error if a node is run with static RLN + providing a contract/endpoint 2- make static RLN message limit configurable

Is it correct? @romanzac

romanzac commented 2 months ago

So to be sure, we should:

1- return an error if a node is run with static RLN + providing a contract/endpoint 2- make static RLN message limit configurable

Is it correct? @romanzac

Yes, if static RLN is not useful for troubleshooting of dynamic RLN normally used in production. When there is an additional need to remove config params while troubleshooting, it adds work. I would suggest warning message and ignore the contract/endpoint. Plus make static RLN message limit configurable.