emeraldpay / dshackle

Fault Tolerant Load Balancer for Ethereum and Bitcoin APIs
Apache License 2.0

switch to fallback server #251

Closed mazeboard closed 9 months ago

mazeboard commented 1 year ago

I am testing Dshackle with a primary server that always responds with HTTP status code 429 and a fallback server that uses Infura (mainnet).

I expected Dshackle, upon receiving HTTP status 429 from the primary server, to switch to the fallback server. Instead, Dshackle returns HTTP status 200 with the primary server's response body and never switches to the fallback server.

The primary server is implemented using the ncat command as follows:

while true; do {   
  echo -n -e 'HTTP/1.1 429 Too Many Requests\r\nContent-Length: 77\r\n\r\n{"jsonrpc":"2.0","id":1,"error":{"code":-32005,"message":"too many request"}}'; 
} | ncat -l 0.0.0.0 8088; done
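For anyone who wants to reproduce this without ncat, the same always-429 upstream can be sketched in Python. This is only an approximation of the loop above: the names `AlwaysRateLimited` and `start_mock` are made up for illustration, and unlike the ncat version it reads the incoming request body before replying.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Same JSON RPC error body as in the ncat loop (77 bytes).
BODY = b'{"jsonrpc":"2.0","id":1,"error":{"code":-32005,"message":"too many request"}}'


class AlwaysRateLimited(BaseHTTPRequestHandler):
    """Hypothetical stand-in upstream: every request gets HTTP 429."""

    def do_POST(self):
        # Drain the request body so the client can finish sending it.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        self.send_response(429, "Too Many Requests")
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, format, *args):
        # Keep the console quiet; remove this override to see requests.
        pass


def start_mock(port=8088, host="0.0.0.0"):
    """Start the mock in a background thread and return the server object."""
    server = HTTPServer((host, port), AlwaysRateLimited)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `start_mock()` and keeping the process alive should give an upstream equivalent to the one at port 8088 in the config below; `server.shutdown()` stops it.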

and the config file dshackle.yaml is:

version: v1
host: 0.0.0.0 # (1)
port: 2449
tls: # (2)
  enabled: false
proxy:
  host: 0.0.0.0 # (3)
  port: 8545
  routes:
    - id: eth
      blockchain: ethereum
access-log:
  enabled: true
  filename: /tmp/access_log.jsonl
request-log:
  enabled: true
  filename: /tmp/request_log.jsonl
monitoring:
  enabled: true
  jvm: false
  extended: false
  prometheus:
    enabled: true
    bind: 0.0.0.0
    port: 8000
    path: /status/prometheus
cluster:
  upstreams: # (4)
    - id: local-eth
      blockchain: ethereum # (5)
      role: primary
      options:
         priority: 10
         disable-validation: true
      connection:
        ethereum:
          rpc: # (6)
            url: "http://ankbot:8088" # (7)
          #ws:
          #  url: "wss://mainnet.infura.io/ws/v3/${INFURA_USER}"
    - id: infura-eth # (9)
      blockchain: ethereum
      role: fallback
      options:
         priority: 20
         disable-validation: true
      connection:
        ethereum:
          rpc: # (6)
            #url: "https://ankbot:8088"
            url: "https://mainnet.infura.io/v3/${INFURA_USER}" # (7)
          #ws:
          #  url: "wss://mainnet.infura.io/ws/v3/${INFURA_USER}"

So Dshackle does not use the fallback server correctly: it returns an error instead of using the fallback.

Thanks in advance for your help
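A small client can make the symptom easier to see. The sketch below assumes the proxy from the config above is reachable at `http://localhost:8545/eth` (host, and the `/eth` path derived from the route id, are assumptions), and the helper names `build_request`, `call_proxy`, and `describe` are hypothetical. It reports whether a reply looks like it came from the always-429 mock (error code -32005) or from a real upstream.

```python
import json
import urllib.error
import urllib.request

# Assumption: proxy on localhost, route id "eth" exposed as the /eth path.
PROXY_URL = "http://localhost:8545/eth"


def build_request(method, params=None, request_id=1):
    """Encode a JSON RPC 2.0 request as bytes."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": params or [],
    }
    return json.dumps(message).encode("utf-8")


def call_proxy(url, payload, timeout=10):
    """POST the payload and return (http_status, body_text), even on HTTP errors."""
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8")


def describe(status, body_text):
    """Guess which upstream answered, based on the body returned by the proxy."""
    try:
        message = json.loads(body_text)
    except ValueError:
        return f"HTTP {status}: body is not JSON"
    if not isinstance(message, dict):
        return f"HTTP {status}: body is not a JSON RPC object"
    error = message.get("error")
    if isinstance(error, dict) and error.get("code") == -32005:
        return f"HTTP {status}: rate-limit error from the primary was passed through"
    if "result" in message:
        return f"HTTP {status}: got a result, so a working upstream answered"
    return f"HTTP {status}: unexpected JSON RPC message"
```

With the setup described in this issue, `describe(*call_proxy(PROXY_URL, build_request("eth_blockNumber")))` would be expected to report the passed-through rate-limit error with HTTP 200, which is the behavior being reported.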

splix commented 1 year ago

You have disable-validation: true, which disables all the checks. Can you please try without this option?

splix commented 1 year ago

Though it doesn't seem right anyway. I will check it, because disabling the validation should not cause this.

splix commented 9 months ago

Sorry for the delay, but this one was a bit tricky.

To provide some context: 'Primary' and 'Fallback' refer to upstream availability, not to the responses the upstreams provide. An 'error' in the response doesn't necessarily mean that the entire request has failed, because an 'error' response can be valid for some types of calls. However, I believe that HTTP status 429 always means that something is wrong, and the response body should not be considered an actual response.

In fact, Dshackle previously did exactly what you expect. But it turns out some servers may respond with 500 or 404 in addition to a normal JSON RPC response, so Dshackle started to accept those responses as valid if the body is a valid JSON RPC message.

I agree that status code 429 must be treated as a connection error, and any response body should be disregarded. This is what the fix does. But this logic applies only to situations like yours, where another valid upstream is available. Otherwise Dshackle returns the response as is, because it has no other option.
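The rule described above can be summarized in a short sketch. This is not Dshackle's actual code (Dshackle is not written in Python); the function names and the returned labels are invented for illustration, and the handling of bodies that are not JSON RPC is an assumption based on the earlier behavior mentioned in this comment.

```python
import json


def is_json_rpc(body: str) -> bool:
    """True if the body parses as a JSON RPC 2.0 response object."""
    try:
        message = json.loads(body)
    except ValueError:
        return False
    return (
        isinstance(message, dict)
        and message.get("jsonrpc") == "2.0"
        and ("result" in message or "error" in message)
    )


def handle_upstream_reply(status: int, body: str, has_other_upstream: bool) -> str:
    """Decide what to do with one upstream reply (hypothetical labels)."""
    if status == 429:
        # 429 is treated as a connection error, but only when another
        # valid upstream can take the request; otherwise pass it back.
        return "retry_on_other" if has_other_upstream else "return_as_is"
    if is_json_rpc(body):
        # Some servers send 500 or 404 together with a valid JSON RPC
        # message, so the body is accepted regardless of the status.
        return "accept"
    # Assumption: anything else counts as a failed call.
    return "error"
```

For the setup in this issue, `handle_upstream_reply(429, body, True)` yields `"retry_on_other"`, while the same reply with no other upstream available yields `"return_as_is"`.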