Closed ci-work closed 2 years ago
related: https://github.com/helium/erlang-libp2p/pull/429 - etl generates a large number of ARP requests via the gateway status update process. Increasing connections considerably and asking a limited set of random connections improves overall success without a negative network impact or filling connected devices' throttle limits.
does etl actually? The new "online" is really only to check for on chain activity since everything else is so variable. What's still causing ARPs?
@madninja yeah, up to 200 per minute, or whatever MAX_REQUEST_RATE is set to in gateway status
default ARP limit is 10 per minute
so it can hit the ARP throttle on connected peers in ~9 seconds (usually much quicker) - it makes a surprisingly high number of ARP requests even at the default 200 per minute
we were running w/ 600 per minute for a while with 300 connected peers, the split/shuffle connection code, and a config that hits our own nodes (which have unlimited ARP responses) 50% of the time. gossip just doesn't get around fast enough, so etl is basically constantly asking for new information. if you stick in some crude logging you'd be shocked at the number of ARP requests it makes
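A rough back-of-the-envelope sketch of the throttle arithmetic above. It assumes requests go out at a steady rate and are spread evenly over the peers being asked, which is an idealization; the function name and parameters are hypothetical and not part of blockchain-etl or erlang-libp2p.

```python
def seconds_until_throttled(requests_per_minute: float,
                            per_peer_limit: int,
                            peers_asked: int) -> float:
    """Seconds until each asked peer has received `per_peer_limit` requests.

    Assumption: a steady request rate, split evenly across `peers_asked`
    peers. Real traffic is burstier and less even, so a peer can hit its
    limit sooner than this estimate.
    """
    if requests_per_minute <= 0 or peers_asked <= 0:
        raise ValueError("rate and peer count must be positive")
    # Each peer sees requests_per_minute / peers_asked requests per minute,
    # so it reaches the limit after limit / (that rate) minutes.
    return per_peer_limit * 60.0 * peers_asked / requests_per_minute


if __name__ == "__main__":
    for rate, peers in [(200, 1), (200, 3), (600, 1)]:
        secs = seconds_until_throttled(rate, per_peer_limit=10, peers_asked=peers)
        print(f"{rate}/min over {peers} peer(s): limit of 10 reached in {secs:.1f}s")
```

Under these assumptions, 200 per minute aimed at a single peer reaches a limit of 10 in 3 seconds, and the same rate split over three peers reaches it in 9 seconds.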
but where in the gateway_status code is it actually doing ARP requests?
https://github.com/helium/blockchain-etl/blob/master/src/be_db_gateway_status.erl#L264
case be_peer_status:peer_stale(Address, PeerBook, true) of
calls: https://github.com/helium/blockchain-etl/blob/master/src/be_peer_status.erl#L74
peer_stale(Address, PeerBook, Refresh) ->
    case libp2p_peerbook:get(PeerBook, Address) of
        {ok, Peer} ->
            case libp2p_peer:is_stale(Peer, ?STALE_PEER_TIME) of
                true when Refresh ->
                    libp2p_peerbook:refresh(PeerBook, Address),
and that refresh does the ARP request
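To make the effect of that call path concrete, here is a toy Python model of it, not the Erlang code itself. The `PeerBook` class, the `STALE_PEER_TIME` value, the return values, and the handling of unknown peers are all assumptions, since the quoted snippet is truncated. The only point it illustrates is that every staleness check on a stale peer with refresh enabled issues one more refresh request.

```python
from dataclasses import dataclass, field

# Hypothetical value in seconds; the real ?STALE_PEER_TIME is not shown in the thread.
STALE_PEER_TIME = 3600


@dataclass
class PeerBook:
    """Toy stand-in for a peerbook: address -> time of last peer update."""
    last_updated: dict = field(default_factory=dict)
    refresh_requests: int = 0

    def refresh(self, address: str) -> None:
        # In the real system this is where the ARP request would go out.
        # The toy version only counts it; no fresh data ever arrives.
        self.refresh_requests += 1


def peer_stale(address: str, peerbook: PeerBook, refresh: bool, now: float) -> bool:
    """Return True if the peer entry is stale, requesting a refresh if asked to."""
    updated = peerbook.last_updated.get(address)
    if updated is None:
        # Assumption: unknown peers count as stale and trigger no refresh.
        return True
    stale = (now - updated) > STALE_PEER_TIME
    if stale and refresh:
        peerbook.refresh(address)
    return stale


def status_pass(addresses, peerbook: PeerBook, now: float, refresh: bool = True) -> int:
    """Check every address once; return how many refresh requests that caused."""
    before = peerbook.refresh_requests
    for address in addresses:
        peer_stale(address, peerbook, refresh, now)
    return peerbook.refresh_requests - before
```

In this model, a pass over gateways whose entries are still stale costs one request per stale gateway, and repeating the pass before any update arrives costs the same again.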
aw shoot, I thought I'd squashed that. I need to reconcile this with the http change that returns online status based on chain activity vs that check there
need to revert this asap @madninja. the coalesce is required so that gateways with no status are added to the list of gateways to get status for. without it, anything missing in gateway_status that is present in gateway_inventory never gets added. put another way, the where clause then invalidates the left join
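The left join point can be reproduced with a small, self-contained sketch. It uses Python's built-in sqlite3 rather than Postgres, and the schema, the `address` column, the sample rows, and the cutoff are all made up for illustration; only the table names and `updated_at` come from the thread. The actual etl query is not shown here, so treat this as a model of the behaviour rather than the real statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical minimal schema; updated_at is epoch seconds.
    CREATE TABLE gateway_inventory (address TEXT PRIMARY KEY);
    CREATE TABLE gateway_status (
        address TEXT PRIMARY KEY,
        updated_at INTEGER NOT NULL
    );
    INSERT INTO gateway_inventory VALUES ('gw_fresh'), ('gw_old'), ('gw_missing');
    -- gw_missing has no gateway_status row at all.
    INSERT INTO gateway_status VALUES ('gw_fresh', 1000), ('gw_old', 100);
""")

CUTOFF = 500  # rows last updated before this count as needing a status check

# Without COALESCE: for gw_missing the join yields updated_at = NULL,
# NULL < cutoff is not true, and the WHERE clause drops the row.
WITHOUT_COALESCE = """
    SELECT i.address
    FROM gateway_inventory i
    LEFT JOIN gateway_status s ON s.address = i.address
    WHERE s.updated_at < ?
    ORDER BY i.address
"""

# With COALESCE: a missing status row is treated as updated at time 0,
# so it passes the filter and is picked up for a status check.
WITH_COALESCE = """
    SELECT i.address
    FROM gateway_inventory i
    LEFT JOIN gateway_status s ON s.address = i.address
    WHERE COALESCE(s.updated_at, 0) < ?
    ORDER BY i.address
"""


def gateways_needing_status(query: str) -> list:
    """Run one of the queries above and return the matching addresses."""
    return [row[0] for row in conn.execute(query, (CUTOFF,))]


if __name__ == "__main__":
    print(gateways_needing_status(WITHOUT_COALESCE))  # ['gw_old']
    print(gateways_needing_status(WITH_COALESCE))     # ['gw_missing', 'gw_old']
```

The NOT NULL constraint on `updated_at` does not help here: the NULL comes from the missing joined row, not from the column's stored values.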
remove coalesce so indexes are used. updated_at is already not null + default now(), so the coalesce to default to_timestamp(0) isn't needed