dnsimple / erldns

DNS server, in Erlang.
MIT License
398 stars 98 forks source link

erldns_udp_server: Fix flow control #134

Open weiss opened 2 years ago

weiss commented 2 years ago

Reset the {active, N} socket option only when receiving an udp_passive message, rather than on every received packet, which provides no flow control and therefore no advantage over {active, true} mode. (BTW, resetting {active, N} actually adds N to the current count, so adding 100 on every packet could in theory overflow the counter at some point.)

weiss commented 2 years ago

I should mention that I didn't actually test this change, sorry about that. I just stumbled over the code while looking into various UDP servers written in Erlang (I was trying to figure out sane UDP socket options for a TURN server).

DXTimer commented 2 years ago

Thanks, @weiss for opening the PR. I've checked the documentation and this does seem like a good approach. We will run some tests before we accept your contribution.

m0rcq commented 1 week ago

@weiss - have you managed to test this thoroughly and performance benchmarks to compare against current implementation? If you have done this, may you please share it with us?

weiss commented 6 days ago

@m0rcq I didn't test this, sorry (I don't use erldns myself). However, I guess this might be a misunderstanding: My commit is meant to be a bug fix, not an "enhancement" of some way.

For context, gen_udp offers three modes of operation regarding flow control of incoming traffic:

  1. passive mode: You receive packets by calling gen_udp:recv/2,3.
  2. active mode: You receive packets as Erlang messages. Advantage: Waiting for traffic doesn't block your process. Disadvantage: The sender may overflow your process/memory if you don't manage to process the incoming flood in time.
  3. {active, N} mode: Tries to give you the best of both worlds. It works like this: You receive N packets as Erlang messages. The socket then switches to passive mode and sends you a udp_passive message. The idea is that you'll switch the socket back to active mode when receiving that message. This mechanism ensures the number of unprocessed UDP messages will never exceed N.

Your code is enabling {active, N} mode but then handling it incorrectly: Rather than waiting for the udp_passive message, you increase N by another 100 on each and every incoming packet. That way, the socket is never switched to passive mode, so a sender could overflow your process just like with traditional active mode.

If you really want that behavior (for max. performance), you'd just specify {active, true} (that's actually the default) and omit the inet:setopts/2 call altogether. If you want {active, N} mode (for safety), as your code suggests, you want something like my patch to fix it.

m0rcq commented 1 day ago

@m0rcq I didn't test this, sorry (I don't use erldns myself). However, I guess this might be a misunderstanding: My commit is meant to be a bug fix, not an "enhancement" of some way.

For context, gen_udp offers three modes of operation regarding flow control of incoming traffic:

  1. passive mode: You receive packets by calling gen_udp:recv/2,3.
  2. active mode: You receive packets as Erlang messages. Advantage: Waiting for traffic doesn't block your process. Disadvantage: The sender may overflow your process/memory if you don't manage to process the incoming flood in time.
  3. {active, N} mode: Tries to give you the best of both worlds. It works like this: You receive N packets as Erlang messages. The socket then switches to passive mode and sends you a udp_passive message. The idea is that you'll switch the socket back to active mode when receiving that message. This mechanism ensures the number of unprocessed UDP messages will never exceed N.

Your code is enabling {active, N} mode but then handling it incorrectly: Rather than waiting for the udp_passive message, you increase N by another 100 on each and every incoming packet. That way, the socket is never switched to passive mode, so a sender could overflow your process just like with traditional active mode.

If you really want that behavior (for max. performance), you'd just specify {active, true} (that's actually the default) and omit the inet:setopts/2 call altogether. If you want {active, N} mode (for safety), as your code suggests, you want something like my patch to fix it.

@weiss - thanks a lot for the thorough explanation, appreciated! We are discussing internally how to allocate a bit of time to benchmark this and verify erldns' behaviour under load with the suggested fix. Stay tuned, we will update the ticket once we have something to share.