hickory-dns / hickory-dns

A Rust based DNS client, server, and resolver
Other
4.09k stars 467 forks source link

[Performance] HickoryDNS loses nearly 50% of queries #2613

Open mokeyish opened 1 day ago

mokeyish commented 1 day ago

Describe the bug

I made a simple DNS performance test repository and obtained the most popular 1 million domain names as a test set.

Then it was discovered that HickoryDNS actually lost nearly 50% of the queries.

To Reproduce

https://github.com/mokeyish/dnsperf-testing

Steps to reproduce the behavior:

Expected behavior

Fix this performance issue.

System:

Version: Crate: server Version: https://github.com/hickory-dns/hickory-dns/commit/001ced84c37a3f1b1c43e1244df0ba436ffcb9c5

marcus0x62 commented 1 day ago

Hi @mokeyish,

Running the same git commit as you, this is what I see against 1.1.1.1:

Statistics:

  Queries sent:         9999
  Queries completed:    9952 (99.53%)
  Queries lost:         47 (0.47%)

  Response codes:       NOERROR 9855 (99.03%), NXDOMAIN 97 (0.97%)
  Average packet size:  request 30, response 78
  Run time (s):         28.946179
  Queries per second:   343.810490

  Average Latency (s):  0.229788 (min 0.011245, max 4.347742)
  Latency StdDev (s):   0.340956

and 8.8.8.8:

Statistics:

  Queries sent:         9999
  Queries completed:    9992 (99.93%)
  Queries lost:         7 (0.07%)

  Response codes:       NOERROR 9895 (99.03%), NXDOMAIN 97 (0.97%)
  Average packet size:  request 30, response 77
  Run time (s):         14.472769
  Queries per second:   690.400020

  Average Latency (s):  0.126741 (min 0.018369, max 4.703716)
  Latency StdDev (s):   0.213172

Most of the few query failures I did see went away with a longer timeout (I was getting late reply messages in a few cases.)

So, with DNS servers that are reasonably close to me, I can't duplicate the behavior you are seeing. If you can reproduce this readily, trace logs from hickory would be helpful.

mokeyish commented 1 day ago

@marcus0x62 Are your two directly testing 1.1.1.1 and 8.8.8.8?

One of my tests is to test 119.29.29.29 directly, and the other is to start HickoryDNS and set the upstream to 119.29.29.29.

marcus0x62 commented 1 day ago

@marcus0x62 Are your two directly testing 1.1.1.1 and 8.8.8.8?

No, the output of dnsperf in my previous message shows requests to Hickory, configured to forward to 1.1.1.1 and 8.8.8.8. I used the same forwarding configuration as in your issue report.

mokeyish commented 1 day ago

Here are my logs from testing HickoryDNS (left) and dnsperf (right).

image

mokeyish commented 1 day ago

Your data is so close, is it because my computer performance is relatively low? I am running it in WSL.

mokeyish commented 21 hours ago

@marcus0x62 I ran it on cloudcone and got exactly the same result as yours.

marcus0x62 commented 8 hours ago

I think I'd need to see the full log output to venture any sort of guess as to what's going on. Trying running your server with:

RUST_LOG=trace cargo run \
               --all-features \
               --bin hickory-dns \
               -- \
                  --debug -p 8054 \
                 --config tests/test-data/test_configs/example_forwarder.toml --debug

and attach the full log file.