containers / aardvark-dns

Authoritative DNS server for A/AAAA container records. Forwards other requests to the host's /etc/resolv.conf
Apache License 2.0

add some basic perf check script #480

Closed Luap99 closed 2 months ago

Luap99 commented 3 months ago

Use podman to spawn some containers and then check the performance of aardvark-dns.

Comparing the results after the rework with v1.11, I can see significant gains. While the total time is about the same, the new version uses only around 2/3 of the cycles, which means CPU utilization is much better now.

openshift-ci[bot] commented 3 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Luap99

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
- ~~[OWNERS](https://github.com/containers/aardvark-dns/blob/main/OWNERS)~~ [Luap99]

Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment

Luap99 commented 3 months ago

new:

 Performance counter stats for process id '1449032':

          3,897.06 msec task-clock:u                     #    0.973 CPUs utilized             
                 0      context-switches:u               #    0.000 /sec                      
                 0      cpu-migrations:u                 #    0.000 /sec                      
                40      page-faults:u                    #   10.264 /sec                      
     6,829,395,289      cycles:u                         #    1.752 GHz                       
     6,851,457,495      instructions:u                   #    1.00  insn per cycle            
     1,395,143,347      branches:u                       #  357.999 M/sec                     
        12,815,753      branch-misses:u                  #    0.92% of all branches           

       4.003890371 seconds time elapsed

old (v1.11.0):


 Performance counter stats for process id '1448088':

          4,730.64 msec task-clock:u                     #    1.182 CPUs utilized             
                 0      context-switches:u               #    0.000 /sec                      
                 0      cpu-migrations:u                 #    0.000 /sec                      
                32      page-faults:u                    #    6.764 /sec                      
     9,741,298,492      cycles:u                         #    2.059 GHz                       
    14,255,917,727      instructions:u                   #    1.46  insn per cycle            
     2,969,731,831      branches:u                       #  627.765 M/sec                     
        24,753,091      branch-misses:u                  #    0.83% of all branches           

       4.003881880 seconds time elapsed

I ran it several times and the results were pretty consistent.
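To put numbers on the comparison, the ratios between the two runs can be derived directly from the counters quoted above. The following is only a quick sketch in Python; the values are copied from the two perf outputs, and the dictionary and function names are made up for illustration.

```python
# Counter values copied from the two `perf stat` outputs above.
NEW = {
    "task_clock_ms": 3897.06,
    "cycles": 6_829_395_289,
    "instructions": 6_851_457_495,
    "branches": 1_395_143_347,
}
OLD = {  # v1.11.0
    "task_clock_ms": 4730.64,
    "cycles": 9_741_298_492,
    "instructions": 14_255_917_727,
    "branches": 2_969_731_831,
}


def ratios(new: dict, old: dict) -> dict:
    """Return new/old for every counter present in both runs."""
    return {key: new[key] / old[key] for key in new if key in old}


def ipc(run: dict) -> float:
    """Instructions per cycle for a single run."""
    return run["instructions"] / run["cycles"]


if __name__ == "__main__":
    for key, value in ratios(NEW, OLD).items():
        print(f"{key:>14}: new/old = {value:.3f}")
    print(f"IPC new = {ipc(NEW):.2f}, IPC old = {ipc(OLD):.2f}")
```

With these inputs the new version needs about 70% of the cycles (roughly the 2/3 mentioned above) and a bit under half of the instructions and branches, while wall time is the same 4 seconds in both runs.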

Luap99 commented 3 months ago

@mheon PTAL (low prio)

mheon commented 3 months ago

Seems like network latency probably dominates the wall time...

Suggestion: Time the first one separately, as it shouldn't have a full DNS cache.

Luap99 commented 3 months ago

> Time the first one separately, as it shouldn't have a full DNS cache.

What cache? I don't see how I can time the first one differently.

mheon commented 3 months ago

The coredns server should be caching, no? We'll store the result locally and respond based on it within the limits of the TTL.

mheon commented 3 months ago

(If it's not, that's a massive potential performance optimization right there...)

Luap99 commented 3 months ago

> The coredns server should be caching, no? We'll store the result locally and respond based on it within the limits of the TTL.

There is no cache at all AFAICT, and we only measure internal names here, so it goes through the normal map lookup each time.
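To illustrate what "normal map lookup" means for internal names, here is a rough sketch in Python. It is not aardvark-dns code (which is written in Rust); the table contents, container names, addresses, and the `lookup` function are all hypothetical.

```python
import ipaddress

# Hypothetical in-memory table mapping container names to their addresses.
RECORDS = {
    "web": [ipaddress.ip_address("10.89.0.2")],
    "db": [ipaddress.ip_address("10.89.0.3"), ipaddress.ip_address("fd00::3")],
}


def lookup(name: str, want_v6: bool = False):
    """Answer an A (or AAAA) query for an internal name straight from the map.

    Returns None for unknown names; a real server would forward those to the
    upstream resolvers instead.
    """
    key = name.rstrip(".").lower()  # DNS names are case-insensitive
    addrs = RECORDS.get(key)
    if addrs is None:
        return None
    version = 6 if want_v6 else 4
    return [addr for addr in addrs if addr.version == version]
```

Every query for an internal name hits the map directly, so there is no "first query is slower" effect to time separately in this check.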

Luap99 commented 3 months ago

> benchmarking aardvark-dns as isolated DNS server

I mean, this sort of does that. I don't know how you would test it in isolation. You need a client sending requests; whether this is a podman container or not seems irrelevant for the most part. Of course the total time will be different, as we would not have the overhead of podman starting/stopping containers, but honestly I do not care about that at all.

So far this seems more than good enough for a quick check, which is all I care about. I just needed to know whether my changes make it better or worse. This should not be considered an actual benchmark.
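For the "client sending requests" part, a podman-free client could look roughly like the sketch below. This is not what the script in this PR does; it is a minimal Python illustration that hand-encodes an A query in the RFC 1035 wire format and times sequential UDP round trips. The function names are made up, and the server address, port, and queried name are placeholders the caller has to supply.

```python
import socket
import struct
import time


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Encode a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: id, flags (recursion desired), 1 question, no other sections.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def time_queries(server: str, name: str, count: int, port: int = 53) -> float:
    """Send `count` queries one after another and return the elapsed seconds."""
    packet = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(2.0)
        start = time.perf_counter()
        for _ in range(count):
            sock.sendto(packet, (server, port))
            sock.recvfrom(512)  # wait for the reply before the next query
        return time.perf_counter() - start
```

A call such as `time_queries(server_ip, "somecontainer", 10000)` would then give a wall-clock figure to set next to the perf counters, with `server_ip` being whatever address the server under test listens on.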

flouthoc commented 3 months ago

> So far this seems more than good enough for a quick check, which is all I care about. I just needed to know whether my changes make it better or worse. This should not be considered an actual benchmark.

Yes, I agree this looks good for a quick check.

> I don't know how you would test isolated

Initially, when we were developing aardvark, I remember checking its e2e behavior without podman. If I find that hacky arrangement on my system, we can use those bits to measure throughput and other metrics. But let's leave it for another day, if we ever decide to have raw benchmarks for aardvark-dns.

Overall, the existing perf check in this PR LGTM.

baude commented 2 months ago

/lgtm