PowerDNS / pdns

PowerDNS Authoritative, PowerDNS Recursor, dnsdist
https://www.powerdns.com/
GNU General Public License v2.0

Auth: Improve Performance for Random Subdomain Attacks with SQL Backends #9326

Open klaus-nicat opened 4 years ago

klaus-nicat commented 4 years ago

Short description

Random subdomain queries can easily kill PowerDNS when SQL backends are used. This issue presents a pragmatic approach which for sure will not save the whales, but can help a lot of PowerDNS users.

Use case

To better understand why there is an issue, I will describe our problem. We currently operate ~2 million zones, soon many more. We run 60 public-facing servers using native replication (PostgreSQL with logical replication). Basically everything works fine, except when there are query patterns which look like, or are, random subdomain attacks. (I actually do not know whether these are real attacks or just strange query patterns.) For example, a few days ago we had an "attack" at 1,000,000 q/s, and I want to be able to handle that.

It seems we are in the same position Cloudflare was in 7 years ago: PowerDNS is no longer suitable, and we have to decide whether we can improve PowerDNS or have to go some other way. We want to improve PowerDNS: we have ideas, we have operational experience in what actually helps and what does not, and we are willing to contribute code. Of course, for us this only makes sense if the community and developers have an open ear and are willing to accept patches (and we will, of course, happily listen to suggestions on how to implement things).

PowerDNS, even with slow backends, performs quite well thanks to its caching: the packet cache and the query cache. But these caches only cache identical queries (packet cache) or identical labels (query cache), which does not help with random subdomain queries.

Description

During random subdomain attacks PowerDNS suffers mainly because every query needs at least 2 DB queries: the first to find the authoritative zone, and the second to get the RRs for the respective label. Hence, here is a pragmatic approach to eliminate these queries.
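
To make that cost concrete, here is an illustrative sketch of the two lookups (schematic SQL only, not the literal gsql backend statements; the queried name and the domain_id 42 are made up for the example):

```sql
-- Schematic: an incoming query for the random name 'x7fq2.www.example.com'.

-- Step 1: find the authoritative zone via longest suffix match.
SELECT id, name
  FROM domains
 WHERE name IN ('x7fq2.www.example.com', 'www.example.com', 'example.com', 'com')
 ORDER BY length(name) DESC
 LIMIT 1;

-- Step 2: fetch the RRs for the queried label within that zone
-- (domain_id 42 is made up for this example).
SELECT content, ttl, prio, type, domain_id, disabled::int, name, auth::int
  FROM records
 WHERE disabled = false AND name = 'x7fq2.www.example.com' AND domain_id = 42;
```

During a random subdomain attack, both steps miss every cache, so every attack packet translates into this DB work.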

If PowerDNS is configured with "query-cache-full-zone-ttl" (default=0=disabled), then PowerDNS will:

a) Refuse to start up if a backend is not compatible with this option.
b) Keep the domains table in memory (in the core, not in the backend). The list will be reloaded from the DB every query-cache-full-zone-ttl seconds in the background.
c) On the first query for a zone, load the whole zone into the core and cache it for query-cache-full-zone-ttl seconds (regardless of the RR TTLs). Hence, a zone is either completely cached or not cached at all.

What will be the advantages of this approach:

What is the drawback of this approach:

Why not use another backend or name server? Bind-backend or using Bind/NSD/Knot: I do not think that NOTIFY/XFR, slave-checks, zone provisioning scales to millions of zones.

Why dnsdist is the wrong answer to this problem: dnsdist is great, and we already use it to mitigate random subdomain attacks. As soon as a domain is under attack, we provision the zone on an NSD and configure dnsdist to forward queries to the NSD pool instead of the PowerDNS pool. But this does not scale anymore; every minute some other zone is under attack. The often-mentioned rate limiting does not help either: with rate limiting we also block legitimate queries, and our customers notice that. We have done that and rolled it back.

Why not aggressive NSEC(3) caching instead of full-zone caching: One idea is to implement aggressive caching of NSEC(3) to answer such queries faster. This would help during normal operation to keep cache usage low. During random subdomain attacks, however, that cache will not be much smaller than the full-zone cache. So honestly, maybe this would work as well as the full-zone cache, but it is probably not as easy to implement.

Any feedback? We would start hacking soon ....

Thanks Klaus

klaus-nicat commented 4 years ago

Correcting myself: when saying that PDNS makes 2 queries per random subdomain query, this already assumes that #9007 gets merged. Currently PDNS makes 3 queries, due to the additional NS lookup.

rgacogne commented 4 years ago

I personally find the idea interesting, although I'm pretty sure there are a lot of things to consider to make it work properly. Perhaps the records cache should be disabled when that option is enabled, by the way?

> Bind-backend or using Bind/NSD/Knot: I do not think that NOTIFY/XFR, slave-checks, zone provisioning scales to millions of zones

Would you mind explaining that part a bit? I see a few things that might be an issue, but I wonder if I'm missing some, and whether they can be fixed.

klaus-nicat commented 4 years ago

There are surely plenty of things to consider when this option is enabled, e.g. is the query-cache still useful, maybe for queries to the keys/metadata tables? Or do they have their own cache? I guess we will find that out.

Regarding DNS-based replication with NOTIFY/XFR: NOTIFYs are not reliable. Regardless of whether UDP (Bind) or TCP (Knot) is used, a NOTIFY is not reliable: if the secondary is not reachable at the time of the NOTIFY, the NOTIFY is lost. The primary will not resend it later. Hence, although NOTIFYs will probably work 99.9% of the time, they are not reliable, and the secondaries must use slave checks. I have not tried it, but I guess a slave check for 5 million domains will take its time. With 50+ secondaries I think it requires lots of SOA checks, and probably out-of-band jobs to check that all zones are provisioned on the secondaries and in sync with the primary. With DB replication there is nothing to do: the DB will take care that replicas are in sync and tell me the replication lag of every node.

I also had the idea of deploying PowerDNS+PostgreSQL at every secondary location, and there using Knot or NSD to XFR the zones from the local PowerDNS and serve them to the public. This would make provisioning and syncing more reliable (localhost instead of the Internet), but it of course still needs checks to ensure that NSD is in sync with PowerDNS, and it probably doubles RAM usage.

So I am not sure that PowerDNS+PostgreSQL is the best solution, but I hope to make it an acceptable one.

klaus-nicat commented 4 years ago

How to proceed: As proposed above, I think the speedup can be split into 2 main parts:

  1. To avoid DB queries for getAuth(), keep the zone list in memory in the core.
  2. To avoid DB queries for every subdomain, keep the whole zone in memory in the core, or use a technique similar to aggressive NSEC caching, adapted for non-DNSSEC zones.

Let us focus on the first problem (or shall we open multiple issues?). Our idea was to load the list of zones into RAM on startup and reload it in the background every X seconds (config option). That list should be in a format which allows quickly finding the authoritative zone (using the longest suffix match, as currently, without validating that parent zones really have the respective delegations/NS records). The query would probably be:

select domains.id, domains.name from domains LEFT JOIN records ON records.domain_id=domains.id AND records.type='SOA' AND records.name=domains.name WHERE records.disabled=false

This takes around 7s with 2 million domains on high-end hardware. The query could probably be sped up massively by only querying the domains table (~2s) and having the disabled flag also in the domains table (i.e. a trigger on the records table which also updates the domains table).
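
A minimal PostgreSQL sketch of that trigger idea (the `soa_disabled` column and the trigger/function names are made up for illustration, not existing schema):

```sql
-- Hypothetical sketch: mirror the SOA record's disabled flag into domains,
-- so the zone-list reload becomes a single-table scan (~2s instead of ~7s).
-- DELETE of the SOA record is not handled in this sketch.
ALTER TABLE domains ADD COLUMN soa_disabled boolean NOT NULL DEFAULT false;

CREATE OR REPLACE FUNCTION sync_soa_disabled() RETURNS trigger AS $$
BEGIN
  IF NEW.type = 'SOA' THEN
    UPDATE domains SET soa_disabled = NEW.disabled WHERE id = NEW.domain_id;
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER records_soa_disabled
AFTER INSERT OR UPDATE ON records
FOR EACH ROW EXECUTE FUNCTION sync_soa_disabled();

-- The zone-list reload then only touches the domains table:
SELECT id, name FROM domains WHERE soa_disabled = false;
```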

I think fast backends like LMDB or the bindbackend do not need this feature. Whether the feature should be implemented in the core/überbackend/gsql/gpgsql is beyond my PowerDNS knowledge. So feedback and suggestions are welcome.

cFire commented 4 years ago

This would be a very useful feature for us too. Currently the best way we have to mitigate this is to "sacrifice" a zone or IP range to heavy throttling with dnsdist. That is already not ideal in itself, and it also won't scale to a very distributed source-IP attack hitting many different zones.

v1shnya commented 2 years ago

As the first part of this request, the Zone-Cache has been implemented. Is there any chance that the rest will be implemented too? I mean "the whole zone is loaded into the core".

klaus-nicat commented 2 years ago

I think first there has to be some mechanism for query statistics per zone. Only with such counters can there be decisions about which zones to regularly load completely into the cache (query cache or packet cache), maybe similar to https://docs.powerdns.com/recursor/lua-config/ztc.html

klaus-nicat commented 1 year ago

My colleague installed metricbeat on one of our servers, so by accident we have DB query statistics in our dashboard :-) Here are some stats from one of our servers for the last 24h.

A) SELECT content,ttl,prio,type,domain_id,disabled::int,name,auth::int FROM records WHERE disabled=false and name=$1 and domain_id=$2 (554,075 executions)

B) select cryptokeys.id, flags, case when active then 1 else 0 end as active, case when published then 1 else 0 end as published, content from domains, cryptokeys where cryptokeys.domain_id=domains.id and name=$1 (238,678 executions)

C) SELECT content,ttl,prio,type,domain_id,disabled::int,name,auth::int FROM records WHERE disabled=false and type=$1 and name=$2 (121,350 executions)

D) select ordername, name from records where disabled=false and ordername ~<=~ $1 and domain_id=$2 and ordername is not null order by 1 using ~>~ limit 1 (64,148 executions)

E) select kind,content from domains, domainmetadata where domainmetadata.domain_id=domains.id and name=$1 (29,844 executions)

One reason for slowness when using DB backends is the DB queries themselves. B) checks whether a domain is signed and fetches the keys. Maybe this query can be reduced to signed-only zones by having the "signed or not" information also available in some other table, e.g. in the records table, or, when using the zone-cache, in the domains table, or maybe by using a join?
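
One possible shape for that idea, as a rough sketch (the `signed` column and the trigger/function names are hypothetical, not existing PowerDNS schema): maintain a signed flag on domains from cryptokeys, so that query B only needs to run for zones that actually have keys.

```sql
-- Hypothetical sketch: denormalise "zone has active keys" into domains so
-- the per-query key lookup (query B) can be skipped for unsigned zones.
ALTER TABLE domains ADD COLUMN signed boolean NOT NULL DEFAULT false;

CREATE OR REPLACE FUNCTION sync_signed_flag() RETURNS trigger AS $$
DECLARE
  zid bigint;
BEGIN
  IF TG_OP = 'DELETE' THEN
    zid := OLD.domain_id;
  ELSE
    zid := NEW.domain_id;
  END IF;
  UPDATE domains d
     SET signed = EXISTS (SELECT 1 FROM cryptokeys c
                           WHERE c.domain_id = d.id AND c.active)
   WHERE d.id = zid;
  RETURN NULL;  -- AFTER trigger: the return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER cryptokeys_signed_flag
AFTER INSERT OR UPDATE OR DELETE ON cryptokeys
FOR EACH ROW EXECUTE FUNCTION sync_signed_flag();
```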

Regarding A): maybe it would work to change the query to load the whole zone (removing the "where name=" condition) and filter out unneeded RRs in PowerDNS, while still having the whole zone in the DB query cache. Maybe also add a "sort by ordername" to get rid of query D).
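
A hedged sketch of what that modified query could look like (illustrative only; PowerDNS itself would then have to filter out the names it does not need):

```sql
-- Sketch: load the whole zone in one query instead of per-name lookups,
-- sorted by ordername so the separate ordername probe (query D) could be
-- dropped as well.
SELECT content, ttl, prio, type, domain_id, disabled::int, name, auth::int
  FROM records
 WHERE disabled = false AND domain_id = $1
 ORDER BY ordername;
```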

Regarding E): when using the zone-cache, maybe domainmetadata can be attached to the zone-cache.

I do not know if all of this is feasible, but it can be the start of some brainstorming.

ghost commented 1 year ago

From a production PoV I really like your idea @klaus-nicat, especially if you have thousands of domains and the random subdomain attacks are, well, randomly chosen, for any random domain.

Most DBs fit completely into RAM these days, especially on modern servers.

It would be nice, though, if different parts of the DB could be hot-reloaded at different intervals; ideally, reload every domain/record according to its cached TTL.

Any chance for something like this to get merged, @rgacogne?

I can't help coding though, as I suck at C++ :disappointed:

rgacogne commented 1 year ago

If someone writes the code, sure. I would recommend checking with us first, so we know that someone is working on it and can provide guidance.