PowerDNS / pdns

PowerDNS Authoritative, PowerDNS Recursor, dnsdist
https://www.powerdns.com/
GNU General Public License v2.0
Excessive disk writes while using LMDB backend #13024

Open aldem opened 1 year ago

aldem commented 1 year ago

Short description

While checking slave zones (LMDB backend), pdns produces disk write activity with a volume several times larger than the database size.

In my database there are ~250K domains and ~2M records, so a full check of the zones can take time. However, it is unexpected that checking the zones produces so many disk writes, even taking into account that the check timestamp is updated - the whole database occupies only ~245M, while the amount of data written in a single zone check cycle is ~6G - this is 24 times (!) more.
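To put these numbers in perspective, here is a rough back-of-the-envelope calculation. It assumes that "M" and "G" mean MiB and GiB and that the database uses 4 KiB pages; neither is stated above. With binary units the ratio comes out at about 25, in the same ballpark as the 24 quoted, and the per-zone figure is about 25 KiB, i.e. roughly six pages per checked zone. That would be consistent with a small fixed number of pages being rewritten for every per-zone update, though this is only an estimate.

```python
# Rough estimate of write amplification from the figures in this report.
# Assumptions (not stated in the report): M/G are MiB/GiB, pages are 4 KiB.
MIB = 1024 ** 2
GIB = 1024 ** 3
PAGE = 4096  # assumed page size in bytes


def write_amplification(written_bytes, db_bytes):
    """Ratio of bytes written in one check cycle to the database size."""
    return written_bytes / db_bytes


def bytes_per_zone(written_bytes, zones):
    """Average number of bytes written per checked zone."""
    return written_bytes / zones


written = 6 * GIB     # data written in one zone check cycle
db_size = 245 * MIB   # size of the whole database
zones = 250_000       # number of domains

amp = write_amplification(written, db_size)
per_zone = bytes_per_zone(written, zones)
pages_per_zone = per_zone / PAGE

print(f"amplification: {amp:.1f}x")
print(f"per zone: {per_zone / 1024:.1f} KiB (~{pages_per_zone:.1f} pages)")
```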

The server is idling (no other activity), so this behaviour cannot be attributed to something else. The zone check results in ca. 70 zones being transferred due to updated serials - and those are quite small zones anyway.

This is how it looks in netdata:

[image: netdata screenshot]

This has happened several times already, so it is not a one-time event - and every time the log file shows just a few (<100) zone AXFRs, nothing more. Depending on the sync mode the write rate can be around 25 MiB/s, in which case the cycle takes less time, but the total volume stays the same.

Environment

Steps to reproduce

  1. Install an auth server with a similar number of domains and records (250K & ~2M).
  2. Let it idle until the zone check cycle.

Expected behaviour

I would expect zone updates to produce only the disk writes that are really necessary, not gigabytes of them when the database itself is only about a quarter of a gigabyte.

Actual behaviour

Disk writes are excessive for no apparent reason.

Other information

I am not sure whether this is a problem (if it is one at all) in pdns itself (transaction/open modes etc.) or in LMDB, but in any case there should be a way around it.

LMDB is the fastest backend and it would be a real pity to move away from it only because of this behaviour. With the sqlite3 backend I didn't see such behaviour.

mind04 commented 1 year ago

#12888 is likely related

aldem commented 1 year ago

Short update: in nosync mode there are no excessive writes, but this is a bad idea as it could lead to database corruption.

My guess is that it happens because pdns wraps every single operation (like updating the last check time) in its own transaction, and every transaction has significant write amplification.

The way around this might be to aggregate transactions, either by time or by number of operations - this way, in case of failure we could lose some changes but at least the database should not be corrupted, and I suspect that recovering from corruption is far worse than repeating unapplied transactions.
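To illustrate the idea, here is a minimal sketch of such aggregation. It is not pdns code: `TransactionBatcher`, its parameters and the `("set_last_check", zone_id)` operation are hypothetical names invented for this example, and `commit` stands in for whatever would write one batch to the backend in a single transaction. Pending operations are flushed once a count limit or an age limit is reached.

```python
import time


class TransactionBatcher:
    """Buffer small updates and commit them together (hypothetical helper)."""

    def __init__(self, commit, max_ops=1000, max_age=5.0, clock=time.monotonic):
        self._commit = commit      # callable that receives a list of operations
        self._max_ops = max_ops    # flush after this many pending operations
        self._max_age = max_age    # flush once the oldest one is this old (s)
        self._clock = clock
        self._pending = []
        self._first_at = None      # time the oldest pending operation arrived

    def add(self, op):
        """Queue one operation and flush if either limit has been reached."""
        if not self._pending:
            self._first_at = self._clock()
        self._pending.append(op)
        too_many = len(self._pending) >= self._max_ops
        too_old = self._clock() - self._first_at >= self._max_age
        if too_many or too_old:
            self.flush()

    def flush(self):
        """Commit everything that is pending as a single batch."""
        if self._pending:
            batch, self._pending = self._pending, []
            self._first_at = None
            self._commit(batch)


# Usage: 250 timestamp updates end up in 3 commits instead of 250.
commits = []
batcher = TransactionBatcher(commits.append, max_ops=100, max_age=60.0)
for zone_id in range(250):
    batcher.add(("set_last_check", zone_id))
batcher.flush()  # commit the remainder, e.g. at the end of the check cycle
print(len(commits), [len(batch) for batch in commits])
```

In this sketch the age limit is only checked when a new operation arrives, so a real implementation would also need a timer or an explicit flush at the end of the cycle. Anything still buffered at the time of a crash is lost, which is the trade-off described above.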

Other backends that support transactions could also benefit from aggregation.