sys4 / tlsrpt

A set of libraries and tools to implement TLSRPT reporting into an MTA and to generate and submit TLSRPT reports.

Performance/reliability considerations for TLSRPT internal storage #2

Open wietse-postfix opened 8 months ago

wietse-postfix commented 8 months ago

This note is based on "TLSRPT for MTAs" Version 0.01. I summarize my understanding of the global architecture, present ball-park performance numbers, and make suggestions for the internal storage.

Overall architecture

Performance and reliability considerations

A high-performance MTA such as Postfix manages multiple concurrent SMTP connections (up to 100 by default). Each SMTP protocol engine and associated TLS engine are managed by one SMTP client process. Updates through the TLSRPT client library will therefore be made concurrently.

Depending on destinations and configuration, one can expect that a typical Postfix MTA will max out at ~300 outbound connections/second. That figure dates from 2012, when TLS was not as universal as it is now (STARTTLS adds ~three TCP round-trip times) and when computers and networks were a bit slower (but not by a lot). See Viktor Dukhovni's post in https://groups.google.com/g/mailing.postfix.users/c/pPcRJFJmdeA

The C client library does not guarantee that a status update will reach a TLSRPT receiver. A status that cannot be sent will be dropped without blocking progress in the MTA. It is therefore OK if the persistence layer cannot accept every status update; however, it should not lose updates under foreseeable loads.
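A minimal sketch of this drop-rather-than-block behaviour, written in Python for brevity (the actual client library is C); the socket path and payload format are placeholders, not the project's protocol:

```python
import errno
import socket

# Illustrative only: send one status datagram to the collector without
# ever blocking the caller; on failure the update is simply dropped.
SOCKET_PATH = "/var/run/tlsrpt/collector.sock"   # hypothetical path

sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
sock.setblocking(False)

def send_status(payload: bytes) -> bool:
    """Return False (update dropped) instead of blocking the MTA."""
    try:
        sock.sendto(payload, SOCKET_PATH)
        return True
    except OSError as e:
        if e.errno in (errno.EAGAIN, errno.EWOULDBLOCK, errno.ENOBUFS,
                       errno.ECONNREFUSED, errno.ENOENT):
            return False   # receiver absent or overloaded: drop, don't block
        raise
```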

The design considers using SQLite for storage. By default the SQLite update latency is measured in hundreds of milliseconds, i.e. roughly 10 updates/second, whereas a single Postfix instance needs up to ~300 updates/second. Part of this latency is caused by SQLite invoking fsync() for every update. These fsync() calls would not just slow down SQLite; they would also hurt MTA performance, especially when a message has multiple SMTP destinations. Postfix is careful to call fsync() only once during the entire lifetime of a message; I had to convince Linux distributions NOT to fsync() the maillog file after every record, because their syslogd daemon was consuming more resources than all Postfix processes combined.

The SQLite update latency can be reduced by 'batching' database updates in a write-ahead log (for example, PRAGMA journal_mode = WAL; PRAGMA wal_autocheckpoint = 0; PRAGMA synchronous = NORMAL;), but then the write-ahead log must be flushed periodically, either explicitly or by turning wal_autocheckpoint back on. For examples, see https://stackoverflow.com/questions/21590824/sqlite-updating-one-record-is-very-relatively-slow
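As an illustration of those pragmas, the following sketch applies them to a Python sqlite3 connection and flushes the write-ahead log from a periodic task instead of on every commit (the database file name is a placeholder):

```python
import sqlite3

# Sketch only: the WAL-related pragmas mentioned above, applied once at
# connection setup.  fsync() then happens at checkpoint time, not per commit.
con = sqlite3.connect("tlsrpt.sqlite")
con.execute("PRAGMA journal_mode = WAL")
con.execute("PRAGMA wal_autocheckpoint = 0")   # no automatic checkpoints
con.execute("PRAGMA synchronous = NORMAL")

def flush_wal(con: sqlite3.Connection) -> None:
    """Fold the accumulated WAL back into the database file.
    Call this from a periodic timer rather than after every update."""
    con.execute("PRAGMA wal_checkpoint(TRUNCATE)")
```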

Observations and suggestions

Background

BLohner commented 7 months ago

Regarding: Performance and reliability considerations

Preliminary tests have shown that commit operations are expensive in SQLite. Committing only after every hundred upsert operations yielded over 4,500 records per second in a single-threaded daemon on an idle system, and still over 1,500 records per second on a system under load. The tests were done with a loop of 10,000 records.
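For reference, a rough reconstruction of that experiment (not the actual test code; schema and column names are made up):

```python
import sqlite3
import time

# Sketch: 10,000 upserts with one commit per 100 records, mirroring the
# batching experiment described above.
con = sqlite3.connect("bench.sqlite")
con.execute("CREATE TABLE IF NOT EXISTS counters (domain TEXT PRIMARY KEY, cnt INTEGER)")

BATCH = 100
start = time.monotonic()
for i in range(10_000):
    con.execute(
        "INSERT INTO counters (domain, cnt) VALUES (?, 1) "
        "ON CONFLICT(domain) DO UPDATE SET cnt = cnt + 1",
        (f"example{i % 500}.org",),
    )
    if (i + 1) % BATCH == 0:
        con.commit()            # one fsync() per 100 records, not per record
con.commit()
print(f"{10_000 / (time.monotonic() - start):.0f} records/s")
```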

As we do not need transactional safety for every single record, the expected load should not pose a problem.

We envision two tuneable parameters on the daemon side:

- the maximum number of records between commits (e.g. every 100 records)
- the maximum time between commits (e.g. every five seconds)

These configuration parameters will not be exclusive but will act in combination. For example, on a system that commits every 100 records and every five seconds: if 99 records remain uncommitted after a mail burst because no further mail arrives to trigger the hundredth record, the data will still be saved to disk after at most five seconds, when the timed commit kicks in.
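A sketch of how the two limits could work in combination (the names max_uncommitted and max_commit_delay are illustrative, not the daemon's actual option names):

```python
import time

class CommitPolicy:
    """Commit when either limit is reached: N pending records or T seconds."""

    def __init__(self, con, max_uncommitted: int = 100, max_commit_delay: float = 5.0):
        self.con = con
        self.max_uncommitted = max_uncommitted
        self.max_commit_delay = max_commit_delay
        self.pending = 0
        self.last_commit = time.monotonic()

    def record_written(self) -> None:
        """Call after each upsert."""
        self.pending += 1
        if self.pending >= self.max_uncommitted:
            self.flush()

    def tick(self) -> None:
        """Call from a periodic timer to enforce the time-based limit."""
        if self.pending and time.monotonic() - self.last_commit >= self.max_commit_delay:
            self.flush()

    def flush(self) -> None:
        self.con.commit()
        self.pending = 0
        self.last_commit = time.monotonic()
```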

wietse-postfix commented 7 months ago

In my comments below I assume batches with up to 100 updates, and an MTA sending 1500 updates/s.

I suppose that the simulation involved a loop around blocking database update calls.

In some TLSRPT designs, the MTA sends datagrams to the TLSRPT receiver, so that the MTA will not be blocked by the flow control that is part of a connection-oriented protocol.

Perhaps the TLSRPT receiver implementation can use distinct threads for flushing the database and for receiving updates from the MTA, so that the receiver won't miss too many updates during the database flush every 1/15th second?
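One way to sketch that split, assuming a UNIX datagram socket and an in-process queue (socket path, queue size, and flush interval are made-up values, not the project's):

```python
import queue
import socket
import threading
import time

SOCKET_PATH = "/var/run/tlsrpt/collector.sock"   # hypothetical path
FLUSH_INTERVAL = 1 / 15                          # ~100 updates per flush at 1500/s

updates: "queue.Queue[bytes]" = queue.Queue(maxsize=10_000)

def receive_loop(sock: socket.socket) -> None:
    """Keep draining the socket; never touches the database."""
    while True:
        data, _ = sock.recvfrom(65536)
        try:
            updates.put_nowait(data)
        except queue.Full:
            pass                    # overload: drop, as the design permits

def flush_loop(commit_batch) -> None:
    """Hand queued updates to the (blocking) storage layer periodically."""
    while True:
        time.sleep(FLUSH_INTERVAL)
        batch = []
        while True:
            try:
                batch.append(updates.get_nowait())
            except queue.Empty:
                break
        if batch:
            commit_batch(batch)     # e.g. upserts followed by a single commit

def main(commit_batch) -> None:
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    sock.bind(SOCKET_PATH)
    threading.Thread(target=receive_loop, args=(sock,), daemon=True).start()
    flush_loop(commit_batch)
```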

Unlike an update-generating loop that blocks while the database flushes its buffers, the MTA's updates will arrive stochastically in time. If a single-threaded receiver can handle 1500 updates/s in a blocking flow, then I expect that it will start to miss updates above 500/s with a stochastic flow. What happens in the real world will depend on kernel buffer capacity.
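For what it's worth, a small sketch of checking one relevant Linux knob, assuming the collector listens on an AF_UNIX SOCK_DGRAM socket (whether this is the limiting buffer in this design is an assumption, not established here):

```python
# Sketch only: on Linux, the number of datagrams that may queue on an
# AF_UNIX SOCK_DGRAM socket is bounded by this sysctl.
with open("/proc/sys/net/unix/max_dgram_qlen") as f:
    print("max queued datagrams per AF_UNIX datagram socket:", f.read().strip())
```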