TeamNewPipe / CrashReportImporter

NewPipe Crash Report Importer for Sentry

Stream processing instead of IMAP fetching #5

Closed TheAssassin closed 1 year ago

TheAssassin commented 4 years ago

The way this importer works is very expensive. The mail server has to put mails into a mailbox, which is then queried via IMAP. This not only takes time but also generates high load on the mail host and involves saving many copies of the data.

This importer should ideally do some sort of "stream processing", that is, have mails forwarded from the MTA directly, bypassing the MDA. The usual checks, e.g., for spam, should still be run.

This could be implemented either by using some sort of "pipe alias", as some MTAs provide (i.e., calling a script whenever a mail arrives), or, maybe even easier, by having this project host its own SMTP service and accept mails that way. This is also how many services are integrated into mail pipelines (they accept mails via SMTP and return them to the main MTA through a special SMTP endpoint).

Going for our own SMTP service also makes integration a lot easier, as no mailbox is required anymore: mails can simply be spam-checked and then forwarded to a special, non-public address.
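
For illustration, the "pipe alias" variant could look roughly like the sketch below: the MTA runs a script for every incoming mail and passes the raw message on standard input. The function names `summarize` and `read_report` and the returned dictionary layout are made up for this example and are not taken from the importer; only the Python standard library is used.

```python
import sys
from email import message_from_binary_file, policy


def summarize(message):
    """Extract the few fields a crash report importer would care about."""
    # get_body() returns the best matching text part, or None if there is none
    body = message.get_body(preferencelist=("plain",))
    text = body.get_content() if body is not None else ""
    return {
        "subject": str(message.get("Subject", "")),
        "sender": str(message.get("From", "")),
        "body": text,
    }


def read_report(stream=None):
    """Parse one raw mail from a binary stream (stdin when called by the MTA)."""
    if stream is None:
        stream = sys.stdin.buffer
    message = message_from_binary_file(stream, policy=policy.default)
    return summarize(message)
```

A real pipe script would call `read_report()` from its entry point and then hand the result to whatever submits the report to Sentry; that step is left out here on purpose.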

TheAssassin commented 4 years ago

I've recently been working with asyncio. It's pretty fast and easy to program. Also, there's support for testing, e.g., in pytest (through a plugin).

Nowadays there are also libraries for almost anything. I just found an SMTP server library: https://aiosmtpd.readthedocs.io/en/latest/README.html. It even supports LMTP, which is probably what we'd want to use here.
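
A minimal sketch of what receiving mails over LMTP with aiosmtpd might look like, assuming the library is installed. The handler name `CrashReportHandler`, the in-memory `received` list, and the host/port defaults are placeholders for this example, not the importer's actual code. The aiosmtpd imports are done lazily so the handler can be exercised on its own.

```python
from email import message_from_bytes, policy


class CrashReportHandler:
    """aiosmtpd-style handler; aiosmtpd calls handle_DATA once per message."""

    def __init__(self):
        # Hypothetical sink: a real importer would forward to Sentry instead
        self.received = []

    async def handle_DATA(self, server, session, envelope):
        # envelope.content holds the raw message bytes by default
        message = message_from_bytes(envelope.content, policy=policy.default)
        subject = str(message.get("Subject", ""))
        self.received.append((list(envelope.rcpt_tos), subject))
        return "250 Message accepted for delivery"


def start_lmtp_server(handler, hostname="127.0.0.1", port=8025):
    """Start an LMTP listener in a background thread (needs aiosmtpd)."""
    from aiosmtpd.controller import Controller
    from aiosmtpd.lmtp import LMTP

    class LMTPController(Controller):
        # Swap the default SMTP protocol class for the LMTP one
        def factory(self):
            return LMTP(self.handler)

    controller = LMTPController(handler, hostname=hostname, port=port)
    controller.start()
    return controller  # call controller.stop() to shut down
```

The main MTA would then be configured to deliver the crash report address to this LMTP endpoint instead of a mailbox; how that is done depends on the MTA in use.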

There's also a client library: https://aiosmtplib.readthedocs.io/en/latest/overview.html.
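
If mails had to be handed back to the main MTA after processing, the client side could be sketched like this, assuming aiosmtplib is installed. `INTERNAL_RECIPIENT`, the port `10025`, and the function names are hypothetical values chosen for the example.

```python
from email import message_from_bytes, policy

# Hypothetical non-public address the main MTA accepts on its re-injection port
INTERNAL_RECIPIENT = "crashreports@internal.example"


def parse_mail(raw):
    """Parse raw message bytes into an EmailMessage."""
    return message_from_bytes(raw, policy=policy.default)


async def forward_to_mta(raw, sender, hostname="127.0.0.1", port=10025):
    """Send the mail back to the main MTA via SMTP (needs aiosmtplib)."""
    import aiosmtplib  # third-party; imported lazily to keep parsing usable alone

    message = parse_mail(raw)
    # Explicit sender/recipients override the addresses found in the headers
    await aiosmtplib.send(
        message,
        sender=sender,
        recipients=[INTERNAL_RECIPIENT],
        hostname=hostname,
        port=port,
    )
```

Since `forward_to_mta` is a coroutine, it can be awaited directly from an asyncio-based handler without blocking the receiving side.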

We'd have to see if integrating spamassassin & co. is easier in the main MTA pipeline or if we just integrate it in our own input handling.

TheAssassin commented 1 year ago

This was fixed quite a while ago. The importer receives mails via LMTP now.