palant / opensmtpd-filters

Converting DMARC aggregate reports into human-readable HTML
MIT License
17 stars 1 forks source link

Pipe is a valid character in emails #3

Open renegm opened 1 month ago

renegm commented 1 month ago

There's a problem args = payload.split('|'). When email contains '|' (and yes, its valid) it will split email address Error: TypeError: nothing() takes 4 positional arguments but 5 were given I made a crude patch:

                        if phase == 'tx-mail':
                            args = payload.split('|', 2)
                        elif phase == 'tx-rcpt':
                            args = payload.split('|', 1)
                        else:
                            args = payload.split('|')

but there are others events/reports like smtp-in|protocol-client|blabla|MAIL FROM:<from@home.local> P.S Thanks. I'm making a filter and this class is a great help.

P.S.2 Unnecessary, I guess, but here is my test filter:

from opensmtpd import FilterServer

def start():
    server = FilterServer()
    server.track_context()
    server.register_handler('report', 'tx-mail', nothing)
    server.serve_forever()

def nothing(session, _, result, address):
    return

if __name__ == '__main__':
    start()
palant commented 1 month ago

What does payload look like then? Presumably, OpenSMTPd performs no escaping or the like?

palant commented 1 month ago

Note: Pipe symbol is technically valid in email addresses but its use is discouraged.

Some MTAs will filter addresses with the pipe symbol, because it has been used in attacking email systems. Don't use it.

I guess OpenSMTPd isn’t one of those MTAs. Still quite an edge case.

renegm commented 1 month ago

I agree that it is rare and discouraged and many other things. Still... Its easy enough to manage this case instead of ignoring it:

if phase in {'link-greeting','link-tls','tx-reset','tx-begin','tx-rollback','protocol-client','protocol-server'} no split
if phase in {'link-identify','link-auth','tx-envelope','tx-data','tx-commit'} split(|,1)

etc.

renegm commented 1 month ago

I made a patch.:

        while True:
            line = self.recv()

            if (count := line.count('|')) < 5 or not line.startswith('report') or not line.startswith('filter'):
                continue

            if count == 5:
                event, version, timestamp, subsystem, phase, session = line.split('|', 5)
            else:
                event, version, timestamp, subsystem, phase, session, payload = line.split('|', 6)

            if event == 'filter':  #  filters always have a token
                token, payload = payload.split('|', 1)

            if phase in {'link-disconnect', 'timeout', 'data', 'commit'}:
                args = []
            elif phase in {'link-identify', 'link-auth', 'tx-envelope', 'tx-data', 'tx-commit'}:
                args = payload.split('|', 1)
            elif phase == 'tx-mail' and re.search(r'^0\.[1-4]$', version):
                # Older protocol versions had result and sender reversed. 
                # I keep it, but does anybody use version 4 or earlier?
                a_aux = payload.split('|', 1)
                b_aux = a_aux[1].rsplit('|', 1)
                args = [a_aux[0], b_aux[1], b_aux[0]]
            elif phase in {'tx-mail', 'tx-rcpt', 'filter-report', 'filter-response'}:
                args = payload.split('|', 2)
            elif phase in {'link-connect', 'connect'}:
                args = payload.split('|', 3)
            else:
                args = [payload]

            if event == 'report':
                self._call_handlers(noop, event, phase, session, *args)
            elif phase == 'data-line':
                self._call_handlers(noop, event, phase, session, payload, send_dataline)
            else:
                self._call_handlers(send_filter_response, event, phase, session, *args)