matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0
11.82k stars 2.13k forks source link

Synapse sends SMTP retries in quick succession if SMTP server is down #8145

Open Traumflug opened 4 years ago

Traumflug commented 4 years ago

Description

If the SMTP server doesn't work perfectly for some reason, Synapse starts a Denial-of-Service (DoS) attack on that server. This triggers spam/DoS protection like Fail2Ban or UFW 'limit' rules on the server, of course, putting the Synapse server onto a temporary or permanent blacklist (on the SMTP server).

Steps to reproduce

After doing so, one can find this in the logs, 5 "retries" within less than 0.1 seconds:

2020-07-31 15:02:14,371 - synapse.push.mailer - 301 - INFO - POST-11 - Sending email to mah@jump-ing.de
2020-07-31 15:02:14,668 - twisted - 192 - INFO -  - SMTP Client retrying server. Retry: 5
2020-07-31 15:02:14,684 - twisted - 192 - INFO -  - SMTP Client retrying server. Retry: 4
2020-07-31 15:02:14,702 - twisted - 192 - INFO -  - SMTP Client retrying server. Retry: 3
2020-07-31 15:02:14,718 - twisted - 192 - INFO -  - SMTP Client retrying server. Retry: 2
2020-07-31 15:02:14,736 - twisted - 192 - INFO -  - SMTP Client retrying server. Retry: 1

Blindly retrying to connect to a SMTP server is entirely pointless. SMTP uses TCP/IP, which is a reliable protocol (it retries on its own on errors), so there are no network errors one can work around by retrying immediately. Retrying only makes sense after looking at the error and acting sensible on that. In case one can't connect at all, it makes no sense to try again within less than a couple of minutes.

Version information

Traumflug commented 4 years ago

A workaround is to run this after installation (and after each update):

find /opt/venvs/matrix-synapse -type f -name \*.py ! -path \*/pip/\* -exec grep -q 'retries=[1-9]' {} \; -print -exec sed -i 's/retries=[1-9]/retries=0/g' {} \;
richvdh commented 4 years ago

I think this behaviour is coming from Twisted rather than synapse itself. You might consider raising an issue at https://twistedmatrix.com/trac, however:

Retrying only makes sense after looking at the error and acting sensible on that.

As far as I can tell, this behaviour does not happen when the server returns a 4xx error.

This triggers spam/DoS protection like Fail2Ban or UFW 'limit' rules on the server

I'm somewhat intrigued to know what is causing the SMTP request to fail in such a way that twisted chooses to retry the request, and yet synapse is added to a blacklist? It seems to me that Twisted will only retry the message in the case that the SMTP server fails to respond, in which case I'm surprised that it triggers blacklisting rules.

I think it's somewhat disingenuous to describe 5 connection attempts as a DoS attack.

Traumflug commented 4 years ago

Thanks for looking into the issue.

I think this behaviour is coming from Twisted rather than synapse itself.

Looking at the code, twisted defines defaults. Synapse could (and should) fix the problem by simply passing the right parameter (retries = 0). Still I think it's indeed a good idea to file an issue there as well.

As far as I can tell, this behaviour does not happen when the server returns a 4xx error.

SMTP doesn't use the HTTP protocol, so there is no such thing like a "404 error".

I think it's somewhat disingenuous to describe 5 connection attempts as a DoS attack.

It's not about dealing with DoS attacks, it's about preventing them. Which means, one has to catch them early, before processing resources are exhausted. That's the point of Fail2Ban and describes what it does: observe erratic, obviously pointless behavior and ban hosts with such behavior for a while.

richvdh commented 4 years ago

As far as I can tell, this behaviour does not happen when the server returns a 4xx error.

SMTP doesn't use the HTTP protocol, so there is no such thing like a "404 error".

Nevertheless, SMTP does have numeric error codes. Honestly I'm not sure which error codes twisted does retry for.

I think it's somewhat disingenuous to describe 5 connection attempts as a DoS attack.

It's not about dealing with DoS attacks, it's about preventing them.

My point is, you have described Synapse's behaviour as a DoS attack: "Synapse turns SMTP failures into DoS attacks". I think that is a severe exaggeration.

I'm somewhat intrigued to know what is causing the SMTP request to fail in such a way that twisted chooses to retry the request, and yet synapse is added to a blacklist?

I notice you've not replied to this. It's hard to gauge how important this actually is without some examples of scenarios in which it's an actual problem.

Traumflug commented 4 years ago

Ticket on Twisted: https://twistedmatrix.com/trac/ticket/9946

Traumflug commented 4 years ago

It's hard to gauge how important this actually is without some examples of scenarios in which it's an actual problem.

Well, I tried to describe such a scenario in the first post: install Synapse unmodified and try to use/add/modify an email address. Doing so banned me from my own server and also bans every other user dealing with an email address.

It doesn't ban if there are no DoS protection measures in place, of course, so developers might not notice.

Traumflug commented 4 years ago

If it helps a bit: Riot/Elements also has such misbehavior, it does even 8 retries in some situations. It was quite a rocky ride to get this pair of software running :-)

Traumflug commented 4 years ago

Code doing these retries is in /opt/venvs/matrix-synapse/lib/python3.7/site-packages/twisted/mail/smtp.py (installed APT package). Defaulting to 5 is in line 1867:

class SMTPSenderFactory(protocol.ClientFactory):
[...]
    def __init__(self, fromEmail, toEmail, file, deferred, retries=5,
                 timeout=None):

To my (limited) knowledge of Python, this code starting at line 1925 looks like the code retrying:

    def clientConnectionLost(self, connector, err):
        self._processConnectionError(connector, err)

    def _processConnectionError(self, connector, err):
        self.currentProtocol = None
        if (self.retries < 0) and (not self.sendFinished):
            log.msg("SMTP Client retrying server. Retry: %s" % -self.retries)

            # Rewind the file in case part of it was read while attempting to
            # send the message.
            self.file.seek(0, 0)
            connector.connect()
            self.retries += 1

I see no check against error codes coming back from the server there.

Traumflug commented 4 years ago

Caveat with manually patched files is that patches vanish on package upgrades. Here's a procedure to patch distribution files right on APT installation and upgrades. For best results, install this before installing the APT package.

All this applies to Debian/Ubuntu installations using the matrix.org package repository. Actually tested on Debian 10 buster. Doesn't work with packages coming with Debian or Debian-Backports, because Debian packages place the installation not in /opt/venvs. It shouldn't be too hard to adjust installation paths, though.

  1. Create a folder for the script:
    sudo mkdir /etc/apt/post-inst
  2. Instruct APT to run the patch script. Create a file /etc/apt/apt.conf.d/99matrix-synapse with this content:
    DPkg::Post-Invoke {"/etc/apt/post-inst/matrix-synapse.sh";};
  3. Patching script is this, create a file /etc/apt/post-inst/matrix-synapse.sh with this content:
    # This script gets called by /etc/apt/apt.conf.d/99matrix-synapse.
    [ -d /opt/venvs/matrix-synapse ] || exit 0
    echo "Applying necessary matrix-synapse patches."
    SOMETHING='false'
    grep --include \*.py -rl \
      -e 'ssl.SSL.TLSv1_METHOD' \
      -e 'retries=[1-9]' \
      /opt/venvs/matrix-synapse \
    | while read F; do
      echo "Patching ${F} ..."
      SOMETHING='true'
      cp -p "${F}" "${F}.org"
      sed -i '
        /ssl.SSL.TLSv1_METHOD/d
        s/retries=[1-9]/retries=0/g
      ' "${F}"
    done
    if [ ${SOMETHING} = 'true' ]; then
      echo "All matrix-synapse patches applied."
    else
      echo "There were no patches to apply to matrix-synapse."
    fi
  4. Make the script executable:
    sudo chmod +x /etc/apt/post-inst/matrix-synapse.sh

For those using Ansible, here's a playbook snippet to do the above by running the playbook. Place these three rules before the rule to install Synapse, this makes manual patching obsolete.

  tasks:
    - name: Create Directory for Post-APT Script
      file:
        path: /etc/apt/post-inst
        state: directory
        recurse: yes
        owner: root
        group: root
        mode: 0755
    - name: Deploy Matrix Post-APT Script
      # This script patches the distribution to work around these bugs:
      # - https://github.com/matrix-org/synapse/issues/6211
      # - https://github.com/matrix-org/synapse/issues/8145
      copy:
        dest: /etc/apt/post-inst/matrix-synapse.sh
        owner: root
        group: root
        mode: 0755
        content: |
          # This script gets called by /etc/apt/apt.conf.d/99matrix-synapse.
          [ -d /opt/venvs/matrix-synapse ] || exit 0
          echo "Applying necessary matrix-synapse patches."
          SOMETHING='false'
          grep --include \*.py -rl \
            -e 'ssl.SSL.TLSv1_METHOD' \
            -e 'retries=[1-9]' \
            /opt/venvs/matrix-synapse \
          | while read F; do
            echo "Patching ${F} ..."
            SOMETHING='true'
            cp -p "${F}" "${F}.org"
            sed -i '
              /ssl.SSL.TLSv1_METHOD/d
              s/retries=[1-9]/retries=0/g
            ' "${F}"
          done
          if [ ${SOMETHING} = 'true' ]; then
            echo "All matrix-synapse patches applied."
          else
            echo "There were no patches to apply to matrix-synapse."
          fi
    - name: Install Matrix Post-APT Script
      copy:
        dest: /etc/apt/apt.conf.d/99matrix-synapse
        owner: root
        group: root
        mode: 0644
        content: |
          DPkg::Post-Invoke {"/etc/apt/post-inst/matrix-synapse.sh";};
ShadowJonathan commented 3 years ago

Seeing as how https://github.com/matrix-org/synapse/issues/10378 turned out, shouldnt this be a twisted issue?

erikjohnston commented 3 years ago

This is sub optimal, but I think needs to be fixed in Twisted.

I don't know if we want to leave this open until then?

richvdh commented 3 years ago

I think we may as well keep it open to track it.