Open Traumflug opened 4 years ago
A workaround is to run this after installation (and after each update):
find /opt/venvs/matrix-synapse -type f -name \*.py ! -path \*/pip/\* -exec grep -q 'retries=[1-9]' {} \; -print -exec sed -i 's/retries=[1-9]/retries=0/g' {} \;
I think this behaviour is coming from Twisted rather than synapse itself. You might consider raising an issue at https://twistedmatrix.com/trac, however:
Retrying only makes sense after looking at the error and acting sensible on that.
As far as I can tell, this behaviour does not happen when the server returns a 4xx error.
This triggers spam/DoS protection like Fail2Ban or UFW 'limit' rules on the server
I'm somewhat intrigued to know what is causing the SMTP request to fail in such a way that twisted chooses to retry the request, and yet synapse is added to a blacklist? It seems to me that Twisted will only retry the message in the case that the SMTP server fails to respond, in which case I'm surprised that it triggers blacklisting rules.
I think it's somewhat disingenuous to describe 5 connection attempts as a DoS attack.
Thanks for looking into the issue.
I think this behaviour is coming from Twisted rather than synapse itself.
Looking at the code, twisted defines defaults. Synapse could (and should) fix the problem by simply passing the right parameter (retries = 0). Still I think it's indeed a good idea to file an issue there as well.
As far as I can tell, this behaviour does not happen when the server returns a 4xx error.
SMTP doesn't use the HTTP protocol, so there is no such thing like a "404 error".
I think it's somewhat disingenuous to describe 5 connection attempts as a DoS attack.
It's not about dealing with DoS attacks, it's about preventing them. Which means, one has to catch them early, before processing resources are exhausted. That's the point of Fail2Ban and describes what it does: observe erratic, obviously pointless behavior and ban hosts with such behavior for a while.
As far as I can tell, this behaviour does not happen when the server returns a 4xx error.
SMTP doesn't use the HTTP protocol, so there is no such thing like a "404 error".
Nevertheless, SMTP does have numeric error codes. Honestly I'm not sure which error codes twisted does retry for.
I think it's somewhat disingenuous to describe 5 connection attempts as a DoS attack.
It's not about dealing with DoS attacks, it's about preventing them.
My point is, you have described Synapse's behaviour as a DoS attack: "Synapse turns SMTP failures into DoS attacks". I think that is a severe exaggeration.
I'm somewhat intrigued to know what is causing the SMTP request to fail in such a way that twisted chooses to retry the request, and yet synapse is added to a blacklist?
I notice you've not replied to this. It's hard to gauge how important this actually is without some examples of scenarios in which it's an actual problem.
Ticket on Twisted: https://twistedmatrix.com/trac/ticket/9946
It's hard to gauge how important this actually is without some examples of scenarios in which it's an actual problem.
Well, I tried to describe such a scenario in the first post: install Synapse unmodified and try to use/add/modify an email address. Doing so banned me from my own server and also bans every other user dealing with an email address.
It doesn't ban if there are no DoS protection measures in place, of course, so developers might not notice.
If it helps a bit: Riot/Elements also has such misbehavior, it does even 8 retries in some situations. It was quite a rocky ride to get this pair of software running :-)
Code doing these retries is in /opt/venvs/matrix-synapse/lib/python3.7/site-packages/twisted/mail/smtp.py (installed APT package). Defaulting to 5 is in line 1867:
class SMTPSenderFactory(protocol.ClientFactory):
[...]
def __init__(self, fromEmail, toEmail, file, deferred, retries=5,
timeout=None):
To my (limited) knowledge of Python, this code starting at line 1925 looks like the code retrying:
def clientConnectionLost(self, connector, err):
self._processConnectionError(connector, err)
def _processConnectionError(self, connector, err):
self.currentProtocol = None
if (self.retries < 0) and (not self.sendFinished):
log.msg("SMTP Client retrying server. Retry: %s" % -self.retries)
# Rewind the file in case part of it was read while attempting to
# send the message.
self.file.seek(0, 0)
connector.connect()
self.retries += 1
I see no check against error codes coming back from the server there.
Caveat with manually patched files is that patches vanish on package upgrades. Here's a procedure to patch distribution files right on APT installation and upgrades. For best results, install this before installing the APT package.
All this applies to Debian/Ubuntu installations using the matrix.org package repository. Actually tested on Debian 10 buster. Doesn't work with packages coming with Debian or Debian-Backports, because Debian packages place the installation not in /opt/venvs
. It shouldn't be too hard to adjust installation paths, though.
sudo mkdir /etc/apt/post-inst
/etc/apt/apt.conf.d/99matrix-synapse
with this content:
DPkg::Post-Invoke {"/etc/apt/post-inst/matrix-synapse.sh";};
/etc/apt/post-inst/matrix-synapse.sh
with this content:
# This script gets called by /etc/apt/apt.conf.d/99matrix-synapse.
[ -d /opt/venvs/matrix-synapse ] || exit 0
echo "Applying necessary matrix-synapse patches."
SOMETHING='false'
grep --include \*.py -rl \
-e 'ssl.SSL.TLSv1_METHOD' \
-e 'retries=[1-9]' \
/opt/venvs/matrix-synapse \
| while read F; do
echo "Patching ${F} ..."
SOMETHING='true'
cp -p "${F}" "${F}.org"
sed -i '
/ssl.SSL.TLSv1_METHOD/d
s/retries=[1-9]/retries=0/g
' "${F}"
done
if [ ${SOMETHING} = 'true' ]; then
echo "All matrix-synapse patches applied."
else
echo "There were no patches to apply to matrix-synapse."
fi
sudo chmod +x /etc/apt/post-inst/matrix-synapse.sh
For those using Ansible, here's a playbook snippet to do the above by running the playbook. Place these three rules before the rule to install Synapse, this makes manual patching obsolete.
tasks:
- name: Create Directory for Post-APT Script
file:
path: /etc/apt/post-inst
state: directory
recurse: yes
owner: root
group: root
mode: 0755
- name: Deploy Matrix Post-APT Script
# This script patches the distribution to work around these bugs:
# - https://github.com/matrix-org/synapse/issues/6211
# - https://github.com/matrix-org/synapse/issues/8145
copy:
dest: /etc/apt/post-inst/matrix-synapse.sh
owner: root
group: root
mode: 0755
content: |
# This script gets called by /etc/apt/apt.conf.d/99matrix-synapse.
[ -d /opt/venvs/matrix-synapse ] || exit 0
echo "Applying necessary matrix-synapse patches."
SOMETHING='false'
grep --include \*.py -rl \
-e 'ssl.SSL.TLSv1_METHOD' \
-e 'retries=[1-9]' \
/opt/venvs/matrix-synapse \
| while read F; do
echo "Patching ${F} ..."
SOMETHING='true'
cp -p "${F}" "${F}.org"
sed -i '
/ssl.SSL.TLSv1_METHOD/d
s/retries=[1-9]/retries=0/g
' "${F}"
done
if [ ${SOMETHING} = 'true' ]; then
echo "All matrix-synapse patches applied."
else
echo "There were no patches to apply to matrix-synapse."
fi
- name: Install Matrix Post-APT Script
copy:
dest: /etc/apt/apt.conf.d/99matrix-synapse
owner: root
group: root
mode: 0644
content: |
DPkg::Post-Invoke {"/etc/apt/post-inst/matrix-synapse.sh";};
Seeing as how https://github.com/matrix-org/synapse/issues/10378 turned out, shouldnt this be a twisted issue?
This is sub optimal, but I think needs to be fixed in Twisted.
I don't know if we want to leave this open until then?
I think we may as well keep it open to track it.
Description
If the SMTP server doesn't work perfectly for some reason, Synapse starts a Denial-of-Service (DoS) attack on that server. This triggers spam/DoS protection like Fail2Ban or UFW 'limit' rules on the server, of course, putting the Synapse server onto a temporary or permanent blacklist (on the SMTP server).
Steps to reproduce
After doing so, one can find this in the logs, 5 "retries" within less than 0.1 seconds:
Blindly retrying to connect to a SMTP server is entirely pointless. SMTP uses TCP/IP, which is a reliable protocol (it retries on its own on errors), so there are no network errors one can work around by retrying immediately. Retrying only makes sense after looking at the error and acting sensible on that. In case one can't connect at all, it makes no sense to try again within less than a couple of minutes.
Version information
Homeserver: localhost
Version:
Install method:
apt-get install matrix-synapse-py3
with matrix.org's repository (https://packages.matrix.org/debian/) installed.Platform: Debian 10.5 buster on a virtual machine (in order to test before rolling out to a public server).