matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0

Explore running Synapse on PyPy #8888

Open callahad opened 3 years ago

callahad commented 3 years ago

Really curious if PyPy might give us greater scaling headroom.

EMS is willing to roll out staging instances based on PyPy for our testing if we supply them with a suitable Docker image. (cc: @jaywink)

callahad commented 3 years ago

Looks like we'd need to resolve #5054, which should be a trivial fix.

callahad commented 3 years ago

There are at least a handful of resolved issues and pull requests from folks claiming to be happily running Synapse on PyPy. Presuming it works, the greatest value might come from actually measuring the CPU/RAM delta between PyPy and CPython for homeservers of various size.
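One rough way to take that measurement is to run the same workload under both interpreters and compare CPU time and peak resident memory. The sketch below is only an illustration: `measure` and `sample_workload` are hypothetical names, the JSON round-trip is a stand-in rather than a real Synapse workload, and the `resource` module is only available on Unix-like systems.

```python
import json
import platform
import resource
import sys
import time


def measure(workload, *args):
    """Run `workload` once and report CPU seconds and peak RSS for this process."""
    cpu_start = time.process_time()
    result = workload(*args)
    cpu_seconds = time.process_time() - cpu_start
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    peak_kib = peak // 1024 if sys.platform == "darwin" else peak
    return {
        "implementation": platform.python_implementation(),
        "cpu_seconds": cpu_seconds,
        "peak_rss_kib": peak_kib,
        "result": result,
    }


def sample_workload(n):
    """Hypothetical stand-in workload: round-trip n event-like dicts through JSON."""
    events = [
        {"event_id": f"${i}", "depth": i, "content": {"body": "x" * 32}}
        for i in range(n)
    ]
    return len(json.loads(json.dumps(events)))


if __name__ == "__main__":
    print(measure(sample_workload, 100_000))
```

Running the same script once with `python3` and once with `pypy3` gives two comparable dictionaries. A single short run understates PyPy, since the JIT needs warm-up time, so a realistic comparison would use a long-running homeserver rather than a microbenchmark like this.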

Absolucy commented 3 years ago

@callahad As someone interested in running an instance with PyPy, would you be willing to link these issues/PRs in the main PR body?

clokep commented 3 years ago

#3098, #2760, #3462, #7536 are the ones I easily came across.

callahad commented 3 years ago

Specifically, the person who submitted Pull Request #2760 noted:

I ran my production instance (something like 20 active users I'd estimate, and in a couple big rooms too) for two days on PyPy 5.6.0 with this, without a hitch

That was two years ago, but it's nonetheless very encouraging.

Absolucy commented 3 years ago

I modified the Dockerfile to make it successfully run Synapse with Postgres and PyPy 3:

ARG PYTHON_VERSION=3.7

###
### Stage 0: builder
###
FROM docker.io/pypy:${PYTHON_VERSION}-slim as builder

# install the OS build deps
RUN apt-get update && apt-get install -y \
        curl \
        patch \
        sed \
        build-essential \
        libffi-dev \
        libjpeg-dev \
        libpq-dev \
        libssl-dev \
        libwebp-dev \
        libxml++2.6-dev \
        libxslt1-dev \
        zlib1g-dev \
        && rm -rf /var/lib/apt/lists/*

RUN pypy3 -m ensurepip
RUN pypy3 -m pip install -U pip wheel

# Build dependencies that are not available as wheels, to speed up rebuilds
RUN pypy3 -m pip install --prefix="/install" --no-warn-script-location \
        "jsonschema>=2.5.1" \
        "frozendict>=1" \
        "unpaddedbase64>=1.1.0" \
        "canonicaljson>=1.4.0" \
        "signedjson>=1.1.0" \
        "pynacl>=1.2.1" \
        "idna>=2.5" \
        "service_identity>=18.1.0" \
        "Twisted>=18.9.0" \
        "treq>=15.1" \
        "pyopenssl>=16.0.0" \
        "pyyaml>=3.11" \
        "pyasn1>=0.1.9" \
        "pyasn1-modules>=0.0.7" \
        "bcrypt>=3.1.0" \
        "pillow>=4.3.0" \
        "sortedcontainers>=1.4.4" \
        "pymacaroons>=0.13.0" \
        "msgpack>=0.5.2" \
        "phonenumbers>=8.2.0" \
        "prometheus_client>=0.4.0" \
        "attrs>=19.1.0" \
        "netaddr>=0.7.18" \
        "Jinja2>=2.9" \
        "bleach>=1.4.3" \
        "typing-extensions>=3.7.4" \
        "psycopg2cffi>=2.7" \
        "lxml>=3.5.0" \
        "pyjwt>=1.6.4" \
        "hiredis" \
        "txredisapi>=1.4.7" \
        "authlib>=0.14.0" \
        "pysaml2>=4.5.0"

# now install synapse and all of the python deps to /install.
COPY synapse /synapse/synapse/
COPY scripts /synapse/scripts/
COPY MANIFEST.in README.rst setup.py synctl /synapse/

RUN sed -i 's/psycopg2/psycopg2cffi/' /synapse/synapse/python_dependencies.py
RUN sed -i 's/if sys.version_info >= (3, 7):/if False:/' /synapse/synapse/app/_base.py

RUN pypy3 -m pip install --prefix="/install" --no-warn-script-location \
        /synapse[all]

RUN cd /install/site-packages && curl https://patch-diff.githubusercontent.com/raw/chtd/psycopg2cffi/pull/98.patch | patch -p1

RUN printf 'from psycopg2cffi import compat \ncompat.register()' > /install/site-packages/psycopg2.py

###
### Stage 1: runtime
###

FROM docker.io/pypy:${PYTHON_VERSION}-slim

RUN apt-get update && \
        apt-get upgrade -y && \
        apt-get install -y \
        curl \
        gosu \
        libjpeg62-turbo \
        libpq5 \
        libwebp6 \
        xmlsec1 && \
        rm -rf /var/lib/apt/lists/*

RUN ln -sf /opt/pypy/bin/pypy3 /bin/python
COPY --from=builder /install /usr/local
COPY --from=builder /install/site-packages /opt/pypy/site-packages
COPY ./docker/start.py /start.py
COPY ./docker/conf /conf

VOLUME ["/data"]

EXPOSE 8008/tcp 8009/tcp 8448/tcp

ENTRYPOINT ["pypy3", "/start.py"]

HEALTHCHECK --interval=1m --timeout=5s \
        CMD curl -fSs http://localhost:8008/health || exit 1
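
The last builder step in the Dockerfile above writes a small `psycopg2.py` shim that calls `psycopg2cffi.compat.register()`. As far as I can tell, that call simply places the `psycopg2cffi` module in `sys.modules` under the name `psycopg2`, so that code importing `psycopg2` gets the cffi implementation. The sketch below shows that aliasing mechanism in isolation. The helper `register_alias` and the alias name `psycopg2_demo_alias` are hypothetical, and the standard-library `json` module stands in for `psycopg2cffi` so the example runs without any database driver installed.

```python
import importlib
import sys


def register_alias(alias, target):
    """Make `import <alias>` resolve to the module named `target`."""
    module = importlib.import_module(target)
    # The import system consults sys.modules first, so later imports of
    # `alias` return this module object without searching the filesystem.
    sys.modules[alias] = module
    return module


# Stand-in demonstration: alias the stdlib json module under a made-up name.
register_alias("psycopg2_demo_alias", "json")

import psycopg2_demo_alias  # noqa: E402  (resolved from sys.modules)

print(psycopg2_demo_alias.dumps({"aliased": True}))
```

The alias only affects imports that happen after registration, which is presumably why the Dockerfile installs the shim as a real `psycopg2.py` file: it is picked up whenever anything first imports `psycopg2`.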

ShadowJonathan commented 3 years ago

Note: psycopg2cffi had its last release in 2018: https://pypi.org/project/psycopg2cffi/

Should we really be using that package for this?

callahad commented 3 years ago

It's had quite a few commits since its last release, but according to the author two months ago:

The project is more dead than alive - I can merge simple PR and make releases, but not really anything more complicated. I'd be happy to redirect to supported fork or transfer ownership.

There is a sliiiiightly more current fork at https://github.com/Omegapol/psycopg2cffi.

But I wouldn't be especially concerned about it. It's not an especially huge module, and libpq does the heavy lifting.

Plus, what alternative do we have?

ShadowJonathan commented 3 years ago

Yeah, I got that distinction when I started digging a little more. I'm just noting it because it might become a compatibility issue down the line :sweat_smile:

intelfx commented 3 years ago

I just gave the idea another try, now with a bit more time on my hands than last time.

TL;DR (primarily for other pioneers):

With this done, it seems to work \o/... at least for now

ShadowJonathan commented 3 years ago

Something interesting: psycopg2cffi seems to have released version 2.9 yesterday, on top of an existing 2.8 release. It is interesting that there is still activity in this repo, as I thought there wasn't any at all.

Though some (read: a lot of) testing needs to be done to ensure that psycopg2cffi>=2.8 will work correctly with a PyPy release.

intelfx commented 3 years ago

Something interesting, psycopg2cffi seems to have brought update 2.9 yesterday, while it already had a 2.8 version, it is interesting there is still activity in this repo, as i thought there wasn't at all.

Though some (read: a lot of) testing needs to be done to assure that psycopg2cffi>=2.8 will work correctly with a pypy release.

Nothing interesting, completely expected :) I asked the acting maintainer to do a release after I contributed an update to synchronise the API with upstream psycopg2.

Synapse 1.26.0 should be compatible with psycopg2cffi 2.9. Still need to apply a few fixes to Synapse proper; sadly I did not submit them in time for 1.26.0.

ShadowJonathan commented 3 years ago

@intelfx do you have a PR which references those fixes somewhere?

intelfx commented 3 years ago

No. Actually let me just do it right now.

ShadowJonathan commented 3 years ago

https://github.com/pallets/markupsafe/issues/176 is an issue for this. It affects jinja2.escape (which is implicitly used in a lot of templates), mangling the URLs or other strings that are used in there. I've noted in #9123 how to potentially deal with this issue, but for now it's not really fixable.

ShadowJonathan commented 3 years ago

C-API interfacing on PyPy is incredibly slow (also see this). After #9123 is merged, I'll bring in pg8000 as a new database adapter, to speed up PyPy's interfacing with Postgres.

Unfortunately no pure-python sqlite adapter exists, so pypy with sqlite will be incredibly slow :(

ShadowJonathan commented 2 years ago

I summarised the difficulties I encountered with adding a database adapter for PyPy-powered Synapse in this issue: https://github.com/matrix-org/synapse/issues/11756

ShadowJonathan commented 2 years ago

That issue has been closed; psycopg3 has a pure-Python option. This could work great for PyPy deployments, since pure-Python code can take advantage of the JIT and should run (almost) as fast as a native-code option.

In other words, using psycopg3 would make PyPy a viable option.
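
As a rough illustration of what opting into that pure-Python implementation could look like: psycopg 3 documents a `PSYCOPG_IMPL` environment variable that selects the implementation, and `psycopg.pq.__impl__` reports which one was loaded. The sketch below is not Synapse code. The helper names `choose_psycopg_impl` and `configure_psycopg_env` are hypothetical, and the policy of forcing the Python implementation only on PyPy is an assumption made for the example.

```python
import os
import platform


def choose_psycopg_impl(implementation=None):
    """Return the PSYCOPG_IMPL value to request, or None to let psycopg decide."""
    implementation = implementation or platform.python_implementation()
    # Assumption: on PyPy we prefer the pure-Python implementation so the JIT
    # can optimise it; on other interpreters we keep psycopg's own default.
    return "python" if implementation == "PyPy" else None


def configure_psycopg_env(environ=None, implementation=None):
    """Set PSYCOPG_IMPL (if not already set) before psycopg is first imported."""
    environ = os.environ if environ is None else environ
    impl = choose_psycopg_impl(implementation)
    if impl is not None:
        environ.setdefault("PSYCOPG_IMPL", impl)
    return environ.get("PSYCOPG_IMPL")


if __name__ == "__main__":
    print("requested implementation:", configure_psycopg_env())
    try:
        import psycopg  # psycopg 3; may not be installed in this environment

        print("loaded implementation:", psycopg.pq.__impl__)
    except ImportError as exc:
        print("psycopg is not available:", exc)
```

Note that the environment variable has to be set before `psycopg` is imported for the first time, and that the pure-Python implementation still needs the libpq shared library to be present on the system.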