Open callahad opened 3 years ago
Looks like we'd need to resolve #5054, which should be a trivial fix.
There are at least a handful of resolved issues and pull requests from folks claiming to be happily running Synapse on PyPy. Presuming it works, the greatest value might come from actually measuring the CPU/RAM delta between PyPy and CPython for homeservers of various sizes.
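For anyone wanting to collect such numbers, here is a minimal sketch of how a CPU/RAM comparison could be taken. It is not an existing Synapse tool: `measure` and `sample_workload` are hypothetical names, and the JSON round-trip loop is only a stand-in for real homeserver load. The idea is to run the same script once under `python3` and once under `pypy3` and compare the two reports.

```python
import json
import platform
import resource  # Unix only
import time


def measure(workload, *args):
    """Run workload once; report interpreter, wall time, CPU time and peak RSS."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    started = time.perf_counter()
    workload(*args)
    wall = time.perf_counter() - started
    after = resource.getrusage(resource.RUSAGE_SELF)
    # User plus system CPU time consumed by this process during the workload.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return {
        "implementation": platform.python_implementation(),
        "wall_seconds": wall,
        "cpu_seconds": cpu,
        # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
        "peak_rss": after.ru_maxrss,
    }


def sample_workload(n):
    # Hypothetical stand-in for homeserver work: JSON encode/decode round-trips.
    for i in range(n):
        json.loads(json.dumps({"event_id": i, "content": {"body": "x" * 32}}))


if __name__ == "__main__":
    print(measure(sample_workload, 50_000))
```

A short run like this mostly measures PyPy's warm-up, so a real comparison would need a long-lived process under realistic traffic.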
@callahad As someone interested in running an instance with PyPy, would you be willing to link these issues/PRs in the main PR body?
Specifically, the person who submitted Pull Request #2760 noted:
I ran my production instance (something like 20 active users I'd estimate, and in a couple big rooms too) for two days on PyPy 5.6.0 with this, without a hitch
That was two years ago, but it's nonetheless very encouraging.
I modified the Dockerfile so that it successfully runs Synapse with Postgres and PyPy 3:
ARG PYTHON_VERSION=3.7
###
### Stage 0: builder
###
FROM docker.io/pypy:${PYTHON_VERSION}-slim as builder
# install the OS build deps
RUN apt-get update && apt-get install -y \
curl \
patch \
sed \
build-essential \
libffi-dev \
libjpeg-dev \
libpq-dev \
libssl-dev \
libwebp-dev \
libxml++2.6-dev \
libxslt1-dev \
zlib1g-dev \
&& rm -rf /var/lib/apt/lists/*
RUN pypy3 -m ensurepip
RUN pypy3 -m pip install -U pip wheel
# Build dependencies that are not available as wheels, to speed up rebuilds
RUN pypy3 -m pip install --prefix="/install" --no-warn-script-location \
"jsonschema>=2.5.1" \
"frozendict>=1" \
"unpaddedbase64>=1.1.0" \
"canonicaljson>=1.4.0" \
"signedjson>=1.1.0" \
"pynacl>=1.2.1" \
"idna>=2.5" \
"service_identity>=18.1.0" \
"Twisted>=18.9.0" \
"treq>=15.1" \
"pyopenssl>=16.0.0" \
"pyyaml>=3.11" \
"pyasn1>=0.1.9" \
"pyasn1-modules>=0.0.7" \
"bcrypt>=3.1.0" \
"pillow>=4.3.0" \
"sortedcontainers>=1.4.4" \
"pymacaroons>=0.13.0" \
"msgpack>=0.5.2" \
"phonenumbers>=8.2.0" \
"prometheus_client>=0.4.0" \
"attrs>=19.1.0" \
"netaddr>=0.7.18" \
"Jinja2>=2.9" \
"bleach>=1.4.3" \
"typing-extensions>=3.7.4" \
"psycopg2cffi>=2.7" \
"lxml>=3.5.0" \
"pyjwt>=1.6.4" \
"hiredis" \
"txredisapi>=1.4.7" \
"authlib>=0.14.0" \
"pysaml2>=4.5.0"
# now install synapse and all of the python deps to /install.
COPY synapse /synapse/synapse/
COPY scripts /synapse/scripts/
COPY MANIFEST.in README.rst setup.py synctl /synapse/
RUN sed -i 's/psycopg2/psycopg2cffi/' /synapse/synapse/python_dependencies.py
RUN sed -i 's/if sys.version_info >= (3, 7):/if False:/' /synapse/synapse/app/_base.py
RUN pypy3 -m pip install --prefix="/install" --no-warn-script-location \
/synapse[all]
RUN cd /install/site-packages && curl https://patch-diff.githubusercontent.com/raw/chtd/psycopg2cffi/pull/98.patch | patch -p1
RUN printf 'from psycopg2cffi import compat \ncompat.register()' > /install/site-packages/psycopg2.py
###
### Stage 1: runtime
###
FROM docker.io/pypy:${PYTHON_VERSION}-slim
RUN apt-get update && \
apt-get upgrade -y && \
apt-get install -y \
curl \
gosu \
libjpeg62-turbo \
libpq5 \
libwebp6 \
xmlsec1 && \
rm -rf /var/lib/apt/lists/*
RUN ln -sf /opt/pypy/bin/pypy3 /bin/python
COPY --from=builder /install /usr/local
COPY --from=builder /install/site-packages /opt/pypy/site-packages
COPY ./docker/start.py /start.py
COPY ./docker/conf /conf
VOLUME ["/data"]
EXPOSE 8008/tcp 8009/tcp 8448/tcp
ENTRYPOINT ["pypy3", "/start.py"]
HEALTHCHECK --interval=1m --timeout=5s \
CMD curl -fSs http://localhost:8008/health || exit 1
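A note on the `psycopg2.py` shim written at the end of the builder stage: as far as I can tell, `compat.register()` works by putting the `psycopg2cffi` module into `sys.modules` under the name `psycopg2`, so that later `import psycopg2` statements pick it up. The sketch below shows that aliasing technique in isolation. `register_alias`, `fake_pg` and `fake_pg_cffi` are hypothetical names used so that it runs without any database driver installed.

```python
import importlib
import sys
import types


def register_alias(alias, target):
    """Make `import <alias>` return the module importable as `target`."""
    module = importlib.import_module(target)
    sys.modules[alias] = module
    return module


# Hypothetical stand-in for psycopg2cffi, registered by hand so no driver is needed.
fake_driver = types.ModuleType("fake_pg_cffi")
fake_driver.paramstyle = "pyformat"
sys.modules["fake_pg_cffi"] = fake_driver

register_alias("fake_pg", "fake_pg_cffi")

# Resolved from sys.modules; no file named fake_pg exists anywhere on the path.
import fake_pg

print(fake_pg is fake_driver)  # True
```

The catch with this approach is ordering: the alias has to be registered before anything imports the aliased name, which is why the Dockerfile drops the shim into site-packages as a module of its own.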
Note: psycopg2cffi had its last release in 2018: https://pypi.org/project/psycopg2cffi/
Should we really be using that package for this?
It's had quite a few commits since its last release, but according to the author two months ago:
The project is more dead than alive - I can merge simple PR and make releases, but not really anything more complicated. I'd be happy to redirect to supported fork or transfer ownership.
There is a sliiiiightly more current fork at https://github.com/Omegapol/psycopg2cffi.
But I wouldn't be especially concerned about it. It's not an especially huge module, and libpq does the heavy lifting.
Plus, what alternative do we have?
Yeah, i got that distinction when i started digging a little more, i'm just noting it because it might become a compatibility issue down the line :sweat_smile:
I just gave the idea another try, now with a bit more time on my hands than the last time.
TL;DR (primarily for other pioneers):
psycopg2.extras, cf. https://github.com/intelfx/psycopg2cffi/commit/a9d67480110c2ef864c5d6ed425ec8ed6b1489ac
With this done, it seems to work \o/... at least for now
Something interesting: psycopg2cffi seems to have released version 2.9 yesterday, while it already had a 2.8 version. It is interesting that there is still activity in this repo, as I thought there wasn't any at all.
Though some (read: a lot of) testing needs to be done to ensure that psycopg2cffi>=2.8 will work correctly with a PyPy release.
Nothing interesting, completely expected :) I asked the acting maintainer to do a release after contributing an update to synchronise API with upstream psycopg2.
Synapse 1.26.0 should be compatible with psycopg2cffi 2.9. Still need to apply a few fixes to Synapse proper; sadly I did not submit them in time for 1.26.0.
@intelfx do you have a PR which references those fixes somewhere?
No. Actually let me just do it right now.
https://github.com/pallets/markupsafe/issues/176 is an issue for this, affecting jinja2.escape (which is implicitly used in a lot of templates), mangling the URLs or other strings that are used in there. I've noted in #9123 how to potentially deal with this issue, but for now it's not really fixable.
C-API interfacing on PyPy is incredibly slow (also see this). After #9123 is merged, I'll bring in pg8000 as a new database adapter, to speed up PyPy's interfacing with Postgres.
Unfortunately no pure-python sqlite adapter exists, so pypy with sqlite will be incredibly slow :(
I summarised the difficulties I encountered with adding a database adapter for PyPy-powered Synapse in this issue: https://github.com/matrix-org/synapse/issues/11756
That issue has been closed. psycopg3 has a pure-Python option, which could work great for PyPy deployments: pure Python code can take advantage of the JIT, so it should run (almost) as fast as a native-code option.
In other words, using psycopg3 will mean pypy is then a viable option.
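For anyone who wants to try this, a minimal sketch of opting into the pure-Python implementation follows. psycopg 3 reads the `PSYCOPG_IMPL` environment variable at import time and exposes the loaded implementation as `psycopg.pq.__impl__`. The helper `psycopg_env` is a hypothetical name, not part of psycopg or Synapse, and defaulting to `python` only on PyPy is my assumption about a sensible policy.

```python
import os
import platform


def psycopg_env(environ=None, implementation=None):
    """Return a copy of environ with PSYCOPG_IMPL defaulted to "python" on PyPy."""
    env = dict(os.environ if environ is None else environ)
    impl = implementation or platform.python_implementation()
    if impl == "PyPy":
        # setdefault keeps any explicit choice the operator already made.
        env.setdefault("PSYCOPG_IMPL", "python")
    return env


if __name__ == "__main__":
    # Must happen before psycopg is imported for the first time.
    os.environ.update(psycopg_env())
    try:
        import psycopg  # psycopg 3; optional for this sketch

        print("psycopg implementation:", psycopg.pq.__impl__)
    except ImportError:
        print("psycopg not available; PSYCOPG_IMPL =", os.environ.get("PSYCOPG_IMPL"))
```

Note that the pure-Python implementation still needs libpq installed on the system, so the runtime image would keep shipping `libpq5`.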
Really curious if PyPy might give us greater scaling headroom.
EMS is willing to roll out staging instances based on PyPy for our testing if we supply them with a suitable Docker image. (cc: @jaywink)