Open MadLittleMods opened 2 years ago
I definitely feel where you're coming from; Complement tests are currently painful to run to the point where it's easy to get distracted on something else whilst waiting for the results to come back.
Regarding 'Docker build is too slow': Wonder if we could build the image once and then mount the source tree into the container as an editable install, to avoid needing to rebuild every time? As long as you don't change dependencies, that should be OK.
Regarding 'Complement being slow': I'm not sure what to best do about this. We could consider our plan to use checkpointing to checkpoint a running container and then cheaply recreate those, rather than building everything up from scratch each time. That said, 20s sounds like a long time to deploy the simple-case image. This probably merits some investigation to see where the time is actually going.
Regarding 'Test output no longer streams as it runs': I don't remember this ever being the case. I'm not sure I consider it that important; personally I don't think I'd use the logs except for after they're done, but if you're aware of an easy way to solve that and it bothers you, then why not.
I think it would be great if running a single test took more like 5 seconds or less.
Test output no longer streams as it runs
Try sticking `-p 1` on the `complement.sh` commandline.
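A minimal sketch of that suggestion, assuming (as the comment implies) that `complement.sh` passes extra arguments through to `go test`. Go's `-p 1` runs one package's test binary at a time, which is what lets verbose output stream rather than being buffered per package. The test name `TestExample` and the `DRY_RUN` switch are hypothetical, added only so the command can be inspected without a Synapse checkout.

```shell
#!/bin/sh
# Sketch only: run a single Complement test with streaming output.
# -p 1 and -run come from the thread; DRY_RUN and TestExample are
# hypothetical helpers for illustration.
complement_single() {
    test_name="$1"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it.
        echo "./scripts-dev/complement.sh -p 1 -run $test_name"
    else
        ./scripts-dev/complement.sh -p 1 -run "$test_name"
    fi
}

# Inspect the command without executing it.
DRY_RUN=1
complement_single TestExample
```

Unset `DRY_RUN` (or set it to `0`) from the Synapse repository root to actually run the test.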
I think it would be great if running a single test took more like 5 seconds or less.
@reivilibre Great ideas and hope we can get to this point too!
Test output no longer streams as it runs
Try sticking `-p 1` on the `complement.sh` commandline.
@richvdh Sweet! For some reason I missed this but I see the behavior commented in the relevant Go source code I linked in https://github.com/golang/go/issues/49195
@MadLittleMods how are you finding this now that #13279 has landed?
I hope https://github.com/matrix-org/synapse/pull/13447 will improve this too
how are you finding this now that https://github.com/matrix-org/synapse/pull/13279 has landed?
@richvdh It's a big improvement. I'm getting test results after 46s and the Docker build part is taking around 24s.
With https://github.com/matrix-org/synapse/pull/13447, I'm seeing results after 45s and the Docker build part is taking around 22s.
(I just noticed that neither run executed any actual tests, because the test name has since changed slightly, so those timings are purely Synapse/Complement overhead.)
This has regressed hard (over 3 minutes to get results now). I'm getting test results after 193s and the Docker build part is taking around 160s.
(`develop` 0c853e09709d52783efd37060ed9e8f55a4fc704)
This has regressed hard (over 3 minutes to get results now). I'm getting test results after 193s and the Docker build part is taking around 160s.
This will be the Rust build step, I guess. Maybe we can reorder the Dockerfile to cache the Rust build output? Otherwise I don't have any great ideas here.
Out of interest, @MadLittleMods have you found #13735 to have improved this at all?
I'm getting test results after ~76s and the Docker build part is taking ~45s.
(`develop` 755bfeee3a1ac7077045ab9e5a994b6ca89afba3)
Thanks. That's heading in the right direction (but is still depressingly high).
The recent Complement changes to enable dirty runs don't help here when you want to edit Synapse and run a single test. Most of the time is still sunk into rebuilding from source on every run.
Wonder if we could build the image once and then mount the source tree into the container as an editable install, to avoid needing to rebuild every time? As long as you don't change dependencies, that should be OK.
This is what Dendrite does for its local Complement image. It does rebuild the binary via `go build`, but that is fast. It doesn't install dependencies etc.; instead it volume-mounts the dependencies directly into the image. With Python not even needing recompilation, it should be possible to do this for Synapse.
It should be noted that since #14548, if you use the `complement.sh` script for Synapse with the `-e`/`--editable` flag, it will do just that: mount an editable install into the container and avoid rebuilding the container. You won't get any benefit on the first run as it has to build the base image, but subsequent runs should be OK, unless you modify the Rust source, in which case a rebuild is needed again.
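The rule of thumb above (Python-only edits can reuse the image, Rust or dependency changes cannot) could be checked before a run with something like the sketch below. It is an illustration, not part of `complement.sh`: the function name `needs_rebuild` and the path patterns (`rust/`, `Cargo.toml`, `pyproject.toml`, `poetry.lock`) are assumptions about which files matter.

```shell
#!/bin/sh
# Sketch: decide whether an editable run can reuse the existing image.
# Reads changed paths on stdin, one per line (for example from
# `git diff --name-only`). The path patterns are assumptions.
needs_rebuild() {
    while IFS= read -r path; do
        case "$path" in
            rust/*|*.rs|Cargo.toml|Cargo.lock|pyproject.toml|poetry.lock)
                return 0 ;;   # Rust source or dependency metadata changed
        esac
    done
    return 1                  # only other (e.g. Python) files changed
}

# Example with a single hypothetical changed file.
if printf '%s\n' synapse/handlers/room.py | needs_rebuild; then
    echo "rebuild the image first"
else
    echo "reuse the image with --editable"
fi
```

In practice the input would come from something like `git diff --name-only | needs_rebuild`, run from the repository root.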
I used to be able to make a change in Synapse and see Complement results after 20 seconds. Now it takes 1 minute 15 seconds just to build the Docker images at the start. And over 2 minutes to see the test results for running a single Complement test. This makes local dev such a pain :feelsgood:
Discussed at https://matrix.to/#/!alCakyySsFIAVfZLDL:matrix.org/$HuKnVFoLYs_uiDNbENnYs_ALqgQKjbhDrm3wbYYkLwI?via=matrix.org&via=element.io&via=termina.org.uk
Docker build is too slow
When I'm just doing feature work, I don't care about workers, I don't care about Postgres (whatever is new and making it slow).
1 minute 15 seconds of Docker building after changing Synapse:
Complement run ./scripts-dev/complement.sh
```
$ COMPLEMENT_ALWAYS_PRINT_SERVER_LOGS=1 COMPLEMENT_DIR=../complement ./scripts-dev/complement.sh -run TestJumpToDateEndpoint/parallel/federation/can_get_pagination_token_after_getting_remote_event_from_timestamp_to_event_endpoint
[+] Building 22.0s (25/25) FINISHED
 => [internal] load build definition from Dockerfile 0.0s
 => => transferring dockerfile: 5.62kB 0.0s
 => [internal] load .dockerignore 0.0s
 => => transferring context: 35B 0.0s
 => resolve image config for docker.io/docker/dockerfile:1 0.3s
 => CACHED docker-image://docker.io/docker/dockerfile:1@sha256:443aab4ca21183e069e7d8b2dc68006594f40bddf1b15bbd83f5137bd93e80e2 0.0s
 => [internal] load build definition from Dockerfile 0.0s
 => [internal] load .dockerignore 0.0s
 => [internal] load metadata for docker.io/library/python:3.9-slim 0.3s
 => [internal] load build context 0.3s
 => => transferring context: 7.35MB 0.3s
 => [requirements 1/6] FROM docker.io/library/python:3.9-slim@sha256:c01a2db78654c1923da84aa41b829f6162011e3a75db255c24ea16fa2ad563a0 0.0s
 => CACHED [builder 2/7] RUN --mount=type=cache,target=/var/cache/apt,sharing=locked --mount=type=cache,target=/var/lib/apt,sharing=locked apt-get update -qq && apt-get install -yqq build-essential 0.0s
 => CACHED [requirements 2/6] RUN --mount=type=cache,target=/var/cache/apt,sharing=locked --mount=type=cache,target=/var/lib/apt,sharing=locked apt-get update -qq && apt-get install -yqq git && rm -rf / 0.0s
 => CACHED [requirements 3/6] RUN --mount=type=cache,target=/root/.cache/pip pip install --user "poetry-core==1.1.0a7" "git+https://github.com/python-poetry/poetry.git@fb13b3a676f476177f7937ffa480ee5cff9a90a5" 0.0s
 => CACHED [requirements 4/6] WORKDIR /synapse 0.0s
 => CACHED [requirements 5/6] COPY pyproject.toml poetry.lock /synapse/ 0.0s
 => CACHED [requirements 6/6] RUN /root/.local/bin/poetry export --extras all -o /synapse/requirements.txt ${TEST_ONLY_SKIP_DEP_HASH_VERIFICATION:+--without-hashes} 0.0s
 => CACHED [builder 3/7] COPY --from=requirements /synapse/requirements.txt /synapse/ 0.0s
 => CACHED [builder 4/7] RUN --mount=type=cache,target=/root/.cache/pip pip install --prefix="/install" --no-deps --no-warn-script-location -r /synapse/requirements.txt 0.0s
 => [builder 5/7] COPY synapse /synapse/synapse/ 0.5s
 => [builder 6/7] COPY pyproject.toml README.rst /synapse/ 0.0s
 => [builder 7/7] RUN pip install --prefix="/install" --no-deps --no-warn-script-location /synapse 8.8s
 => CACHED [stage-2 2/5] RUN --mount=type=cache,target=/var/cache/apt,sharing=locked --mount=type=cache,target=/var/lib/apt,sharing=locked apt-get update -qq && apt-get install -yqq curl gosu l 0.0s
 => [stage-2 3/5] COPY --from=builder /install /usr/local 2.6s
 => [stage-2 4/5] COPY ./docker/start.py /start.py 0.1s
 => [stage-2 5/5] COPY ./docker/conf /conf 0.1s
 => exporting to image 2.5s
 => => exporting layers 2.5s
 => => writing image sha256:cf22ee98c4023602eaa86fe9b09d8043f4f01138f2aec7c745387444d0d45249 0.0s
 => => naming to docker.io/matrixdotorg/synapse 0.0s

Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
[+] Building 15.3s (12/12) FINISHED
 => [internal] load build definition from Dockerfile-workers 0.0s
 => => transferring dockerfile: 1.45kB 0.0s
 => [internal] load .dockerignore 0.0s
 => => transferring context: 35B 0.0s
 => [internal] load metadata for docker.io/matrixdotorg/synapse:latest 0.0s
 => [internal] load build context 0.1s
 => => transferring context: 30.16kB 0.0s
 => [stage-0 1/7] FROM docker.io/matrixdotorg/synapse:latest 0.6s
 => [stage-0 2/7] RUN --mount=type=cache,target=/var/cache/apt,sharing=locked --mount=type=cache,target=/var/lib/apt,sharing=locked apt-get update -qq && DEBIAN_FRONTEND=noninteractive apt-get install -y 8.5s
 => [stage-0 3/7] RUN --mount=type=cache,target=/root/.cache/pip pip install supervisor~=4.2 4.9s
 => [stage-0 4/7] RUN rm /etc/nginx/sites-enabled/default 0.4s
 => [stage-0 5/7] COPY ./docker/conf-workers/* /conf/ 0.0s
 => [stage-0 6/7] COPY ./docker/prefix-log /usr/local/bin/ 0.1s
 => [stage-0 7/7] COPY ./docker/configure_workers_and_start.py /configure_workers_and_start.py 0.0s
 => exporting to image 0.4s
 => => exporting layers 0.4s
 => => writing image sha256:0b0ae26a584d7ae58927a2daba3b508d61245e1a41d4ac9ba5b92371a96d5445 0.0s
 => => naming to docker.io/matrixdotorg/synapse-workers 0.0s

Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
[+] Building 42.5s (13/13) FINISHED
 => [internal] load build definition from Dockerfile 0.0s
 => => transferring dockerfile: 37B 0.0s
 => [internal] load .dockerignore 0.0s
 => => transferring context: 2B 0.0s
 => [internal] load metadata for docker.io/matrixdotorg/synapse-workers:latest 0.0s
 => [internal] load build context 0.0s
 => => transferring context: 177B 0.0s
 => [1/8] FROM docker.io/matrixdotorg/synapse-workers:latest 0.2s
 => [2/8] RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -yqq postgresql-13 34.7s
 => [3/8] RUN pg_ctlcluster 13 main start && su postgres -c "echo "ALTER USER postgres PASSWORD 'somesecret'; CREATE DATABASE synapse ENCODING 'UTF8' LC_COLLATE='C' LC_CTYPE='C' template=template0;" | p 3.8s
 => [4/8] RUN mv /conf/shared.yaml.j2 /conf/shared-orig.yaml.j2 0.4s
 => [5/8] COPY conf/workers-shared-extra.yaml.j2 /conf/shared.yaml.j2 0.1s
 => [6/8] WORKDIR /data 0.1s
 => [7/8] COPY conf/postgres.supervisord.conf /etc/supervisor/conf.d/postgres.conf 0.1s
 => [8/8] COPY conf/start_for_complement.sh / 0.1s
 => exporting to image 2.9s
 => => exporting layers 2.9s
 => => writing image sha256:df7d7a1d4443960b49dbb310bdbd4af323c33c7900d26bf291c3fad894683453 0.0s
 => => naming to docker.io/library/complement-synapse 0.0s

Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
Images built; running complement ...
```
Relevant code: https://github.com/matrix-org/synapse/blob/dcc7873700da4a818e84c44c6190525d39a854cb/scripts-dev/complement.sh#L89-L107
With cache, the Synapse build used to take ~10 seconds (https://github.com/matrix-org/synapse/pull/9610), which has now doubled to ~20 seconds. The rest used to be trivial.
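To see where the build time actually goes, the per-step durations in a log like the one above can be ranked. The following is a rough sketch, assuming the log has been saved with one step per line, that each step line starts with `=>`, and that it ends in a duration such as `8.8s`. The function name `slowest_steps` is made up for illustration.

```shell
#!/bin/sh
# Sketch: list the slowest steps in a saved BuildKit log (read from stdin).
# Assumes step lines look like " => <description> <seconds>s".
# Nested "=> =>" detail lines are skipped.
slowest_steps() {
    awk '$1 == "=>" && $2 != "=>" && $NF ~ /^[0-9]+(\.[0-9]+)?s$/ {
        secs = $NF
        sub(/s$/, "", secs)     # "8.8s" -> "8.8"
        $NF = ""                # drop the duration from the description
        sub(/ +$/, "")          # trim the trailing space left behind
        print secs "\t" $0
    }' | sort -rn | head -n "${1:-5}"
}

# A few step lines taken from the log in this issue.
slowest_steps 2 <<'EOF'
 => [builder 5/7] COPY synapse /synapse/synapse/ 0.5s
 => [builder 7/7] RUN pip install --prefix="/install" --no-deps --no-warn-script-location /synapse 8.8s
 => [stage-2 3/5] COPY --from=builder /install /usr/local 2.6s
EOF
```

On this sample the `pip install` step is ranked first, followed by the `COPY --from=builder` step. The optional argument is the number of steps to show and defaults to 5.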
Complement being slow
@kegsay pointed out a couple of issues that would also slow down single test runs (stopping and shutting down containers):
The deploys at the start seem noticeable now at 20 seconds:
Test output no longer streams as it runs
This is just a related extra.
The tests used to stream out what they were doing as they went, instead of only barfing it all up at the end. This worked until we got multiple packages in Complement (`tests` and `csapi`), https://github.com/matrix-org/complement/issues/215. This doesn't help with making Complement go faster but it does help with the feeling of it doing something vs just waiting around arbitrarily.
The way I work around it now is to remove the `...` from the end of https://github.com/matrix-org/synapse/blob/dcc7873700da4a818e84c44c6190525d39a854cb/scripts-dev/complement.sh#L164 so it only runs against the single `tests` package where the feature work I care about is.
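For context, a Go package pattern ending in `/...` matches that package and everything beneath it, while the same path without the suffix names a single package. The sketch below shows the narrowing as a string operation. The pattern `./tests/...` is an assumption about what the linked line passes to `go test`, and `single_package` is a hypothetical helper, not something in `complement.sh`.

```shell
#!/bin/sh
# Sketch: narrow a Go package pattern to one package by dropping a
# trailing "/..." wildcard. Patterns without the suffix are unchanged.
single_package() {
    printf '%s\n' "${1%/...}"
}

# Assumed pattern; the real value lives in complement.sh.
single_package ./tests/...
```

With a single package on the `go test` command line, output is not held back while other packages run, which is the streaming behaviour described above.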