Open eMPee584 opened 3 years ago
You ran the migration steps in our changelog and got into some dead-end? I think those do tell you to stop all services and then start just Postgres, which should help..?
So I'm having an identical issue on an updated server. Everything prior to the 1-21-2021 update was working well. I got around to doing the update yesterday and ran into trouble. Maybe I can shed some light on this issue with a more descriptive post. When running step 7 of the provided steps, where one is supposed to log into the psql instance,
/usr/local/bin/matrix-postgres-cli
I get no response and a stalled CLI. It's just sitting there as if it's waiting for something. I can confirm that the process from the script seems to be running, as an inspection of the processes shows the following,
root 10665 10664 0 13:35 pts/1 00:00:00 docker run -it --rm --user=999:1001 --cap-drop=ALL --env-file=/matrix/postgres/env-postgres-psql --network matrix docker.io/postgres:13.1-alpine psql -h matrix-postgres
but no prompt or any other indication is present. This is currently preventing me from restarting the whole matrix service with an error similar to that above. I've tried removing the password in the vars file and checking out a version from before 1-22-2021 but am presented with other errors when attempting to start. I'd like to press forward and get this working although I'm not currently using any bridges that I would be concerned about.
Ok, so I have more on this one... I waited longer this time (several minutes) and I get a timeout on connecting to
sudo /usr/local/bin/matrix-postgres-cli
psql: error: could not connect to server: Operation timed out
Is the server running on host "matrix-postgres" (23.221.222.250) and accepting TCP/IP connections on port 5432?
could not connect to server: Operation timed out
Is the server running on host "matrix-postgres" (23.217.138.110) and accepting TCP/IP connections on port 5432?
So those clearly aren't on my network, so a little poking found this,
nslookup matrix.synapse.org
Server: 127.0.0.53
Address: 127.0.0.53#53
Non-authoritative answer:
Name: matrix.synapse.org
Address: 23.221.222.250
Name: matrix.synapse.org
Address: 23.202.231.169
For some reason, matrix-postgres-cli is trying to connect to synapse.org servers. At least on my machine.
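A quick way to sanity-check that observation is to look at what the container name resolves to. The sketch below is only an illustration, under the assumption that the Docker network on this host hands out private-range addresses, so a public address for "matrix-postgres" would mean the name is being answered by an outside resolver. The helper names `resolve_all` and `public_addresses` are made up for this example.

```python
import ipaddress
import socket


def resolve_all(hostname, port=5432):
    """Return the sorted, de-duplicated addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def public_addresses(addresses):
    """Keep only the addresses that are NOT in a private or otherwise reserved range."""
    return [a for a in addresses if not ipaddress.ip_address(a).is_private]


if __name__ == "__main__":
    # Offline demo: the two addresses from the psql error above, plus a
    # hypothetical Docker-style address (172.18.0.2) for comparison.
    sample = ["23.221.222.250", "23.217.138.110", "172.18.0.2"]
    print(public_addresses(sample))
```

To check a live system, one could call `public_addresses(resolve_all("matrix-postgres"))` from a place that uses the same resolver as the container. An empty list would be the expected result if the name resolves inside the Docker network.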
Got it working on an old commit of this repo and by dropping the pw (using the old default). Since it's running on a VM, I'm going to clone it and play with it in an offline state. The latest issue I'm seeing is related to manually starting the postgres container. For some reason it starts fine at boot as a service, but manually it seems to fail 4 out of 5 times with a message like the following. Very odd for a system that I'd had nearly no issues setting up or using for a month or so.
● matrix-postgres.service - Matrix Postgres server
   Loaded: loaded (/etc/systemd/system/matrix-postgres.service; enabled; vendor preset: enabled)
   Active: activating (auto-restart) (Result: exit-code) since Thu 2021-02-04 13:09:42 UTC; 6s ago
  Process: 19208 ExecStart=/usr/bin/env docker run --rm --name matrix-postgres --log-driver=none --user=999:1001 --cap-drop=ALL --read-only --tmpfs=/tmp:rw,noexec,nosuid,size=100m --tmpfs=/run/postgres
  Process: 19198 ExecStartPre=/usr/bin/env sh -c /usr/bin/env docker rm matrix-postgres 2>/dev/null (code=exited, status=1/FAILURE)
  Process: 19176 ExecStartPre=/usr/bin/env docker stop matrix-postgres (code=exited, status=1/FAILURE)
 Main PID: 19208 (code=exited, status=125)
Feb 04 13:09:42 thematrix systemd[1]: matrix-postgres.service: Main process exited, code=exited, status=125/n/a
Feb 04 13:09:42 thematrix systemd[1]: matrix-postgres.service: Failed with result 'exit-code'.
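For anyone reading that status output: as far as I know, `docker run` reserves exit statuses 125, 126 and 127 for its own failures, and 125 means Docker itself failed before the container's command ran, so Postgres likely never started here. A rough Python sketch for pulling the status out of the text and labelling it; the function names and the wording of the labels are my own, not anything from systemd or Docker.

```python
import re

# Exit statuses that `docker run` uses for its own failures, as described in
# Docker's run reference. Any other value is the contained command's status.
DOCKER_RUN_STATUSES = {
    125: "docker itself failed (daemon error or invalid docker run option)",
    126: "the contained command could not be invoked",
    127: "the contained command could not be found",
}


def main_exit_status(status_text):
    """Extract the Main PID exit status from `systemctl status` text, or None."""
    match = re.search(r"Main PID: \d+ \(code=exited, status=(\d+)", status_text)
    return int(match.group(1)) if match else None


def explain(status):
    """Return a short label for a `docker run` exit status."""
    return DOCKER_RUN_STATUSES.get(status, "exit status of the command inside the container")


if __name__ == "__main__":
    line = "Main PID: 19208 (code=exited, status=125)"
    status = main_exit_status(line)
    print(status, "->", explain(status))
```

Since the service unit passes `--log-driver=none`, the actual Docker error message is not captured anywhere by this sketch; it only narrows down where to look.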
Was working fine without updating until my SSL cert expired. Now I'm back to dealing with trying to upgrade the server.
I've spent three days trying to work through this. I've successfully completed the db update guide from JAN2021, only to then have issues with it binding to some stupid file in /tmp that performs tasks that don't need to be done on the db...
TASK [matrix-postgres : Execute Postgres additional database initialization SQL file for matrix_ma1sd] ********************************** fatal: [matrix.endofinternet.net]: FAILED! => {"changed": true, "cmd": ["/usr/bin/env", "docker", "run", "--rm", "--user=999:1001", "--cap-drop=ALL", "--env-file=/matrix/postgres/env-postgres-psql", "--network", "matrix", "--mount", "type=bind,src=/tmp/matrix-postgres-init-additional-db-user-and-role.sql,dst=/matrix-postgres-init-additional-db-user-and-role.sql,ro", "--entrypoint=/bin/sh", "docker.io/postgres:13.1-alpine", "-c", "psql -h matrix-postgres --file=/matrix-postgres-init-additional-db-user-and-role.sql"], "delta": "0:00:00.041447", "end": "2021-04-16 03:35:33.485407", "msg": "non-zero return code", "rc": 125, "start": "2021-04-16 03:35:33.443960", "stderr": "docker: Error response from daemon: invalid mount config for type \"bind\": bind source path does not exist: /tmp/matrix-postgres-init-additional-db-user-and-role.sql.\nSee 'docker run --help'.", "stderr_lines": ["docker: Error response from daemon: invalid mount config for type \"bind\": bind source path does not exist: /tmp/matrix-postgres-init-additional-db-user-and-role.sql.", "See 'docker run --help'."], "stdout": "", "stdout_lines": []}
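The "bind source path does not exist" part of that error can be checked before the container is started. Here is a minimal Python sketch that pulls the `src` of every `type=bind` mount out of a `docker run` argument list and reports the ones missing on disk. It assumes the check runs on the same host as the Docker daemon, and it uses a naive comma split that would not handle quoted values; `bind_sources` and `missing_bind_sources` are hypothetical helper names.

```python
import os


def bind_sources(cmd):
    """Collect the source paths of type=bind mounts from a docker run argument list."""
    sources = []
    for i, arg in enumerate(cmd):
        if arg == "--mount" and i + 1 < len(cmd):
            spec = cmd[i + 1]
        elif arg.startswith("--mount="):
            spec = arg[len("--mount="):]
        else:
            continue
        # Split "type=bind,src=...,dst=...,ro" into key/value pairs; bare flags like "ro" are skipped.
        fields = dict(part.split("=", 1) for part in spec.split(",") if "=" in part)
        if fields.get("type") == "bind":
            src = fields.get("src") or fields.get("source")
            if src:
                sources.append(src)
    return sources


def missing_bind_sources(cmd):
    """Return the bind sources that do not exist on the local filesystem."""
    return [src for src in bind_sources(cmd) if not os.path.exists(src)]


if __name__ == "__main__":
    cmd = [
        "docker", "run", "--rm", "--network", "matrix", "--mount",
        "type=bind,src=/tmp/matrix-postgres-init-additional-db-user-and-role.sql,"
        "dst=/matrix-postgres-init-additional-db-user-and-role.sql,ro",
        "docker.io/postgres:13.1-alpine",
    ]
    print(bind_sources(cmd))
```

Running `missing_bind_sources` on the command from the failed task, right before Docker is invoked, would show whether the temporary SQL file had actually been written to /tmp at that point.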
that error is also listed in the issues on github, so I'm going with "There is something wrong with db migration scripts" cause at least a few of us have gotten wedged.
uhm.. wanting to test a workers change, I ran setup-all, which complained about my lack of matrix_postgres_connection_password, which I then set.. then even nuked/moved the existing /matrix folder out of the way .. still failing, then I changed the password to a proper one and continued to run into this error in TASK [matrix-postgres : Execute Postgres additional database initialization SQL file for synapse] .. until it occurred to me that the running psql instance probably still had the outdated configuration.. Sure enough, stopping the container and re-running made it all work again :dango: ..too lazy to present a patch though xD