LemmyNet / lemmy-ansible

A docker deploy for ansible
GNU Affero General Public License v3.0

Alma linux 9 DO droplet doesn't run through #229

Closed RussellTaylor83 closed 6 months ago

RussellTaylor83 commented 7 months ago

Hello,

I thought I would stick with a RHEL distro as I use it for work. I have used a Digital Ocean AlmaLinux 9 droplet, and I can ssh to it as root. There seem to be a few hurdles, and then I hit postgres file permission failures I can't get past. I'm not sure if the problem is something I'm doing or something in the playbook that needs fixing. I run lemmy-almalinux.yml and get:

RUNNING HANDLER [Reload nginx] **** fatal: [root@mydomain]: FAILED! => {"changed": false, "msg": "Unable to start service nginx: Job for nginx.service failed because the control process exited with error code.\nSee \"systemctl status nginx.service\" and \"journalctl -xeu nginx.service\" for details.\n"}

This is because the port is already in use by another nginx process that must have been started earlier in the Ansible run.
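In case anyone else hits this, the conflict is easy to confirm before killing anything (ss is part of iproute on a stock Alma 9 droplet, nothing specific to this playbook):

```sh
# See which process is already bound to the HTTP/HTTPS ports
# (in my case another nginx, started outside systemd)
ss -tlnp | grep -E ':(80|443) '
```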

I run pkill -f nginx and run the playbook again; this time it goes straight through. I then get a 502, and podman-compose logs shows the postgres container lasting only a few seconds, repeatedly logging:

26d97800a297 PostgreSQL Database directory appears to contain a database; Skipping initialization
26d97800a297
26d97800a297 postgres: could not access the server configuration file "/etc/postgresql.conf": Permission denied

I have attempted chmod'ing the customPostgresql.conf file, moving it to /tmp and mounting from there, and can't seem to get past this permissions error.
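For what it's worth, this is roughly how I was inspecting it from both sides (the path and container name are from my install, per the logs below, so adjust for your domain):

```sh
# SELinux context and ownership of the config file on the host
ls -lZ /srv/lemmy/mydomain/customPostgresql.conf

# How the bind-mounted file looks from inside the container
# (only works during the few seconds the container stays up)
podman exec mydomain_postgres_1 ls -l /etc/postgresql.conf
```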

If I docker compose down and then up, the postgres section of the logs looks like this (I have replaced the domain I'm using with "mydomain"):

podman create --name=mydomain_lemmy-ui_1 --requires=mydomain_postgres_1,mydomain_lemmy_1,mydomain_pictrs_1 --label io.podman.compose.config-hash=a94348b05f02f7bcf820ab20379b950f9bfd5ac885905141f8d5a38a3c863756 --label io.podman.compose.project=mydomain --label io.podman.compose.version=1.0.6 --label PODMAN_SYSTEMD_UNIT=podman-compose@mydomain.service --label com.docker.compose.project=mydomain --label com.docker.compose.project.working_dir=/srv/lemmy/mydomain --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=lemmy-ui -e LEMMY_UI_LEMMY_INTERNAL_HOST=lemmy:8536 -e LEMMY_UI_LEMMY_EXTERNAL_HOST=mydomain -e LEMMY_UI_HTTPS=True -v /srv/lemmy/mydomain/volumes/lemmy-ui/extra_themes:/app/extra_themes --net mydomain_default --network-alias lemmy-ui --log-driver=json-file --log-opt=max-size=50m --log-opt=max-file=4 --restart always docker.io/dessalines/lemmy-ui:0.19.3
c031f4d65f2348c7e1576bbdb91c7963b5c117a55622bb0603479821f0e09a9d
exit code: 0
['podman', 'network', 'exists', 'mydomain_default']
podman create --name=mydomain_proxy_1 --requires=mydomain_postgres_1,mydomain_lemmy_1,mydomain_lemmy-ui_1,mydomain_pictrs_1 --label io.podman.compose.config-hash=a94348b05f02f7bcf820ab20379b950f9bfd5ac885905141f8d5a38a3c863756 --label io.podman.compose.project=mydomain --label io.podman.compose.version=1.0.6 --label PODMAN_SYSTEMD_UNIT=podman-compose@mydomain.service --label com.docker.compose.project=mydomain --label com.docker.compose.project.working_dir=/srv/lemmy/mydomain --label com.docker.compose.project.config_files=docker-compose.yml --label com.docker.compose.container-number=1 --label com.docker.compose.service=proxy -v /srv/lemmy/mydomain/nginx_internal.conf:/etc/nginx/nginx.conf:Z,ro -v /srv/lemmy/mydomain/proxy_params:/etc/nginx/proxy_params:Z,ro --net mydomain_default --network-alias proxy --log-driver=json-file --log-opt=max-size=50m --log-opt=max-file=4 -p 21783:8536 --restart always docker.io/library/nginx
c4a0a353c5de694bbc665bf2148b8e54a286661e51a8947eff8fdb3e69114da9
exit code: 0
podman start -a mydomain_pictrs_1
[pictrs] | 2024-02-28T21:35:42.302447Z INFO actix_server::builder: starting 2 workers
[pictrs] | 2024-02-28T21:35:42.302491Z INFO actix_server::server: Tokio runtime found; starting in existing Tokio runtime
podman start -a mydomain_postgres_1
[postgres] |
[postgres] | PostgreSQL Database directory appears to contain a database; Skipping initialization
[postgres] | 2024-02-28 21:35:43.315 GMT [1] LOG: starting PostgreSQL 15.6 on x86_64-pc-linux-musl, compiled by gcc (Alpine 13.2.1_git20231014) 13.2.1 20231014, 64-bit
2024-02-28 21:35:43.315 GMT [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
2024-02-28 21:35:43.316 GMT [1] LOG: listening on IPv6 address "::", port 5432
2024-02-28 21:35:43.324 GMT [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-02-28 21:35:43.330 GMT [19] LOG: database system was shut down at 2024-02-28 21:35:30 GMT
2024-02-28 21:35:43.338 GMT [1] LOG: database system is ready to accept connections
podman start -a mydomain_postfix_1
podman start -a mydomain_lemmy_1
[lemmy] | Lemmy v0.19.3
2024-02-28 21:35:45.350 GMT [23] FATAL: no pg_hba.conf entry for host "10.89.0.18", user "lemmy", database "lemmy", no encryption
thread 'main' panicked at crates/db_schema/src/utils.rs:281:56:
Error connecting to postgres://lemmy:D1cj90jNvhcBgeDcadYI@postgres:5432/lemmy: connection to server at "postgres" (10.89.0.16), port 5432 failed: FATAL: no pg_hba.conf entry for host "10.89.0.18", user "lemmy", database "lemmy", no encryption
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
2024-02-28 21:35:45.578 GMT [24] FATAL: no pg_hba.conf entry for host "10.89.0.18", user "lemmy", database "lemmy", no encryption
2024-02-28 21:35:45.755 GMT [25] FATAL: no pg_hba.conf entry for host "10.89.0.18", user "lemmy", database "lemmy", no encryption
2024-02-28 21:35:45.921 GMT [26] FATAL: no pg_hba.conf entry for host "10.89.0.18", user "lemmy", database "lemmy", no encryption
podman start -a mydomain_lemmy-ui_1
2024-02-28 21:35:46.145 GMT [27] FATAL: no pg_hba.conf entry for host "10.89.0.18", user "lemmy", database "lemmy", no encryption
Error: unable to start container c031f4d65f2348c7e1576bbdb91c7963b5c117a55622bb0603479821f0e09a9d: generating dependency graph for container c031f4d65f2348c7e1576bbdb91c7963b5c117a55622bb0603479821f0e09a9d: container 7d0273c2bfc48644e2edc84216a3c2b282b3c6ff2b2d10156ddd1a83d2d1cea8 depends on container b5af5adb9aaabea5eb289b6b89fd569b569ac7a936baaf3820362ff7496b8137 not found in input list: no such container
exit code: 125
exit code: 101
2024-02-28 21:35:46.335 GMT [28] FATAL: no pg_hba.conf entry for host "10.89.0.18", user "lemmy", database "lemmy", no encryption
2024-02-28 21:35:46.524 GMT [29] FATAL: no pg_hba.conf entry for host "10.89.0.18", user "lemmy", database "lemmy", no encryption
2024-02-28 21:35:46.715 GMT [30] FATAL: no pg_hba.conf entry for host "10.89.0.18", user "lemmy", database "lemmy", no encryption
2024-02-28 21:35:46.962 GMT [31] FATAL: no pg_hba.conf entry for host "10.89.0.18", user "lemmy", database "lemmy", no encryption
podman start -a mydomain_proxy_1
[postfix] | Starting Postfix Mail Transport Agent: postfix.
[postfix] | 2024-02-28T21:35:47.258629+00:00 d9f266633d96 rsyslogd: [origin software="rsyslogd" swVersion="8.1901.0" x-pid="125" x-info="https://www.rsyslog.com"] start
[postfix] | 2024-02-28T21:35:47.271327+00:00 d9f266633d96 postfix/master[123]: daemon started -- version 3.4.23, configuration /etc/postfix
Error: unable to start container c4a0a353c5de694bbc665bf2148b8e54a286661e51a8947eff8fdb3e69114da9: generating dependency graph for container c4a0a353c5de694bbc665bf2148b8e54a286661e51a8947eff8fdb3e69114da9: container c031f4d65f2348c7e1576bbdb91c7963b5c117a55622bb0603479821f0e09a9d depends on container b5af5adb9aaabea5eb289b6b89fd569b569ac7a936baaf3820362ff7496b8137 not found in input list: no such container
exit code: 125
2024-02-28 21:35:47.332 GMT [32] FATAL: no pg_hba.conf entry for host "10.89.0.18", user "lemmy", database "lemmy", no encryption
2024-02-28 21:35:47.497 GMT [33] FATAL: no pg_hba.conf entry for host "10.89.0.18", user "lemmy", database "lemmy", no encryption
2024-02-28 21:35:47.680 GMT [34] FATAL: no pg_hba.conf entry for host "10.89.0.18", user "lemmy", database "lemmy", no encryption
2024-02-28 21:35:47.846 GMT [35] FATAL: no pg_hba.conf entry for host "10.89.0.18", user "lemmy", database "lemmy", no encryption

...and that repeats forever.

I might have a go with a Debian or something and see how I get on, but I was hoping to stick with alma.

Many thanks

RussellTaylor83 commented 7 months ago

I just tried the deployment I configured but pointing at Debian 12 and it worked first time.

codyro commented 7 months ago

Thanks for the report! I'll look into this and get it fixed up by tomorrow!

codyro commented 7 months ago

> Hello,
>
> I thought I would stick with a RHEL distro as I use it for work. I have used a Digital Ocean AlmaLinux 9 droplet, and I can ssh to it as root. There seem to be a few hurdles, and then I hit postgres file permission failures I can't get past. I'm not sure if the problem is something I'm doing or something in the playbook that needs fixing. I run lemmy-almalinux.yml and get:
>
> RUNNING HANDLER [Reload nginx] **** fatal: [root@mydomain]: FAILED! => {"changed": false, "msg": "Unable to start service nginx: Job for nginx.service failed because the control process exited with error code.\nSee \"systemctl status nginx.service\" and \"journalctl -xeu nginx.service\" for details.\n"}

This is caused by certbot running before nginx is started. Since we're using the Certbot nginx plugin (--nginx) when requesting the certificate, Certbot starts nginx itself, causing the port conflict later in the playbook when we try to start nginx via systemd.

That is why the playbook errors out until you manually kill the stray nginx processes (e.g., pkill -9 nginx). I'll fix this up in another PR by ensuring nginx is started and enabled before the certificate request (https://github.com/LemmyNet/lemmy-ansible/compare/main...codyro:lemmy-ansible:almalinux-9-fixes).
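In the meantime, the manual equivalent of that ordering fix is just to let systemd bring nginx up before certbot runs, roughly like this (the domain is a placeholder):

```sh
# Make sure systemd owns a running nginx before certbot touches it
systemctl enable --now nginx

# The certbot nginx plugin then reconfigures/reloads the existing
# instance instead of spawning one of its own
certbot --nginx -d mydomain
```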

> 26d97800a297 PostgreSQL Database directory appears to contain a database; Skipping initialization
> 26d97800a297
> 26d97800a297 postgres: could not access the server configuration file "/etc/postgresql.conf": Permission denied
>
> I have attempted chmod'ing the customPostgresql.conf file, moving it to /tmp and mounting from there, and can't seem to get past this permissions error.

That is likely caused by SELinux being enabled. Managing SELinux is beyond the scope of this playbook; however, you can temporarily disable it to see if it resolves your issue by running setenforce 0. You can make the change persistent by editing /etc/sysconfig/selinux.
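For reference, a quick way to test that theory (the chcon relabel is a general RHEL-family idiom for container bind mounts, not something this playbook currently does, and the file path is from the report above):

```sh
# Check the current mode, then drop to permissive for this boot only
getenforce
setenforce 0

# If that fixes it, either persist SELINUX=permissive in
# /etc/sysconfig/selinux, or relabel just the mounted file:
chcon -t container_file_t /srv/lemmy/mydomain/customPostgresql.conf
```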

While going over this, I found another issue affecting the RHEL playbook, introduced in #209. The change in that pull request assumes we're using Docker and docker-compose, and therefore Docker's internal resolver (127.0.0.11:53). Podman uses its own network resolver (generally 10.89.0.1:53) and ensures the container uses it by adjusting the container's /etc/resolv.conf.
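You can see the difference from inside any of the containers; the nameserver podman injects is what nginx's resolver directive needs to match (container name per the logs above):

```sh
# Show the resolver podman wired into the container (typically 10.89.0.1)
podman exec mydomain_proxy_1 cat /etc/resolv.conf
```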

We'll need to consider how to work around this. @ticoombs, do you have any suggestions on how to handle this better?

In the interim, you can fix it by adjusting the templates/nginx_internal.conf file and changing the line:

resolver 127.0.0.11 valid=5s;

to

resolver 10.89.0.1 valid=5s;

Then re-run the playbook or restart the *_proxy_1 container.
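Concretely, that's a one-line edit plus a restart, something like this (run from your lemmy-ansible checkout; the container name follows the naming pattern in the logs above):

```sh
# Point the nginx template at podman's resolver instead of Docker's
sed -i 's/resolver 127.0.0.11 valid=5s;/resolver 10.89.0.1 valid=5s;/' templates/nginx_internal.conf

# Re-run the playbook, or restart just the proxy container
podman restart mydomain_proxy_1
```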

Fresh install after the above changes: https://foss.ly/

RussellTaylor83 commented 7 months ago

That's probably the fastest, most concise response I've ever seen on GitHub!

If you get a branch up at some point and would like me to test I'd be happy to.

Thank you

codyro commented 7 months ago

@RussellTaylor83 If you'd like to give this a whirl, it'd be much appreciated! It runs cleanly on a fresh AlmaLinux 9 (or variants).

https://github.com/LemmyNet/lemmy-ansible/pull/231
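If it helps, a rough way to test it from your existing checkout (the inventory path is whatever you used for your original run):

```sh
# Fetch the PR branch into a local branch and switch to it
git fetch origin pull/231/head:almalinux-9-fixes
git checkout almalinux-9-fixes

# Re-run the AlmaLinux playbook against the same inventory
ansible-playbook -i inventory/hosts lemmy-almalinux.yml
```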

codyro commented 6 months ago

This was fixed and merged in https://github.com/LemmyNet/lemmy-ansible/pull/231. Thanks for the report & testing it out.

It'll be in the next tagged release (or in HEAD) :).