immich-app / immich-charts

Helm chart implementation of Immich
https://immich.app
GNU Affero General Public License v3.0

v1.91.0 DB Crash #55

Open evanreichard opened 8 months ago

evanreichard commented 8 months ago

Postgres Pod Crash:

chmod: changing permissions of '/var/run/postgresql': Operation not permitted

PostgreSQL Database directory appears to contain a database; Skipping initialization

postgres: could not access the server configuration file "/bitnami/postgresql/data/postgresql.conf": No such file or directory

Chart Values:

    env:
      DB_PASSWORD:
        valueFrom:
          secretKeyRef:
            name: postgres-secrets
            key: password
    image:
      tag: v1.91.0
    immich:
      persistence:
        library:
          existingClaim: va-unraid-photos-rw
    postgresql:
      enabled: true
      auth:
        existingSecret: postgres-secrets
    redis:
      enabled: true

The postgres statefulset is appropriately configured with the following image:

        image: docker.io/tensorchord/pgvecto-rs:pg14-v0.1.11

Chart version: immich-0.3.0

inglemr commented 8 months ago

I ran into this as well. For what it is worth, in the meantime, I was able to resolve this by creating the missing config file "/bitnami/postgresql/data/postgresql.conf" and "/bitnami/postgresql/data/pg_hba.conf". I simply took a default config file and it allowed the database to come back online.

evanreichard commented 8 months ago

Thanks @inglemr

Inevitably I needed the following:

postgresql.conf

listen_addresses = '*'

pg_hba.conf

local    all             all                                     trust
host     all             all        10.0.0.0/8                   md5
host     all             all        127.0.0.1/32                 trust
host     all             all        ::1/128                      trust

I tried using the default in the bitnami images:

# Export default configuration (no TTY, so the redirected files keep plain line endings)
docker run --rm --entrypoint /bin/bash bitnami/postgresql:14.10.0 -c "cat /opt/bitnami/postgresql/conf/postgresql.conf" > postgresql.conf
docker run --rm --entrypoint /bin/bash bitnami/postgresql:14.10.0 -c "cat /opt/bitnami/postgresql/conf/pg_hba.conf" > pg_hba.conf

And while postgres came online, the other services couldn't connect to it. So I modified the above to include the changes I first mentioned.

If that's all you do, you'll need to also ensure a conf.d directory exists (referenced by the default postgresql.conf file):

kubectl exec -n immich immich-helmrelease-postgresql-0 -- mkdir /bitnami/postgresql/data/conf.d

And to copy the local file to the pod:

kubectl -n immich cp postgresql.conf immich-helmrelease-postgresql-0:/bitnami/postgresql/data
kubectl -n immich cp pg_hba.conf immich-helmrelease-postgresql-0:/bitnami/postgresql/data

ViktorBarzin commented 8 months ago

Same for me. Described solution also worked. However immich-server still expects TYPESENSE_API_KEY and crash loops....

/usr/src/app/node_modules/@nestjs/config/dist/config.module.js:78
                throw new Error(`Config validation error: ${error.message}`);
                ^

Error: Config validation error: "TYPESENSE_API_KEY" is required
    at ConfigModule.forRoot (/usr/src/app/node_modules/@nestjs/config/dist/config.module.js:78:23)
    at Object.<anonymous> (/usr/src/app/dist/infra/infra.module.js:49:27)
    at Module._compile (node:internal/modules/cjs/loader:1241:14)
    at Module._extensions..js (node:internal/modules/cjs/loader:1295:10)
    at Module.load (node:internal/modules/cjs/loader:1091:32)
    at Module._load (node:internal/modules/cjs/loader:938:12)
    at Module.require (node:internal/modules/cjs/loader:1115:19)
    at require (node:internal/modules/helpers:130:18)
    at Object.<anonymous> (/usr/src/app/dist/infra/index.js:19:14)
    at Module._compile (node:internal/modules/cjs/loader:1241:14)

Node.js v20.8.1

frankprimarily commented 8 months ago

This seems to be caused by using a Docker image based on postgres/postgres and injecting it into a chart dependency that expects a Bitnami image.

dbeltman commented 8 months ago

Quoting @evanreichard:

pg_hba.conf

local    all             all                                     trust
host     all             all        10.0.0.0/8                   md5
host     all             all        127.0.0.1/32                 trust
host     all             all        ::1/128                      trust

I had to modify this a little to match my pod CIDR, and I put "password" instead of "md5" since the server kept complaining about no pg_hba.conf entry with "no encryption".
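A minimal sketch of generating that pg_hba.conf for a different pod CIDR. The helper name `write_pg_hba` and the example CIDR `10.42.0.0/16` are assumptions for illustration, not something from the chart. As far as I can tell, the "no encryption" part of that PostgreSQL message describes the connection transport (no SSL or GSS), not password hashing, so matching the CIDR may be the change that matters. Note that the `password` method sends the password in clear text.

```shell
#!/bin/sh
# Hypothetical helper (not part of the chart): render a pg_hba.conf for a
# given pod CIDR and authentication method.
write_pg_hba() {
  pod_cidr="$1"     # e.g. 10.42.0.0/16; an assumption, look up your own cluster's pod CIDR
  auth_method="$2"  # md5, password, or scram-sha-256
  out_file="$3"     # where to write the rendered file
  cat > "$out_file" <<EOF
# TYPE   DATABASE        USER       ADDRESS                      METHOD
local    all             all                                     trust
host     all             all        ${pod_cidr}                  ${auth_method}
host     all             all        127.0.0.1/32                 trust
host     all             all        ::1/128                      trust
EOF
}

# Render a local file, then copy it into the pod with the kubectl cp command quoted above.
write_pg_hba "10.42.0.0/16" "md5" "./pg_hba.conf"
cat ./pg_hba.conf
```

The file is written locally only; nothing here touches the cluster.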

alexandresoro commented 8 months ago

This one caught me off guard too. This is indeed because bitnami images have a lot of init done.

The config paths are passed at postgres startup here:
https://github.com/bitnami/containers/blob/7bc9cc3e18c0c08c8019d3f6c8dd6bc0f926051e/bitnami/postgresql/16/debian-11/rootfs/opt/bitnami/scripts/postgresql/run.sh#L19

And the config is built dynamically at startup from:
https://github.com/bitnami/containers/blob/7bc9cc3e18c0c08c8019d3f6c8dd6bc0f926051e/bitnami/postgresql/16/debian-11/rootfs/opt/bitnami/scripts/libpostgresql.sh#L170

So yes, copying the content from /opt/bitnami/postgresql/conf to /bitnami/postgresql/data does the trick. If your instance is running and the volume is mounted, I guess you can simply copy-paste that.

Also, from what I see, it seems that the default UID is different: 1001 for the bitnami image, and 999 for the default postgres one. You might wish to adjust that with something like:

postgresql:
  image:
    repository: tensorchord/pgvecto-rs
    tag: pg16-v0.1.11

  primary:
    podSecurityContext:
      fsGroup: 999

    containerSecurityContext:
      runAsUser: 999

  volumePermissions:
    enabled: true # By setting this, you make sure that the existing data gets chown'd to UID 999
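
Before changing `runAsUser`, it may help to confirm which UID actually owns the data directory. The sketch below is a hypothetical helper (`check_owner` is not part of any chart or image). It relies on GNU `stat -c`, which I am assuming is available because the images discussed here are Debian-based.

```shell
#!/bin/sh
# Hypothetical helper: compare a directory's owning UID with the UID the
# container is expected to run as. Uses GNU stat (-c).
check_owner() {
  dir="$1"
  expected_uid="$2"
  actual_uid="$(stat -c '%u' "$dir")" || return 2
  if [ "$actual_uid" = "$expected_uid" ]; then
    echo "ok: $dir is owned by UID $actual_uid"
  else
    echo "mismatch: $dir is owned by UID $actual_uid, expected $expected_uid"
    return 1
  fi
}

# Local demonstration on a scratch directory owned by the current user.
demo_dir="$(mktemp -d)"
check_owner "$demo_dir" "$(id -u)"
```

Inside the pod, the equivalent check would be something like `kubectl -n immich exec immich-helmrelease-postgresql-0 -- stat -c '%u:%g' /bitnami/postgresql/data`, assuming the container stays up long enough to exec into.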

frankprimarily commented 8 months ago

As this is caused by replacing the used image, this might only be an issue for a migration to the 1.91+ immich release. Has anyone tried to deploy from scratch (empty pvc)? If this is working, we probably can close this issue - or we implement some kind of migration script.

LordGaav commented 8 months ago

Switching from a Bitnami to a normal PostgreSQL image is a really bad idea: every user that updates this Helm chart is going to run into this exact issue. This caught me off guard because there was no breaking change warning about this, only one about Immich itself, which mentions docker-compose but not Helm, and the section about 'normal' Postgres images also does not mention it.

I'm looking into what needs to be done to fix this for myself, ~because this also involves a PostgreSQL update from 11 to 14.~

Edit: in the end I rolled back the Helm chart to 0.2.0 and the immich tag to v1.90.2. As far as I can see, there is no easy way to install the pgvecto.rs extension in the Bitnami image without also running PostgreSQL as root.

SoarinFerret commented 8 months ago

quick repo I threw up with a bitnami compatible build of pgvecto.rs: https://github.com/SoarinFerret/bitnami-postgres-pgvecto-rs

This does NOT run PostgreSQL as root, so it should address your concerns @LordGaav

if you update your values.yaml to the following:

postgresql:
  image:
    registry: ghcr.io
    repository: soarinferret/bitnami-postgres-pgvecto-rs
    tag: pg14.5-v0.1.11

you can update to the latest release. Hope this helps someone. I think long term my plan will be to stop running this in k8s, so I don't intend to keep the repo updated.

Here is the dockerfile for anyone who may want to build themselves:

ARG PGVECTORS_TAG=pg14-v0.1.11-amd64
ARG BITNAMI_TAG=14.5.0-debian-11-r6
FROM scratch as nothing
FROM tensorchord/pgvecto-rs-binary:${PGVECTORS_TAG} as binary

FROM docker.io/bitnami/postgresql:${BITNAMI_TAG}
COPY --from=binary /pgvecto-rs-binary-release.deb /tmp/vectors.deb
USER root
RUN apt-get install -y /tmp/vectors.deb && rm -f /tmp/vectors.deb && \
     mv /usr/lib/postgresql/*/lib/vectors.so /opt/bitnami/postgresql/lib/ && \
     mv /usr/share/postgresql/*/extension/vectors* /opt/bitnami/postgresql/share/extension/
USER 1001
ENV POSTGRESQL_EXTRA_FLAGS="-c shared_preload_libraries=vectors.so"

LordGaav commented 8 months ago

Can confirm your image works @SoarinFerret , thanks. I only had to do CREATE EXTENSION vectors on the immich database.
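
A rough sketch of how that step could be run from outside the pod. The namespace and pod name are the ones used earlier in this thread. The database name `immich` and user `postgres` are assumptions, and depending on your pg_hba.conf you may also need to supply a password. The `run` wrapper is a hypothetical dry-run helper that prints the command by default.

```shell
#!/bin/sh
# Sketch only. NAMESPACE and POD come from the thread; DB_NAME and DB_USER are assumptions.
NAMESPACE="immich"
POD="immich-helmrelease-postgresql-0"
DB_NAME="immich"
DB_USER="postgres"

# Dry-run by default: print the command instead of executing it.
# Set DRY_RUN=0 to actually run it against the cluster.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Create the extension if it is missing, then report the installed version.
SQL="CREATE EXTENSION IF NOT EXISTS vectors; SELECT extname, extversion FROM pg_extension WHERE extname = 'vectors';"

run kubectl -n "$NAMESPACE" exec "$POD" -- psql -U "$DB_USER" -d "$DB_NAME" -c "$SQL"
```

The trailing `SELECT` is only there to show which extension version ended up installed.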

alexbarcelo commented 8 months ago

I had some long hours of frustration with strange permissions errors, version mismatches and some issues with @SoarinFerret's fix (don't get me wrong, thanks a lot for providing the image; I am not sure why it was not working for me, since it worked for other people).

My more drastic fix was to perform a backup & restore:

I hope someone benefits from this. If you already have backups in place, this whole procedure can be done in under 10 minutes. Otherwise, you may need some back&forth (maybe some rolling back and trying again). Good luck!

EDIT: DISCLAIMER: Those steps can break your installation and/or you can lose data, make sure to know what you are doing, make a backup before even trying, if you are not sure of the procedure train and dry-run it with a playground environment in order to try it in a safe manner, understand all the steps before attempting them, etc.

Nepoxx commented 8 months ago

@evanreichard How can you run the kubectl exec and kubectl cp commands from your comment above if the container is in a crash loop?

evanreichard commented 8 months ago

@Nepoxx hacky, and I'm not sure if there's a better way, but I edit the StatefulSet container definition with:

command: ["sleep"]
args: ["infinity"]

Then do my changes, then remove the edit.
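
The same edit could be sketched as a pair of `kubectl patch` calls instead of a manual edit. The StatefulSet name below is an assumption, derived from the pod name minus its `-0` ordinal. The `run` wrapper is a hypothetical dry-run helper that only prints the commands by default. This also assumes the chart does not already set its own `command` or `args` on the container, since the second patch removes both fields.

```shell
#!/bin/sh
# Sketch only: names below are taken from the thread or are assumptions.
NAMESPACE="immich"
STATEFULSET="immich-helmrelease-postgresql"   # assumed: pod name minus the "-0" ordinal

# Dry-run by default: print each command instead of executing it.
# Set DRY_RUN=0 to actually run the commands against the cluster.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# JSON Patch that overrides the first container's entrypoint with "sleep infinity".
PAUSE_PATCH='[
  {"op": "add", "path": "/spec/template/spec/containers/0/command", "value": ["sleep"]},
  {"op": "add", "path": "/spec/template/spec/containers/0/args", "value": ["infinity"]}
]'

# JSON Patch that removes the override again once the config files are in place.
RESUME_PATCH='[
  {"op": "remove", "path": "/spec/template/spec/containers/0/command"},
  {"op": "remove", "path": "/spec/template/spec/containers/0/args"}
]'

run kubectl -n "$NAMESPACE" patch statefulset "$STATEFULSET" --type=json -p "$PAUSE_PATCH"
# ... create conf.d and copy postgresql.conf and pg_hba.conf into the pod here ...
run kubectl -n "$NAMESPACE" patch statefulset "$STATEFULSET" --type=json -p "$RESUME_PATCH"
```

Two caveats, both hedged: a crash-looping pod may need to be deleted manually before it picks up the patched template, and if the chart configures a liveness probe, the sleeping container will probably be restarted once the probe has failed enough times, so the window for copying files may be short. A Helm or GitOps reconciliation may also revert the patch.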

PixelJonas commented 6 months ago

Ufff .... there are so many things to unpack here.

First of all, @all who are currently contributing to this thread: THANK YOU VERY MUCH 🙏

I finally found some time to upgrade my cluster apps and I knew that upgrading immich will be "a thing" which is why I did not do it for a while. Your comments and solutions were tremendous in helping me.

Honestly, I don't like the inclusion of the database for immich in this chart. The Bitnami sub-charts have no way of upgrading between major versions, so we inherited the breaking changes that come with that. With #61 we'd start building our own database image, which would mean we have to keep it updated and maintained, and there would still be breaking changes between major versions.

I did a LOT of manual steps to get the database for immich working, which included finding the correct pgvecto.rs image to use 🙈

The pgvecto.rs extension version is 0.2.0 instead of 0.1.11.
Please run 'DROP EXTENSION IF EXISTS vectors' and switch to 0.1.11, such as with the docker image 'tensorchord/pgvecto-rs:pg16-v0.1.11'.

This seems like there is an endorsement for an image to use as a database from the main project and I'd like us to use the same and/or have a collaborative effort to create an image on the main account of this project (rather than adding an Image to this Helm-Chart).

I'd also like us to condense the "manual migration" from the bitnami chart to this new version in a CHANGELOG and introduce it as a breaking change to this chart.

What are your thoughts about that? @bo0tzz you have any input?

bo0tzz commented 6 months ago

Sorry about the breaking change and the silence afterwards folks! Unfortunately the way we have the helm chart releases set up right now means there is very little testing, and also no easy way to communicate (breaking) changes. On top of that, I got lazy and did too little manual testing when making this breaking release.

Like Jonas mentions, I'm also not a fan of including the database in this chart. On the other hand, I can see the argument for being able to deploy Immich easily without needing to set up several (postgres, redis) external dependencies manually.

I'm not sure what the best way forward is, but if possible I would like to avoid needing to maintain a(nother) database image. The upstream pgvecto.rs image is perfectly good other than not playing nice with the bitnami postgres chart (for which I blame bitnami). Since we're running on kubernetes, there's also the option of database operators with for example https://github.com/tensorchord/cloudnative-pgvecto.rs (for cloudnative-pg) or https://github.com/chkpwd/cdpgvecto.rs (for crunchydata PGO).

Finally, the release process of this chart needs to be improved as currently it just releases on every merge to main. Ideally releases should happen through Github Releases instead, and include proper changelogs and such. Any help with that would be very welcome.

martimors commented 5 months ago

Doesn't the chart help configure postgresql with the geodata-plugin etc.? I think for that reason it's convenient to include it, although I would personally not mind running my own bitnami-postgres alongside immich.