zalando / postgres-operator

Postgres operator creates and manages PostgreSQL clusters running in Kubernetes
https://postgres-operator.readthedocs.io/
MIT License
4.34k stars 979 forks source link

Is there support to use custom LC_COLLATE and LC_CTYPE? #375

Open caleno opened 6 years ago

caleno commented 6 years ago

Hi.

Is there support to use custom LC_COLLATE and LC_CTYPE when specifying databases to be created?

It seems the docker image you provide don't support this? I guess you use the PostgreSQL Docker Community library as base for your docker image builds? E.g this: https://github.com/docker-library/postgres/blob/master/9.6/Dockerfile

I've tried to change the encoding in the manifest like this:

... patroni: initdb: encoding: "UTF8" locale: "nb_NO.UTF8" ....

I managed to build a docker image with support for nb_NO.UTF8 by using the Dockerfile referenced over and adding this:

localedef -i en_US -c -f UTF-8 -A /usr/share/locale/locale.alias en_US.UTF-8; \ localedef -i nb_NO -f UTF-8 nb_NO.UTF-8 But for what I know patroni uses it own docker registry.

So is there a solution for this?

Let me know I should post this issue in the patroni or spilo repo instead of this.

Br, Lennart

caleno commented 6 years ago

Manage to create my own spilo container with nb_NO.UTF8 locale support by editing the Docker.build file.

    && localedef -i en_US -c -f UTF-8 -A /usr/share/locale/locale.alias en_US.UTF-8 \
    && localedef -i nb_NO -f UTF-8 nb_NO.UTF-8 \

    # Add PGDG repositories
    && DISTRIB_CODENAME=$(sed -n 's/DISTRIB_CODENAME=//p' /etc/lsb-release) \
    && echo "deb http://apt.postgresql.org/pub/repos/apt/ ${DISTRIB_CODENAME}-pgdg main" > /etc/apt/sources.list.d/pgdg.list \
    && echo "deb-src http://apt.postgresql.org/pub/repos/apt/ ${DISTRIB_CODENAME}-pgdg main" >> /etc/apt/sources.list.d/pgdg.list \
    && curl -s -o - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - \

    && apt-get update \
    && apt-get install -y postgresql-common \

    # forbid creation of a main cluster when package is installed
    && sed -ri 's/#(create_main_cluster) .*$/\1 = false/' /etc/postgresql-common/createcluster.conf \

    # Clean up
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/* \
            /var/cache/debconf/* \
            /usr/share/doc \
            /usr/share/man \
            /usr/share/locale/?? \
            /usr/share/locale/??_?? \
    && find /var/log -type f -exec truncate --size 0 {} \; \
    && find /usr/share/i18n/charmaps/ -type f ! -name UTF-8.gz -delete \
    && find /usr/share/i18n/locales/ -type f ! -name en_US ! -name nb_NO -delete \
    && echo 'en_US.UTF-8 UTF-8' > /usr/share/i18n/SUPPORTED \
    && echo 'nb_NO.UTF-8 UTF-8' >> /usr/share/i18n/SUPPORTED```

Build
```./build.sh --build-arg COMPRESS=true -t uibit/spilo:latest .```

Pushed it to our registry at Docker Hub and then referencing to it in the manifest. 

However, initdb ignores my locale settings or is overruled somewhere. 

``` patroni:
    initdb:
      encoding: "UTF8"
      locale: "nb_NO.UTF8"
      data-checksums: "true"
    pg_hba:
    - hostssl all all 0.0.0.0/0 md5
    - host    all all 0.0.0.0/0 md5
    ttl: 30
    loop_wait: &loop_wait 10
    retry_timeout: 10
    maximum_lag_on_failover: 33554432
  # restore a Postgres DB with point-in-time-recovery 
  # with a non-empty timestamp, clone from an S3 bucket using the latest backup before the timestamp
  # with an empty/absent timestamp, clone from an existing alive cluster using pg_basebackup
  # clone:
  #  uid: "efd12e58-5786-11e8-b5a7-06148230260c"
  #  cluster: "acid-batman"
  #  timestamp: "2017-12-19T12:40:33+01:00" # timezone required (offset relative to UTC, see RFC 3339 section 5.6)
  maintenanceWindows:
  - 01:00-06:00 #UTC
  - Sat:00:00-04:00

I could create a database manually with no_NB.UTF8 so at least some progress. postgres=# create database test with template = template0 encoding = 'UTF8' LC_COLLATE = 'nb_NO.UTF8' LC_CTYPE = 'nb_NO.UTF8';

I'd like the option to add parameters to the cluster manifest, e.g. LC_COLLATE and LC_CTYPE


kind: postgresql

metadata:
  name: pg-test-cluster
  namespace: test
spec:
  teamId: "TEST"
  volume:
    size: 1Gi
  numberOfInstances: 1
  users: #Application/Robot users
    test:
    - login
  enableMasterLoadBalancer: true
  enableReplicaLoadBalancer: false
  allowedSourceRanges: # load balancers' source ranges for both master and replica services
  - 127.0.0.1/32
  databases:
    test: test
#Expert section
  postgresql:
    version: "9.6"
    parameters:
      shared_buffers: "32MB"
      max_connections: "10"
      log_statement: "all"
  resources:
    requests:
      cpu: 10m
      memory: 100Mi
    limits:
      cpu: 300m
      memory: 3000Mi
  patroni:
    initdb:
      encoding: "UTF8"
      locale: "nb_NO.UTF8"   <--- ignored
      data-checksums: "true"
    pg_hba:
    - hostssl all all 0.0.0.0/0 md5
    - host    all all 0.0.0.0/0 md5
    ttl: 30
    loop_wait: &loop_wait 10
    retry_timeout: 10
    maximum_lag_on_failover: 33554432
  # restore a Postgres DB with point-in-time-recovery 
  # with a non-empty timestamp, clone from an S3 bucket using the latest backup before the timestamp
  # with an empty/absent timestamp, clone from an existing alive cluster using pg_basebackup
  # clone:
  #  uid: "efd12e58-5786-11e8-b5a7-06148230260c"
  #  cluster: "acid-batman"
  #  timestamp: "2017-12-19T12:40:33+01:00" # timezone required (offset relative to UTC, see RFC 3339 section 5.6)
  maintenanceWindows:
  - 01:00-06:00 #UTC
  - Sat:00:00-04:00```

Or do I have to hardcode this into the spilo container. Might be the easiest way if there currently in no other option. 
alexeyklyukin commented 6 years ago

@caleno I think what happens here is that initdb is not ignored in the operator, and the operator generates the SPILO_CONFIGURATION environment variable with the correct value (see your statefulset definition for the cluster created by the operator) for bootstrap:initdb:locale; however, Spilo sets it by default to en_US.UTF-8 (https://github.com/zalando/spilo/blob/9b44a5d03074a343e1d3c5b35c79c2a4d85c7e2d/postgres-appliance/scripts/configure_spilo.py#L175), and when merging SPILO_CONFIGURATION with the default one, it appends the default initdb options to the list of custom ones, obtained from the SPILO_CONFIGURATION (initdb options is a list, since some of those doesn't have any values, i.e. --data-checksums). Patroni happily processes this list and puts to --locale options, and the last one (with en_US.UTF-8) wins.

So there are a few things we could do better here:

Jan-M commented 6 years ago

So cool to ask for Norwegian locale :) 🇳🇴

caleno commented 6 years ago

Thanks for your reply @alexeyklyukin. I'll get back to this case later on when I got some more time on my hands. But one thing that would be great regarding any locale support would be something like this.

    test: test
      locale: 
        lc_collate: "nb_NO.UTF8"
        lc_ctype: "nb_NO.UTF8"
#Expert section
  postgresql:

I don't want to make a big issue out of this, see it as a feature, nice to have, request ;)

alexeyklyukin commented 6 years ago

@caleno I think it would be more clear to continue passing those in the intidb section of Patroni configuration, so that we don't invent an additional translation layer between the operator options and Patroni ones. Everything is actually in place but the Spilo support for merging user-specified initdb options in a sane way, but it should be rather easy to fix.

kimxogus commented 5 years ago

Is there any update about this issue? I really hope to customize LC_COLLATE in postgres-operator.

paalkr commented 4 years ago

@alexeyklyukin , any news regarding this issue? I'm actually also looking for a way to use locale: "nb_NO.UTF-8"

immanuelfodor commented 4 years ago

It turns out Matrix needs C.C to even start up: https://github.com/matrix-org/synapse/blob/master/docs/postgres.md#set-up-database

Synapse will refuse to set up a new database if it has the wrong values of COLLATE and CTYPE set, and will log warnings on existing databases.

LC_COLLATE='C'
LC_CTYPE='C'

This is such a great operator for PostgreSQL, I'm so sad that I can't just do in my cluster config:

   patroni:
     initdb:
       encoding: C
       locale: C
erhan- commented 2 years ago

Same setting as @immanuelfodor . Unfortunately stuck with the synapse deployment for matrix.

immanuelfodor commented 2 years ago

I ended up removing the operator and went for an external PG deployment outside of the cluster 😭

erhan- commented 2 years ago

I dropped the database but was not able to create a new one via following command to have the right values:

createdb --encoding=UTF8 --locale=C --template=template0 --owner=matrix matrix;

\l does not about anything and the database is gone.

edit: createdb is not a psql command. You have to do it like this:

your-cluster is the name of your postgres cluster.

kubectl port-forward your-cluster-0 6432:5432

and then in a new terminal:

export PGPASSWORD=$(kubectl get secret postgres.your-cluster.credentials.postgresql.acid.zalan.do -o 'jsonpath={.data.password}' | base64 -d)
export PGSSLMODE=require

psql -U matrix -d postgres -h localhost -p 6432

and then

drop database matrixdb;
create database matrixdb with template=template0 owner=matrix encoding=UTF8 locale='C';

Works so far.

hbjydev commented 2 years ago

Bumping this, it's been open 2 years... Can anyone spare time to look into this?

TheEagleByte commented 2 years ago

Exec into the master pod of your cluster in k8s and run the following:

psql
DROP DATABASE "synapse-db";
CREATE DATABASE "synapse-db" with template=template0 owner="synapse-db" encoding=UTF8 locale='C';
GRANT ALL PRIVILEGES ON DATABASE "synapse-db" to "synapse-db";

Obviously, replace the user/role/db name with the appropriate information. You can get those values by running \l to get the db name and \du to get the user/role name.

SNThrailkill commented 1 year ago

This is a common requirement across many services. Can we please get somebody to look into this?

haslersn commented 1 year ago

Does the following not work? That's how we set up our PostgreSQL for Synapse 23 months ago.

  patroni:
    initdb:
      lc-collate: C
      lc-ctype: C
      encoding: UTF8
fraanek commented 1 year ago

To use custom locale you must build a custom spilo image. There is a environment variable ADDITIONAL_LOCALES in Dockerfile.

For example, when you build image like this:

cd postgres-appliance
docker build  --tag $YOUR_TAG --build-arg ADDITIONAL_LOCALES=pl_PL .

Then you can use this locale in postgresql manifest:

patroni:
  initdb:
    encoding: "UTF8"
    locale: "pl_PL.UTF-8"

You can add to image more than one additional language.

benedikt-bartscher commented 1 year ago

any news on this?

apristup commented 5 months ago

add environment LC_ALL= pl_PL.utf8 all work