goharbor / harbor-helm

The helm chart to deploy Harbor
Apache License 2.0

harbor-database pod is failing to start with bus error #1789

Open JagatguruMishra opened 1 month ago

JagatguruMishra commented 1 month ago

I am installing Harbor with Helm chart version 1.12.0 (https://github.com/goharbor/harbor-helm/archive/refs/tags/v1.12.0.tar.gz). The pod 'harbor-database-0' goes into CrashLoopBackOff, and other pods that depend on it end up in the same state. The logs are below.

$ kubectl logs harbor-database-0 -n harbor
Defaulted container "database" out of: database, data-migrator (init), data-permissions-ensurer (init)
init DB, DB version:13
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locales
  COLLATE:  en_US.UTF-8
  CTYPE:    en_US.UTF-8
  MESSAGES: C
  MONETARY: C
  NUMERIC:  C
  TIME:     C
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory /var/lib/postgresql/data/pgdata/pg13 ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
sh: line 1:     9 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=100 -c shared_buffers=1000 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    11 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=50 -c shared_buffers=500 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    13 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=40 -c shared_buffers=400 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    23 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=30 -c shared_buffers=300 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
selecting default max_connections ... 20
sh: line 1:    25 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=20 -c shared_buffers=200 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    27 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=20 -c shared_buffers=16384 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    29 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=20 -c shared_buffers=8192 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    39 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=20 -c shared_buffers=4096 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    41 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=20 -c shared_buffers=3584 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    43 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=20 -c shared_buffers=3072 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    45 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=20 -c shared_buffers=2560 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    47 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=20 -c shared_buffers=2048 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    49 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=20 -c shared_buffers=1536 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    51 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=20 -c shared_buffers=1000 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    53 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=20 -c shared_buffers=900 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    55 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=20 -c shared_buffers=800 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    57 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=20 -c shared_buffers=700 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    59 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=20 -c shared_buffers=600 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    61 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=20 -c shared_buffers=500 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    63 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=20 -c shared_buffers=400 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    65 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=20 -c shared_buffers=300 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    67 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=20 -c shared_buffers=200 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    77 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=20 -c shared_buffers=100 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
sh: line 1:    79 Bus error               (core dumped) "/usr/pgsql/13/bin/postgres" --boot -x0 -F -c max_connections=20 -c shared_buffers=50 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1
selecting default shared_buffers ... 400kB
selecting default time zone ... UTC
creating configuration files ... ok
child process was terminated by signal 7: Bus error
initdb: removing data directory "/var/lib/postgresql/data/pgdata/pg13"
running bootstrap script ... 

The issue is related to huge pages. If I give the pod a hugepages resource such as 'hugepages-1Gi: 1', or disable huge pages on the node, it works.
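
To confirm that a node actually has huge pages reserved, the HugePages_Total field of /proc/meminfo can be read. This is only a minimal sketch: the function name hugepages_total is hypothetical, and it accepts an optional file path so it can also be tried against a saved copy of meminfo.

```shell
# Print the number of huge pages reserved, as reported by the kernel.
# Reads /proc/meminfo by default; pass another path to inspect a saved copy.
hugepages_total() {
    awk '/^HugePages_Total:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Example: run directly on the Kubernetes node.
if [ -r /proc/meminfo ]; then
    echo "HugePages_Total: $(hugepages_total)"
fi
```

A non-zero value on the node, combined with a pod that requests no hugepages resource, matches the situation described above.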

Is there any way to configure the harbor-db pod/statefulset to disable huge pages (i.e. set huge_pages = off in the .conf file)?

zyyw commented 1 month ago

There are more details on the solution/workaround for this huge_pages issue here:

The huge_pages = off setting might need to be applied to this file with a code change similar to the one below:

sed -i -r 's/#huge_pages.*?/huge_pages = off/g' /usr/pgsql/15/share/postgresql/postgresql.conf.sample
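
The substitution can be tried on a scratch file before touching a real image. The sketch below is an illustration under assumptions: the three sample lines are made up to resemble postgresql.conf.sample, the pattern is anchored to the start of the line, and the result is written to a separate file instead of being edited in place.

```shell
# Build a small stand-in for postgresql.conf.sample (contents are illustrative).
sample=$(mktemp)
patched=$(mktemp)
printf '%s\n' \
    '#shared_buffers = 32MB' \
    '#huge_pages = try   # on, off, or try' \
    '#temp_buffers = 8MB' > "$sample"

# Replace the commented-out huge_pages line with an explicit "off" setting.
sed 's/^#huge_pages.*/huge_pages = off/' "$sample" > "$patched"

# Show the resulting setting with its line number.
grep -n 'huge_pages' "$patched"   # prints: 2:huge_pages = off
```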

In that case, huge_pages would be turned off for all users, even those who want huge_pages set to try or on. Setting huge_pages = off may hurt performance, as stated here:

The use of huge pages results in smaller page tables and less CPU time spent on memory management, increasing performance.

JagatguruMishra commented 1 month ago

Thanks @zyyw. I am using the image goharbor/harbor-db:v2.8.0, so the correct file is /usr/pgsql/13/share/postgresql/postgresql.conf.sample. I would prefer not to build a custom image because of maintenance concerns, and I can't disable huge pages on the host system or use the available huge pages. I am looking for a way to disable huge pages by modifying the statefulset, maybe by adding an env value, running some commands in an init container, or some other configuration.

gera-aldama commented 2 weeks ago

Hello @JagatguruMishra, I'm currently facing the same issue and would like to avoid both modifying the image and disabling huge pages on the host system. Were you able to find a suitable solution? Thanks.

JagatguruMishra commented 1 week ago

Hi @gera-aldama, I was able to apply the workaround below.

  1. Create a ConfigMap containing all the default values of postgresql.conf.sample, taken from /usr/pgsql/13/share/postgresql in a container started from the image goharbor/harbor-db:v2.8.0, with 'huge_pages = off' set.
  2. Mount it at /usr/pgsql/13/share/postgresql/postgresql.conf.sample in the statefulset in templates/database/database-ss.yaml, so that postgres initdb disables huge pages.
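
The two steps above could be sketched roughly as follows. This is not the exact manifest used in the thread: the function names, the ConfigMap name harbor-db-pg-sample, and the volume name pg-sample-conf are all hypothetical, and the use of subPath to mount a single file is an assumption about how the mount was done. The first function expects a local copy of postgresql.conf.sample that was taken from the image beforehand.

```shell
# Step 1: render a ConfigMap manifest carrying a patched postgresql.conf.sample.
# $1: local copy of postgresql.conf.sample; $2: ConfigMap name (hypothetical default).
render_pg_sample_configmap() {
    conf_file=$1
    cm_name=${2:-harbor-db-pg-sample}
    printf 'apiVersion: v1\nkind: ConfigMap\nmetadata:\n  name: %s\n' "$cm_name"
    printf 'data:\n  postgresql.conf.sample: |\n'
    # Force huge_pages off, then indent every line for the YAML block scalar.
    sed -e 's/^#\{0,1\}huge_pages.*/huge_pages = off/' -e 's/^/    /' "$conf_file"
}

# Step 2: print the fragment to merge by hand into
# templates/database/database-ss.yaml (names are hypothetical).
print_mount_fragment() {
    cat <<'EOF'
# under the "database" container:
volumeMounts:
  - name: pg-sample-conf
    mountPath: /usr/pgsql/13/share/postgresql/postgresql.conf.sample
    subPath: postgresql.conf.sample
# under the pod spec:
volumes:
  - name: pg-sample-conf
    configMap:
      name: harbor-db-pg-sample
EOF
}

# Possible usage, assuming kubectl access to the cluster:
#   render_pg_sample_configmap ./postgresql.conf.sample | kubectl apply -n harbor -f -
#   print_mount_fragment
```

Because the whole sample file is copied into the ConfigMap, it would need to be regenerated whenever the harbor-db image moves to a different PostgreSQL version or path.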