vitabaks / postgresql_cluster

PostgreSQL High-Availability Cluster (based on "Patroni" and DCS "etcd" or "consul"). Automating with Ansible.
MIT License

Postgres Pro cannot be started by Patroni #602

Closed iabdukhoshimov closed 1 month ago

iabdukhoshimov commented 1 month ago

patroni[28624]: 2024-03-20 16:52:03 +05 [28624-2] HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory, swap space, or huge pages. To reduce the request size (currently 123243331584 bytes), reduce PostgreSQL's shared memory usage, perhaps by reducing shared_buffers or max_connections.
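For reference, the requested segment works out to roughly 114 GiB, which matches the 112 GiB shared_buffers in the configuration below plus PostgreSQL's other shared memory (a rough conversion using the value from the log line above):

echo $((123243331584 / 1024 / 1024 / 1024))   # prints 114 (GiB requested by PostgreSQL)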

iabdukhoshimov commented 1 month ago
postgresql_parameters:
  - { option: "max_connections", value: "100" }
  - { option: "superuser_reserved_connections", value: "5" }
  - { option: "password_encryption", value: "{{ postgresql_password_encryption_algorithm }}" }
  - { option: "max_locks_per_transaction", value: "512" }
  - { option: "max_prepared_transactions", value: "0" }
  - { option: "huge_pages", value: "on" }  # or "on" if you set "vm_nr_hugepages" in kernel parameters
  - { option: "shared_buffers", value: "112GB" }  # by default, 25% of RAM
  - { option: "effective_cache_size", value: "{{ (ansible_memtotal_mb * 0.75) | int }}MB" }  # by default, 75% of RAM
  - { option: "work_mem", value: "256MB" }  # please change this value
  - { option: "maintenance_work_mem", value: "2GB" }  # please change this value
  - { option: "checkpoint_timeout", value: "15min" }
  - { option: "checkpoint_completion_target", value: "0.9" }
  - { option: "min_wal_size", value: "2GB" }
  - { option: "max_wal_size", value: "32GB" }  # or 16GB/32GB
  - { option: "wal_buffers", value: "32MB" }
  - { option: "default_statistics_target", value: "1000" }
  - { option: "seq_page_cost", value: "1" }
  - { option: "random_page_cost", value: "1.1" }  # or "4" for HDDs with slower random access
  - { option: "effective_io_concurrency", value: "200" }  # or "2" for traditional HDDs with lower I/O parallelism
  - { option: "synchronous_commit", value: "on" }  # or 'off' if you can you lose single transactions in case of a crash
  - { option: "autovacuum", value: "on" }  # never turn off the autovacuum!
  - { option: "autovacuum_max_workers", value: "5" }
  - { option: "autovacuum_vacuum_scale_factor", value: "0.01" }  # or 0.005/0.001
  - { option: "autovacuum_analyze_scale_factor", value: "0.01" }
  - { option: "autovacuum_vacuum_cost_limit", value: "500" }  # or 1000/5000
  - { option: "autovacuum_vacuum_cost_delay", value: "2" }
  - { option: "autovacuum_naptime", value: "1s" }
  - { option: "max_files_per_process", value: "4096" }
  - { option: "archive_mode", value: "on" }
  - { option: "archive_timeout", value: "1800s" }
  - { option: "archive_command", value: "cd ." } 
vitabaks commented 1 month ago

@iabdukhoshimov Hi!

Make sure "vm.nr_hugepages" provides enough huge pages for 112GB of shared_buffers, since you have huge_pages enabled.

Check:

SHARED_BUFFERS_SIZE_GB=112

HUGE_PAGE_SIZE_KB=$(awk '/Hugepagesize/ {print $2}' /proc/meminfo)
HUGE_PAGES_TOTAL=$(awk '/HugePages_Total/ {print $2}' /proc/meminfo)
HUGE_PAGES_GiB=$(awk -v size="$HUGE_PAGE_SIZE_KB" -v total="$HUGE_PAGES_TOTAL" 'BEGIN {print (size * total) / 1024 / 1024}')

ADDITIONAL_HUGE_PAGES=$((1024 * 1024 / HUGE_PAGE_SIZE_KB)) # 1GiB in terms of HugePages
REQUIRED_HUGE_PAGES=$(( (SHARED_BUFFERS_SIZE_GB * 1024 * 1024 / HUGE_PAGE_SIZE_KB) + ADDITIONAL_HUGE_PAGES ))

echo ""
echo "Current HugePages size in GiB: ${HUGE_PAGES_GiB}"
echo "Required \"vm.nr_hugepages\" option value (for shared_buffers ${SHARED_BUFFERS_SIZE_GB} GiB with additional 1 GiB): ${REQUIRED_HUGE_PAGES}"
echo ""

Allocate slightly more huge pages (+1 GiB) via "vm.nr_hugepages" than the size of shared_buffers.
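For example, with the default 2 MiB huge page size this works out to (112 * 1024 * 1024 + 1024 * 1024) / 2048 = 57856 pages; the exact number for your system comes from the script above (the sysctl.d file name below is arbitrary):

sudo sysctl -w vm.nr_hugepages=57856                                        # apply immediately
echo "vm.nr_hugepages = 57856" | sudo tee /etc/sysctl.d/30-hugepages.conf   # persist across reboots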

iabdukhoshimov commented 1 month ago

Thanks, it worked.