NOTE - This guide was used to migrate a working non-partitioned Zabbix instance to a new instance with partitioned tables ready to ingest the data into the proper partitions. A section was also added covering how to set up Zabbix PostgreSQL partitioning on a fresh install.
history - table that stores all numeric (float) values
history_uint - table that stores all integer values
history_log - table that stores all log values
history_text - table that stores all text values
history_str - table that stores all string values
trends - table that stores all numeric (float) values
trends_uint - table that stores all numeric (unsigned integer) values

History and trends are the two ways of storing collected data in Zabbix.
Whereas history keeps each collected value, trends keep averaged information on hourly basis and therefore are less resource-hungry.
The strong general advice is to keep history for the smallest possible number of days so as not to overload the database with lots of historical values.
Instead of keeping a long history, you can keep trends for much longer. For example, you could keep history for 14 days and trends for 5 years.
While keeping shorter history, you will still be able to review older data in graphs, as graphs will use trend values for displaying older data.
Trends is a built-in historical data reduction mechanism which stores the minimum, maximum, average and the total number of values for every hour for numeric data types.
Trends can usually be kept for much longer than history. Any older data will be removed by the housekeeper.
When the server flushes the trend cache and there are already trends in the database for this hour (for example, the server was restarted mid-hour), it must use UPDATE statements instead of simple INSERTs. Therefore, on larger installations, if a restart is needed it is desirable to stop the server at the end of one hour and start it at the beginning of the next, to avoid trend data overlap.
Zabbix updates trends immediately after receipt of new value. Therefore, all information stored in trends is always valid and up-to-date (updated in realtime).
Zabbix generates all graphs from detailed history if period is less than 24 hours, and the trends are used for graphs having period longer than 24 hours.
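As a sketch of what a trends row contains, the hourly reduction described above can be illustrated in plain Python (made-up sample values; Zabbix performs this aggregation internally):

```python
from collections import defaultdict

def hourly_trends(history):
    """Reduce (clock, value) history samples to per-hour trend rows:
    num, value_min, value_avg, value_max -- keyed by hour start (epoch)."""
    buckets = defaultdict(list)
    for clock, value in history:
        buckets[clock - clock % 3600].append(value)  # truncate clock to hour start
    return {
        hour: {
            "num": len(vals),
            "value_min": min(vals),
            "value_avg": sum(vals) / len(vals),
            "value_max": max(vals),
        }
        for hour, vals in buckets.items()
    }

# Three samples falling in the same hour (epoch 3600..7199)
samples = [(3600, 1.0), (5400, 3.0), (7199, 2.0)]
print(hourly_trends(samples)[3600])
# → {'num': 3, 'value_min': 1.0, 'value_avg': 2.0, 'value_max': 3.0}
```

This is why trends are so much cheaper to retain: one row per item per hour, regardless of how many raw samples arrived.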
Before performing partitioning in Zabbix, several aspects must be considered. One step is to silently discard inserts into the housekeeper table, since retention on the partitioned tables is handled by dropping partitions rather than by the housekeeper. Connect as the zabbix user to the zabbix database ($ sudo -u zabbix psql zabbix) and create the following function and trigger (the function must exist before the trigger that references it):
CREATE OR REPLACE FUNCTION housekeeper_blackhole()
RETURNS trigger AS
$func$
BEGIN
RETURN NULL;
END
$func$ LANGUAGE plpgsql;

CREATE TRIGGER housekeeper_blackhole
BEFORE INSERT ON housekeeper
FOR EACH ROW
EXECUTE PROCEDURE housekeeper_blackhole();
Native partitioning syntax was introduced in PostgreSQL 10. It is very effective for INSERTs and large/slow SELECT
queries, which makes it suitable for time series logging.
PostgreSQL 11 makes it possible to create a "default" partition, which stores rows that do not fall into any existing partition's range. This is ideal, since data outside the defined ranges is still captured by the default partition. pg_partman sets this up automatically.
From there one can delete all data from a given partition, for example:
DELETE FROM public.history_p2018_11
PostgreSQL 9.5 introduced a new type of index, BRIN (Block Range INdex). These indexes work best when the data on disk is sorted. BRIN stores only min/max values for a range of blocks on disk, which keeps the index small but raises the cost of every lookup against it: each hit must read a whole range of pages from disk, so each hit is more expensive.
Huge tables benefit from BRIN indexes. Adding a BRIN index is fast and easy and works well for the use case of time series data logging, though less well under intensive updates. INSERTs into BRIN indexes are specifically designed not to slow down as the table gets bigger, so they perform much better than btree indexes for this workload.
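As a conceptual sketch (an illustration of the idea, not PostgreSQL internals): a BRIN-like structure keeps only the min/max per block range, and a lookup must scan every range whose min/max could contain the key. Cheap to store and maintain, coarser to search:

```python
def build_brin(values, rows_per_range=4):
    """Summarize a column (as laid out on disk) into (min, max) per range.
    rows_per_range stands in for BRIN's pages_per_range in this sketch."""
    return [
        (min(chunk), max(chunk))
        for chunk in (values[i:i + rows_per_range]
                      for i in range(0, len(values), rows_per_range))
    ]

def candidate_ranges(brin, key):
    """Ranges that must actually be read from disk for an equality lookup."""
    return [i for i, (lo, hi) in enumerate(brin) if lo <= key <= hi]

clocks = list(range(100, 132, 2))          # well-ordered time-series data
summary = build_brin(clocks, rows_per_range=4)
print(summary)                             # → [(100, 106), (108, 114), (116, 122), (124, 130)]
print(candidate_ranges(summary, 111))      # → [1]  (only one range to scan)
```

With unsorted data the (min, max) windows overlap, many ranges match every key, and the index degrades toward a full scan, which is why BRIN suits append-only, clock-ordered tables like Zabbix history.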
In PostgreSQL v11, partitioning offers automatic index creation: you simply create an index on the parent table, and Postgres automatically creates indexes on all child tables. This makes partition maintenance much easier!
Check postgresql.conf to ensure enable_partition_pruning = on (the default is on). This setting enables or disables the query planner's ability to eliminate a partitioned table's partitions from query plans, thus improving performance.
Carry over the pg_hba.conf settings from the old database to the new database and tune the database appropriately. You can use the following tools:
Install PostgreSQL using the following commands on a Debian OS distro.
$ wget -q https://www.postgresql.org/media/keys/ACCC4CF8.asc -O - | sudo apt-key add -
$ sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -cs)-pgdg main" >> /etc/apt/sources.list.d/pgdg.list'
$ sudo apt update
$ sudo apt install postgresql-11
Create a database user with permissions to create database objects. Ensure the created user (zabbix
) is the owner of the zabbix
database (-O zabbix
).
$ sudo -u postgres createuser --pwprompt zabbix
$ sudo -u postgres createdb -O zabbix zabbix
Do not import the initial Zabbix schema at this point, as it would conflict with the partitioned tables created below.
On an empty zabbix database (you can create multiple databases on the same server if you'd like, or if you want to upgrade from version 9.x to 10/11), create the following tables for history* and trends*.
First connect to the zabbix database as the zabbix user with the following command:
$ sudo -u zabbix psql zabbix
Then run the following SQL commands
-- history
CREATE TABLE public.history
(
itemid bigint NOT NULL,
clock integer NOT NULL DEFAULT 0,
value numeric(16,4) NOT NULL DEFAULT 0.0000,
ns integer NOT NULL DEFAULT 0
) PARTITION BY RANGE (clock);
CREATE INDEX history_1 ON public.history USING btree (itemid, clock);
-- history_log
CREATE TABLE public.history_log
(
itemid bigint NOT NULL,
clock integer NOT NULL DEFAULT 0,
"timestamp" integer NOT NULL DEFAULT 0,
source character varying(64) COLLATE pg_catalog."default" NOT NULL DEFAULT ''::character varying,
severity integer NOT NULL DEFAULT 0,
value text COLLATE pg_catalog."default" NOT NULL DEFAULT ''::text,
logeventid integer NOT NULL DEFAULT 0,
ns integer NOT NULL DEFAULT 0
) PARTITION BY RANGE (clock);
CREATE INDEX history_log_1 ON public.history_log USING btree (itemid, clock);
-- history_str
CREATE TABLE public.history_str
(
itemid bigint NOT NULL,
clock integer NOT NULL DEFAULT 0,
value character varying(255) COLLATE pg_catalog."default" NOT NULL DEFAULT ''::character varying,
ns integer NOT NULL DEFAULT 0
) PARTITION BY RANGE (clock);
CREATE INDEX history_str_1 ON public.history_str USING btree (itemid, clock);
-- history_text
CREATE TABLE public.history_text
(
itemid bigint NOT NULL,
clock integer NOT NULL DEFAULT 0,
value text COLLATE pg_catalog."default" NOT NULL DEFAULT ''::text,
ns integer NOT NULL DEFAULT 0
) PARTITION BY RANGE (clock);
CREATE INDEX history_text_1 ON public.history_text USING btree (itemid, clock);
-- history_uint
CREATE TABLE public.history_uint
(
itemid bigint NOT NULL,
clock integer NOT NULL DEFAULT 0,
value numeric(20,0) NOT NULL DEFAULT (0)::numeric,
ns integer NOT NULL DEFAULT 0
) PARTITION BY RANGE (clock);
CREATE INDEX history_uint_1 ON public.history_uint USING btree (itemid, clock);
-- trends
CREATE TABLE public.trends
(
itemid bigint NOT NULL,
clock integer NOT NULL DEFAULT 0,
num integer NOT NULL DEFAULT 0,
value_min numeric(16,4) NOT NULL DEFAULT 0.0000,
value_avg numeric(16,4) NOT NULL DEFAULT 0.0000,
value_max numeric(16,4) NOT NULL DEFAULT 0.0000,
CONSTRAINT trends_pkey PRIMARY KEY (itemid, clock)
) PARTITION BY RANGE (clock);
-- trends_uint
CREATE TABLE public.trends_uint
(
itemid bigint NOT NULL,
clock integer NOT NULL DEFAULT 0,
num integer NOT NULL DEFAULT 0,
value_min numeric(20,0) NOT NULL DEFAULT (0)::numeric,
value_avg numeric(20,0) NOT NULL DEFAULT (0)::numeric,
value_max numeric(20,0) NOT NULL DEFAULT (0)::numeric,
CONSTRAINT trends_uint_pkey PRIMARY KEY (itemid, clock)
) PARTITION BY RANGE (clock);
pg_partman is an extension to create and manage both time-based and serial-based table partition sets. Native partitioning in PostgreSQL 10 is supported as of pg_partman v3.0.1, and PostgreSQL 11 as of pg_partman v4.0.0.
pg_partman works as an extension and can be installed directly on top of PostgreSQL.
Debian apt:
$ sudo apt install postgresql-11-partman
Create the following configuration file in the proper postgresql directory:
$ vim /etc/postgresql/11/main/conf.d/pgpartman.conf
Add the following information there:
### General
shared_preload_libraries = 'pg_partman_bgw' # (this change requires restart)
### Partitioning & pg_partman settings
enable_partition_pruning = on
pg_partman_bgw.interval = 3600
pg_partman_bgw.role = 'zabbix'
pg_partman_bgw.dbname = 'zabbix'
pg_partman_bgw.analyze = off
pg_partman_bgw.jobmon = on
Restart postgresql (sudo systemctl restart postgresql.service
) and in the logs you should see:
pg_partman master background worker master process initialized with role zabbix
Connect to the zabbix database and create the pg_partman extension in a dedicated partman schema. The following commands should be run by a superuser of the database cluster (postgres is used in this example, suitable for Debian/Ubuntu).
$ sudo -u postgres psql zabbix
Then create the SCHEMA and EXTENSION:
CREATE SCHEMA partman;
CREATE EXTENSION pg_partman schema partman;
Ensure the zabbix database user can use the partman schema and execute all functions and procedures in it, with the following SQL commands.
GRANT ALL ON SCHEMA partman TO zabbix;
GRANT ALL ON ALL TABLES IN SCHEMA partman TO zabbix;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA partman TO zabbix;
GRANT EXECUTE ON ALL PROCEDURES IN SCHEMA partman TO zabbix;
Run the pg_partman setup queries on the zabbix database as the zabbix user (the pg_partman functions are only available in the database where the extension was created):
First connect to the zabbix
database as the zabbix
user:
$ sudo -u zabbix psql zabbix
Then run the following SQL commands:
SELECT partman.create_parent('public.history', 'clock', 'native', 'daily', null, 7, 'on', null, true, 'seconds');
SELECT partman.create_parent('public.history_uint', 'clock', 'native', 'daily', null, 7, 'on', null, true, 'seconds');
SELECT partman.create_parent('public.history_str', 'clock', 'native', 'daily', null, 7, 'on', null, true, 'seconds');
SELECT partman.create_parent('public.history_text', 'clock', 'native', 'daily', null, 7, 'on', null, true, 'seconds');
SELECT partman.create_parent('public.history_log', 'clock', 'native', 'daily', null, 7, 'on', null, true, 'seconds');
SELECT partman.create_parent('public.trends', 'clock', 'native', 'monthly', null, 12, 'on', null, true, 'seconds');
SELECT partman.create_parent('public.trends_uint', 'clock', 'native', 'monthly', null, 12, 'on', null, true, 'seconds');
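With the 'daily' interval, each collected value's clock routes to one child partition per UTC day; the p2019_08_24-style suffix seen in the table listing below is pg_partman's default child naming. A rough sketch of that clock-to-partition mapping (illustrative only; pg_partman and Postgres do this for real):

```python
from datetime import datetime, timezone

def daily_partition_name(parent, clock):
    """Map an epoch clock value to its daily child partition name,
    following pg_partman's default suffix format (e.g. history_p2019_08_24)."""
    day = datetime.fromtimestamp(clock, tz=timezone.utc)
    return f"{parent}_p{day:%Y_%m_%d}"

print(daily_partition_name("history", 1566604800))          # 2019-08-24 00:00:00 UTC
# → history_p2019_08_24
print(daily_partition_name("history", 1566604800 + 43200))  # same UTC day, 12:00
# → history_p2019_08_24
```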
You should verify that the tables have ownership zabbix
:
\d
Example output:
List of relations
Schema | Name | Type | Owner
--------+----------------------------+----------+--------
public | history | table | zabbix
public | history_default | table | zabbix
public | history_log | table | zabbix
public | history_log_default | table | zabbix
public | history_log_p2019_08_24 | table | zabbix
public | history_log_p2019_08_25 | table | zabbix
public | history_log_p2019_08_26 | table | zabbix
public | history_log_p2019_08_27 | table | zabbix
public | history_log_p2019_08_28 | table | zabbix
public | history_log_p2019_08_29 | table | zabbix
public | history_log_p2019_08_30 | table | zabbix
public | history_log_p2019_08_31 | table | zabbix
public | history_log_p2019_09_01 | table | zabbix
public | history_log_p2019_09_02 | table | zabbix
public | history_log_p2019_09_03 | table | zabbix
public | history_log_p2019_09_04 | table | zabbix
public | history_log_p2019_09_05 | table | zabbix
public | history_log_p2019_09_06 | table | zabbix
public | history_log_p2019_09_07 | table | zabbix
public | history_p2019_08_24 | table | zabbix
public | history_p2019_08_25 | table | zabbix
public | history_p2019_08_26 | table | zabbix
public | history_p2019_08_27 | table | zabbix
public | history_p2019_08_28 | table | zabbix
public | history_p2019_08_29 | table | zabbix
public | history_p2019_08_30 | table | zabbix
public | history_p2019_08_31 | table | zabbix
public | history_p2019_09_01 | table | zabbix
public | history_p2019_09_02 | table | zabbix
public | history_p2019_09_03 | table | zabbix
public | history_p2019_09_04 | table | zabbix
public | history_p2019_09_05 | table | zabbix
public | history_p2019_09_06 | table | zabbix
public | history_p2019_09_07 | table | zabbix
public | history_str | table | zabbix
public | history_str_default | table | zabbix
public | history_str_p2019_08_24 | table | zabbix
public | history_str_p2019_08_25 | table | zabbix
public | history_str_p2019_08_26 | table | zabbix
public | history_str_p2019_08_27 | table | zabbix
public | history_str_p2019_08_28 | table | zabbix
public | history_str_p2019_08_29 | table | zabbix
public | history_str_p2019_08_30 | table | zabbix
public | history_str_p2019_08_31 | table | zabbix
public | history_str_p2019_09_01 | table | zabbix
public | history_str_p2019_09_02 | table | zabbix
public | history_str_p2019_09_03 | table | zabbix
public | history_str_p2019_09_04 | table | zabbix
public | history_str_p2019_09_05 | table | zabbix
public | history_str_p2019_09_06 | table | zabbix
public | history_str_p2019_09_07 | table | zabbix
....
....
....
Do not remove partitions manually. Instead, UPDATE the partman.part_config table so that pg_partman's retention handles it.
Connect via zabbix
user to the zabbix
database:
$ sudo -u zabbix psql zabbix
Update the partman.part_config
table:
UPDATE partman.part_config set retention = '30 day', retention_keep_table = false, retention_keep_index = false WHERE parent_table = 'public.history';
Then execute maintenance procedure:
SELECT partman.run_maintenance('public.history');
pg_partman has a function run_maintenance
that allows one to automate the table maintenance.
SELECT partman.run_maintenance(p_analyze := false);
-- note: disabling analyze is recommended for native partitioning due to aggressive locks
Native partitioning can result in heavy locking, so it is recommended to set p_analyze to false, which effectively disables the analyze step.
To configure pg_partman to drop old partitions, update the partman.part_config
tables connected as the zabbix
user:
$ sudo -u zabbix psql zabbix
Then run the following commands:
UPDATE partman.part_config SET retention_keep_table = false, retention = '7 day'
WHERE parent_table = 'public.history';
UPDATE partman.part_config SET retention_keep_table = false, retention = '7 day'
WHERE parent_table = 'public.history_uint';
UPDATE partman.part_config SET retention_keep_table = false, retention = '7 day'
WHERE parent_table = 'public.history_str';
UPDATE partman.part_config SET retention_keep_table = false, retention = '7 day'
WHERE parent_table = 'public.history_text';
UPDATE partman.part_config SET retention_keep_table = false, retention = '7 day'
WHERE parent_table = 'public.history_log';
UPDATE partman.part_config SET retention_keep_table = false, retention = '12 month'
WHERE parent_table = 'public.trends';
UPDATE partman.part_config SET retention_keep_table = false, retention = '12 month'
WHERE parent_table = 'public.trends_uint';
Following this change, run maintenance immediately via SQL:
SELECT partman.run_maintenance('public.history');
SELECT partman.run_maintenance('public.history_uint');
SELECT partman.run_maintenance('public.history_str');
SELECT partman.run_maintenance('public.history_text');
SELECT partman.run_maintenance('public.history_log');
SELECT partman.run_maintenance('public.trends');
SELECT partman.run_maintenance('public.trends_uint');
To use partitioning on a fresh install, as an example:
$ cd /tmp
$ wget https://repo.zabbix.com/zabbix/4.0/debian/pool/main/z/zabbix-release/zabbix-release_4.0-3+stretch_all.deb
$ sudo dpkg -i zabbix-release_4.0-3+stretch_all.deb
$ sudo apt update
$ sudo apt-get install --no-install-recommends zabbix-server-pgsql
# Now import the schema. You will get errors stating some tables and indexes already exist but ignore those as they were purposefully set on the partitioning portion.
$ zcat /usr/share/doc/zabbix-server-pgsql*/create.sql.gz | sudo -u zabbix psql zabbix
Run the following psql command in the zabbix
DB as the zabbix
user:
zabbix=# \dt partman.*
Ensure that the zabbix
user is the owner of all tables in the zabbix
database.
You can use the change-zbxdb-table-ownership.sh shell script to change the ownership of all zabbix tables easily.
$ sudo chmod +x /var/lib/postgresql/change-zbxdb-table-ownership.sh
$ sudo -u postgres /var/lib/postgresql/change-zbxdb-table-ownership.sh zabbix zabbix
Generally speaking, you should not retain history
table data longer than a few days, as Zabbix's built-in historical data reduction mechanism then kicks in; aka trends!
This is also suggested in the Zabbix documentation. The history
data (numeric float in this case) should be kept for the smallest possible number of days. I started using 7 days and find it plenty for my needs.
I would, for example, recommend deleting history
data older than 14 days as a good starting point, then tune your partitions as you see fit; there is no real need to store numeric (float) data longer than a few days as trends take over. That way, when you pg_dump
and pg_restore
the data, it won't take a long time to filter into the appropriate partitions.
DELETE FROM history WHERE age(to_timestamp(history.clock)) > interval '14 days';
This makes the pg_restore
using the COPY
method much faster.
If you do intend to keep the history
for a very long time you could disable the partman automatic maintenance via the following command:
UPDATE partman.part_config SET automatic_maintenance = 'off' WHERE parent_table = 'public.history';
This is because importing a significant amount of data into the partitioned history
table takes a very long time, and running maintenance on that table during the import will not work. Wait until the import has fully finished before turning maintenance back on for that table.
This was reported in issue 5
PRIOR
Connect as the zabbix
user ($ sudo -u zabbix psql zabbix
).
-- view connections
SELECT sum(numbackends) FROM pg_stat_database;
-- kill all connections
SELECT pg_terminate_backend(pg_stat_activity.pid)
FROM pg_stat_activity
WHERE pg_stat_activity.datname = 'zabbix'
AND pid <> pg_backend_pid();
PROCEDURE
Objective: change the following example table (plus 4 other tables, ~10 GB each) from monthly to daily partitions while minimizing downtime and data loss.
zabbix=# \d+ history
Table "public.history"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------------+-----------+----------+---------+---------+--------------+-------------
itemid | bigint | | not null | | plain | |
clock | integer | | not null | 0 | plain | |
value | numeric(16,4) | | not null | 0.0000 | main | |
ns | integer | | not null | 0 | plain | |
Partition key: RANGE (clock)
Indexes:
"history_1" btree (itemid, clock)
Partitions: history_p2019_01 FOR VALUES FROM (1546300800) TO (1548979200),
history_p2019_02 FOR VALUES FROM (1548979200) TO (1551398400),
history_p2019_03 FOR VALUES FROM (1551398400) TO (1554076800),
history_p2019_04 FOR VALUES FROM (1554076800) TO (1556668800),
history_p2019_05 FOR VALUES FROM (1556668800) TO (1559347200),
history_p2019_06 FOR VALUES FROM (1559347200) TO (1561939200),
history_p2019_07 FOR VALUES FROM (1561939200) TO (1564617600),
history_default DEFAULT
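The FOR VALUES boundaries above are plain epoch seconds; decoding them confirms each monthly partition spans exactly one calendar month (a quick sanity check, not part of the migration itself):

```python
from datetime import datetime, timezone

def decode(epoch):
    """Render an epoch boundary as a UTC timestamp string."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc).strftime("%Y-%m-%d %H:%M")

# Boundaries of history_p2019_01 from the \d+ output above
print(decode(1546300800))  # → 2019-01-01 00:00
print(decode(1548979200))  # → 2019-02-01 00:00
```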
Procedure:
/* The following is just an example of an epoch timestamp 7 days out from initially creating this procedure - https://www.epochconverter.com/ ... */
DELETE FROM history WHERE age(to_timestamp(history.clock)) > interval '7 days';
DELETE FROM history_uint WHERE age(to_timestamp(history_uint.clock)) > interval '7 days';
DELETE FROM history_str WHERE age(to_timestamp(history_str.clock)) > interval '7 days';
DELETE FROM history_log WHERE age(to_timestamp(history_log.clock)) > interval '7 days';
DELETE FROM history_text WHERE age(to_timestamp(history_text.clock)) > interval '7 days';
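The age(to_timestamp(clock)) > interval '7 days' predicate is roughly equivalent to comparing clock against an epoch cutoff seven days in the past. A sketch of that cutoff arithmetic (approximating age() with a plain subtraction, which ignores calendar-month subtleties):

```python
import time

def cutoff_epoch(days, now=None):
    """Epoch value below which rows are older than `days` days --
    a rough equivalent of age(to_timestamp(clock)) > interval 'N days'."""
    now = int(time.time()) if now is None else now
    return now - days * 86400

# With a fixed "now" for reproducibility:
now = 1566604800                     # 2019-08-24 00:00:00 UTC
print(cutoff_epoch(7, now))          # → 1566000000
```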
Disable automatic maintenance for the history*
tables via the partman.part_config
column used by the maintenance:
UPDATE partman.part_config SET automatic_maintenance = 'off' WHERE parent_table = 'public.history';
UPDATE partman.part_config SET automatic_maintenance = 'off' WHERE parent_table = 'public.history_uint';
UPDATE partman.part_config SET automatic_maintenance = 'off' WHERE parent_table = 'public.history_str';
UPDATE partman.part_config SET automatic_maintenance = 'off' WHERE parent_table = 'public.history_log';
UPDATE partman.part_config SET automatic_maintenance = 'off' WHERE parent_table = 'public.history_text';
-- history_moved
CREATE TABLE public.history_moved
(
itemid bigint NOT NULL,
clock integer NOT NULL DEFAULT 0,
value numeric(16,4) NOT NULL DEFAULT 0.0000,
ns integer NOT NULL DEFAULT 0
) PARTITION BY RANGE (clock);
CREATE INDEX history_moved_1 ON public.history_moved USING BRIN (itemid, clock);
-- history_uint_moved
CREATE TABLE public.history_uint_moved
(
itemid bigint NOT NULL,
clock integer NOT NULL DEFAULT 0,
value numeric(20,0) NOT NULL DEFAULT (0)::numeric,
ns integer NOT NULL DEFAULT 0
) PARTITION BY RANGE (clock);
CREATE INDEX history_uint_moved_1 ON public.history_uint_moved USING BRIN (itemid, clock);
-- history_str_moved
CREATE TABLE public.history_str_moved
(
itemid bigint NOT NULL,
clock integer NOT NULL DEFAULT 0,
value character varying(255) COLLATE pg_catalog."default" NOT NULL DEFAULT ''::character varying,
ns integer NOT NULL DEFAULT 0
) PARTITION BY RANGE (clock);
CREATE INDEX history_str_moved_1 ON public.history_str_moved USING BRIN (itemid, clock);
-- history_log_moved
CREATE TABLE public.history_log_moved
(
itemid bigint NOT NULL,
clock integer NOT NULL DEFAULT 0,
"timestamp" integer NOT NULL DEFAULT 0,
source character varying(64) COLLATE pg_catalog."default" NOT NULL DEFAULT ''::character varying,
severity integer NOT NULL DEFAULT 0,
value text COLLATE pg_catalog."default" NOT NULL DEFAULT ''::text,
logeventid integer NOT NULL DEFAULT 0,
ns integer NOT NULL DEFAULT 0
) PARTITION BY RANGE (clock);
CREATE INDEX history_log_moved_1 ON public.history_log_moved USING BRIN (itemid, clock);
-- history_text_moved
CREATE TABLE public.history_text_moved
(
itemid bigint NOT NULL,
clock integer NOT NULL DEFAULT 0,
value text COLLATE pg_catalog."default" NOT NULL DEFAULT ''::text,
ns integer NOT NULL DEFAULT 0
) PARTITION BY RANGE (clock);
CREATE INDEX history_text_moved_1 ON public.history_text_moved USING BRIN (itemid, clock);
Then run
SELECT partman.create_parent('public.history_moved', 'clock', 'native', 'monthly', null, 1, 'on', null, true, 'seconds');
SELECT partman.create_parent('public.history_uint_moved', 'clock', 'native', 'monthly', null, 1, 'on', null, true, 'seconds');
SELECT partman.create_parent('public.history_str_moved', 'clock', 'native', 'monthly', null, 1, 'on', null, true, 'seconds');
SELECT partman.create_parent('public.history_log_moved', 'clock', 'native', 'monthly', null, 1, 'on', null, true, 'seconds');
SELECT partman.create_parent('public.history_text_moved', 'clock', 'native', 'monthly', null, 1, 'on', null, true, 'seconds');
Run the partman.undo_partition_proc()
procedure on each table to be unpartitioned. This inserts the data into the tables created in the previous steps. It appears to lock the table, and you can't view any information in the frontend (hence a reason the frontend should be stopped from writing to the DB):
CALL partman.undo_partition_proc('public.history', '1 day', null, 1, 'public.history_moved', false, 0, 10, false);
CALL partman.undo_partition_proc('public.history_uint', '1 day', null, 3, 'public.history_uint_moved', false, 0, 10, false);
CALL partman.undo_partition_proc('public.history_str', '1 day', null, 3, 'public.history_str_moved', false, 0, 10, false);
CALL partman.undo_partition_proc('public.history_log', '1 day', null, 3, 'public.history_log_moved', false, 0, 10, false);
CALL partman.undo_partition_proc('public.history_text', '1 day', null, 3, 'public.history_text_moved', false, 0, 10, false);
VACUUM ANALYZE history;
VACUUM ANALYZE history_uint;
VACUUM ANALYZE history_str;
VACUUM ANALYZE history_log;
VACUUM ANALYZE history_text;
Re-partition the history*
tables, this time with daily partitions:
SELECT partman.create_parent('public.history', 'clock', 'native', 'daily', null, 7, 'on', null, true, 'seconds');
SELECT partman.create_parent('public.history_uint', 'clock', 'native', 'daily', null, 7, 'on', null, true, 'seconds');
SELECT partman.create_parent('public.history_str', 'clock', 'native', 'daily', null, 7, 'on', null, true, 'seconds');
SELECT partman.create_parent('public.history_log', 'clock', 'native', 'daily', null, 7, 'on', null, true, 'seconds');
SELECT partman.create_parent('public.history_text', 'clock', 'native', 'daily', null, 7, 'on', null, true, 'seconds');
INSERT INTO public.history SELECT * FROM public.history_moved WHERE clock > 1549168074;
INSERT INTO public.history_uint SELECT * FROM public.history_uint_moved WHERE clock > 1549168074;
INSERT INTO public.history_str SELECT * FROM public.history_str_moved WHERE clock > 1549168074;
INSERT INTO public.history_log SELECT * FROM public.history_log_moved WHERE clock > 1549168074;
INSERT INTO public.history_text SELECT * FROM public.history_text_moved WHERE clock > 1549168074;
Drop the *_moved tables and remove their rows from partman.part_config
:
DROP TABLE history_moved;
DELETE FROM partman.part_config WHERE parent_table = 'public.history_moved';
DROP TABLE history_uint_moved;
DELETE FROM partman.part_config WHERE parent_table = 'public.history_uint_moved';
DROP TABLE history_str_moved;
DELETE FROM partman.part_config WHERE parent_table = 'public.history_str_moved';
DROP TABLE history_log_moved;
DELETE FROM partman.part_config WHERE parent_table = 'public.history_log_moved';
DROP TABLE history_text_moved;
DELETE FROM partman.part_config WHERE parent_table = 'public.history_text_moved';
Re-enable maintenance and set retention in partman.part_config
for each history* table, starting with public.history
:
UPDATE partman.part_config SET automatic_maintenance = 'on', retention_keep_table = false, retention = '8 day' WHERE parent_table = 'public.history';
UPDATE partman.part_config SET automatic_maintenance = 'on', retention_keep_table = false, retention = '8 day' WHERE parent_table = 'public.history_uint';
UPDATE partman.part_config SET automatic_maintenance = 'on', retention_keep_table = false, retention = '8 day' WHERE parent_table = 'public.history_str';
UPDATE partman.part_config SET automatic_maintenance = 'on', retention_keep_table = false, retention = '8 day' WHERE parent_table = 'public.history_log';
UPDATE partman.part_config SET automatic_maintenance = 'on', retention_keep_table = false, retention = '8 day' WHERE parent_table = 'public.history_text';
SELECT partman.run_maintenance('public.history');
SELECT partman.run_maintenance('public.history_uint');
SELECT partman.run_maintenance('public.history_str');
SELECT partman.run_maintenance('public.history_log');
SELECT partman.run_maintenance('public.history_text');
Verify the settings in partman.part_config
:
SELECT * FROM partman.part_config;
VACUUM ANALYZE history;
VACUUM ANALYZE history_uint;
VACUUM ANALYZE history_str;
VACUUM ANALYZE history_log;
VACUUM ANALYZE history_text;
Ensure pg_partman_bgw
is set in postgresql.conf
file.
On the old database, ensure pg_hba.conf
file is set to allow connections from the new database.
# Note that the following commands are run on the new DB instance...
$ sudo mkdir -p /var/backups/postgresql
$ sudo chown -R postgres:postgres /var/backups/postgresql
$ sudo -u postgres pg_dump -Fc --file=/var/backups/postgresql/zabbix.dump -d zabbix -h <OLD-`zabbix`-DATABASE> -U zabbix
$ sudo -u postgres pg_restore -j 8 -d zabbix /var/backups/postgresql/zabbix.dump -U zabbix
The following scenario addresses how to migrate from Zabbix version 3.4 to version 4.2 while also moving from PostgreSQL version 9.6 to version 11.
Requirements
Prepare the new PostgreSQL instance for the Zabbix 4.2 database as described in the previous section, as well as follow the steps for installing the pg_partman extension.
On the old database, ensure pg_hba.conf
file is set to allow connections from the new database.
# Note that the following commands are run on the new DB instance...
$ sudo mkdir -p /var/backups/postgresql
$ sudo chown -R postgres:postgres /var/backups/postgresql
$ sudo -u postgres pg_dump -Fd -j 4 -d zabbix -h <OLD-`zabbix`-DATABASE> -U zabbix --inserts -Z 4 --file=/var/backups/postgresql/zabbix-configuration-<DATE> --exclude-table=history* --exclude-table=trends*
$ sudo -u postgres pg_restore -Fd -j 4 -d zabbix -h 127.0.0.1 -U zabbix /var/backups/postgresql/zabbix-configuration-<DATE>
The pg_dump
option -j 4
runs the dump with 4 parallel jobs, and -Fd
(directory format) allows restoring the data in parallel as well, which is also faster. At this point you will have all tables except history*
and trends*
on the new Zabbix instance.
You can then prepare the missing history*
and trends*
tables by following the "create partitioned tables" section outlined above.
Once you have finished importing the configuration data from the old instance to the new instance AND created the partitioned tables, you can point the Zabbix server to the new DB instance.
This minimizes downtime of the Zabbix environment, as the older history*
and trends*
data can be imported later while Zabbix is actively running.
Once you start the Zabbix Server you should see the following lines in the zabbix_server.log
file verifying that the upgrade has completed successfully.
6330:20190512:020451.749 using configuration file: /etc/zabbix/zabbix_server.conf
6330:20190512:020451.862 current database version (mandatory/optional): 03040000/03040007
6330:20190512:020451.862 required mandatory version: 04020000
6330:20190512:020451.862 starting automatic database upgrade
6330:20190512:020451.880 completed 0% of database upgrade
6330:20190512:020451.897 completed 1% of database upgrade
6330:20190512:020451.932 completed 2% of database upgrade
...
6364:20190512:021416.145 completed 99% of event name update
6364:20190512:021416.322 completed 100% of event name update
6364:20190512:021416.555 event name update completed
6364:20190512:021416.567 server #0 started [main process]
...
To migrate the old history*
and trends*
data in as little time as possible, perform the following pg_dump/pg_restore
commands on the new DB instance.
Using the --inserts
method for the history*
tables:
# Note that the following commands are run on the new DB instance...
$ sudo mkdir -p /var/backups/postgresql
$ sudo chown -R postgres:postgres /var/backups/postgresql
$ sudo -u postgres pg_dump -Fd -j 4 -d zabbix -h <OLD-`zabbix`-DATABASE> -U zabbix --inserts -Z 4 --file=/var/backups/postgresql/zabbix-history-tables-<DATE> --table=history*
$ sudo -u postgres pg_restore -Fd -j 4 -d zabbix -h 127.0.0.1 -U zabbix /var/backups/postgresql/zabbix-history-tables-<DATE>
You can instead use the default COPY
procedure (omitting the --inserts
option), as there are no constraints on these tables: all of the data, even duplicates, can be inserted into the history*
tables, and the duplicates will eventually be dropped via partition retention.
Using the COPY
method for history*
tables:
# Note that the following commands are run on the new DB instance...
$ sudo mkdir -p /var/backups/postgresql
$ sudo chown -R postgres:postgres /var/backups/postgresql
$ sudo -u postgres pg_dump -Fc --file=/var/backups/postgresql/zabbix-history-tables-<DATE>.dump -d zabbix -h <OLD-`zabbix`-DATABASE> -U zabbix --table=history*
$ sudo -u postgres pg_restore -j 8 -d zabbix /var/backups/postgresql/zabbix-history-tables-<DATE>.dump -U zabbix
Using the `--inserts` method for the `trends*` tables:
# Note that the following commands are run on the new DB instance...
$ sudo mkdir -p /var/backups/postgresql
$ sudo chown -R postgres:postgres /var/backups/postgresql
$ sudo -u postgres pg_dump -Fd -j 4 -d zabbix -h <OLD-`zabbix`-DATABASE> -U zabbix --inserts -Z 4 --file=/var/backups/postgresql/zabbix-trends-tables-<DATE> --table=trends*
$ sudo -u postgres pg_restore -Fd -j 4 -d zabbix -h 127.0.0.1 -U zabbix /var/backups/postgresql/zabbix-trends-tables-<DATE>
Using the `COPY` method for the `trends*` tables:
# Note that the following commands are run on the new DB instance...
$ sudo mkdir -p /var/backups/postgresql
$ sudo chown -R postgres:postgres /var/backups/postgresql
$ sudo -u postgres pg_dump -Fc --file=/var/backups/postgresql/zabbix-trends-tables-<DATE>.dump -d zabbix -h <OLD-`zabbix`-DATABASE> -U zabbix --table=trends*
$ sudo -u postgres pg_restore -j 8 -d zabbix /var/backups/postgresql/zabbix-trends-tables-<DATE>.dump -U zabbix
For large Zabbix databases the amount of data to restore can be enormous (terabytes worth of `trends*` data) and take a very long time (many hours). One reported workaround is to set `fsync = off` in the `postgresql.conf` file (:scream:); this only requires a reload of PostgreSQL, not a restart of the cluster. It is very, very risky: with fsync off, a crash mid-restore can cause unrecoverable data corruption. In that report, turning off fsync allowed 49 million records to be restored within 3 hours.
:exclamation: As always please test this properly before implementing this procedure in a production instance.
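If you do decide to test this, the change can be made and reverted without editing anything else. A minimal `postgresql.conf` sketch of the idea (the systemd service name is an assumption; use whatever manages your cluster):

```
# postgresql.conf — TEMPORARY, for the duration of the bulk restore only
fsync = off    # DANGER: a crash while this is off can corrupt the cluster beyond repair

# After the restore completes, set it back:
#   fsync = on
# and reload (a reload suffices; no restart of the cluster is needed):
#   sudo systemctl reload postgresql
```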
pgbench runs a benchmark loosely based on the Transaction Processing Performance Council (TPC) TPC-B workload, which helps gauge database performance.
- `-i` = initialize the benchmark tables
- `-s` = scaling factor
- `-j` = number of worker threads
- `-c` = number of clients
- `-t` = transactions per client; with 10 clients and `-t 10000`, that is 100,000 transactions total

Example:
$ pgbench -i -s 50 zabbix
$ pgbench -c 10 -j 2 -t 10000 zabbix
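As a quick sanity check on the `-t` semantics above, the total transaction count pgbench reports is clients multiplied by transactions per client; a minimal sketch:

```shell
# -t is per client: 10 clients x 10000 transactions each = 100000 total,
# matching the "100000/100000" figure in the pgbench output below
awk 'BEGIN { clients = 10; per_client = 10000; printf "%d\n", clients * per_client }'
```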
To help understand what is being done under the hood with partitioning we can use the `EXPLAIN` command, which shows the execution plan of a statement and how the tables are scanned.
$ psql -d zabbix -c "EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) SELECT count(*) FROM public.history;"
$ psql -d zabbix -c "EXPLAIN ANALYZE SELECT * FROM public.history;"
PSQL 11 Non-Partitioned Database VM
$ pgbench -i -s 50 zabbix
$ pgbench -c 10 -j 2 -t 10000 zabbix
Result:
$ pgbench -c 10 -j 2 -t 10000 zabbix
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 50
query mode: simple
number of clients: 10
number of threads: 2
number of transactions per client: 10000
number of transactions actually processed: 100000/100000
latency average = 6.900 ms
tps = 1449.297191 (including connections establishing)
tps = 1449.363391 (excluding connections establishing)
PSQL 11 Non-Partitioned Database Baremetal
$ pgbench -c 10 -j 2 -t 10000 zabbix
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 50
query mode: simple
number of clients: 10
number of threads: 2
number of transactions per client: 10000
number of transactions actually processed: 100000/100000
latency average = 8.235 ms
tps = 1214.341276 (including connections establishing)
tps = 1214.421294 (excluding connections establishing)
PSQL 11 Partitioned Database VM
$ pgbench -i -s 50 zabbix
$ pgbench -c 10 -j 2 -t 10000 zabbix
Result:
$ pgbench -c 10 -j 2 -t 10000 zabbix
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 50
query mode: simple
number of clients: 10
number of threads: 2
number of transactions per client: 10000
number of transactions actually processed: 100000/100000
latency average = 5.605 ms
tps = 1783.971153 (including connections establishing)
tps = 1784.074779 (excluding connections establishing)
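Comparing the two VM runs above, partitioning improved throughput noticeably. A quick back-of-the-envelope calculation, using the "including connections establishing" tps figures from the two results:

```shell
# non-partitioned VM: 1449.297191 tps; partitioned VM: 1783.971153 tps
awk 'BEGIN { np = 1449.297191; p = 1783.971153; printf "%.1f%% improvement\n", (p - np) / np * 100 }'
```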
- `DROP` an individual partition to erase all of the data from that range.
- `REINDEX` operations will happen in a fraction of the time it would take to build a single giant index.
- Make sure `enable_partition_pruning = on`.
- The `WHERE` clause needs to filter on constants. In general, keep the `WHERE` clauses as simple as possible, to improve the odds the optimizer will construct the exclusion proof you're looking for.
- `VACUUM`/`ANALYZE` will not cascade from the parent. You need to specifically target each partition with those operations.
- Guard against bad data with an `INSERT` trigger. Expect bad data to show up one day with a timestamp either far in the past or far in the future, relative to what you have partitioned right now. Instead of throwing an error, some prefer to redirect rows from outside the partitioned range into a holding-pen partition dedicated to suspicious data.

Reference - PostgreSQL 10 High Performance; 2018; pp. 426-427
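For the pruning setting mentioned above, a minimal `postgresql.conf` fragment (on PostgreSQL 11 this is already the default, so this is mainly a verification aid):

```
# postgresql.conf — partition pruning must stay enabled for the WHERE-clause
# exclusion described above to take effect (on by default in PostgreSQL 11)
enable_partition_pruning = on
```

You can confirm the live value at any time with `SHOW enable_partition_pruning;` in psql.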
I have written an Ansible role that I used to configure `pg_partman`, in conjunction with the excellent ANXS postgresql role.
For details please view ansible|zabbix.pgpartman (`pg_partman`).