Partman generate garbage partitions with Zabbix

gyrter commented 4 years ago

Hello! I've been using your extension since 2018 and I'm really liking it.

But I got little annoying bug with Zabbix databases. Sometimes Zabbix inserts row with the timestamp set in future and this goes to default table. For example:

zabbix=# SELECT itemid, to_timestamp(clock) as clock FROM history_uint_default WHERE clock >= 1986587004;
 itemid  |   clock     
---------+------------------------
 1085916 | 2032-12-14 00:43:25+03
 1085929 | 2032-12-14 00:43:25+03

I'm watching this table to monitor the appearance of such erroneous data. But run_maintenance_proc procedure is trying to create garbage partitions up to December 14 2032. It causes database lock and great alarm storm from Zabbix.

I tried to turn on infinite_time_partitions setting, but it did not cause any effect.

I want only partitions from default interval - 60 in past, 30 in future and default partition. Is it possible to fix this behavior?

keithf4 commented 4 years ago

Unfortunately it can be really hard to handle things cleanly if garbage data is getting inserted. The ideal and long term fix is to find out why Zabbix is creating bad data and fix that. The creation of new partitions has to be based around a sound data model to work at its best.

infinite_time_partitions should help somewhat in that it should cause automatic maintenance to only make new partitions based on "now" vs the data that actually exists in the partition set. It should only make the "garbage" partitions if you are manually running one of the partition_data*() functions to move that data out of the default and make a real child table based on it. If automatic maintenance is making new partitions based just on that new data going into the default, and you do indeed have inifinte_time_partitions set to true for that partition set, something is else is very wrong (bad time sync on the server?). If that's the case, can you share your schema and the contents of your part_config table and I'll try and recreate the problem?

Until you can fix what is causing the "bad" data to get inserted, you really have to ensure that it is getting removed in a timely manner. You can create a monitor using the check_default() function to at least alert you immediately when any data goes into the default partition. If you don't care about getting the exact count of rows for the monitor, you can set the p_exact_count parameter to that function to false and it should return relatively quickly.

gyrter commented 4 years ago

Thank you for a fast response. Here is my sql schema

CREATE TABLE public.history (
    itemid bigint NOT NULL,
    clock integer DEFAULT 0 NOT NULL,
    value numeric(16,4) DEFAULT 0.0000 NOT NULL,
    ns integer DEFAULT 0 NOT NULL
) PARTITION BY RANGE (clock);
ALTER TABLE public.history OWNER TO zabbix;
CREATE TABLE public.history_uint (
    itemid bigint NOT NULL,
    clock integer DEFAULT 0 NOT NULL,
    value numeric(20,0) DEFAULT (0)::numeric NOT NULL,
    ns integer DEFAULT 0 NOT NULL
) PARTITION BY RANGE (clock);
ALTER TABLE public.history_uint OWNER TO zabbix;
CREATE INDEX history_timestamp_idx ON history (to_timestamp(clock));
CREATE INDEX history_uint_timestamp_idx ON history (to_timestamp(clock));
CREATE INDEX history_uint_1 ON history_uint (itemid,clock);
CREATE INDEX history_1 ON history (itemid,clock);
SELECT partman.create_parent('public.history', 'clock', 'native', 'daily', p_premake := 30 ,p_start_partition := '2019-08-01', p_epoch := 'seconds');
SELECT partman.create_parent('public.history_uint', 'clock', 'native', 'daily', p_premake := 30 , p_start_partition := '2019-08-01', p_epoch := 'seconds');
UPDATE partman.part_config SET retention = '60 days', retention_keep_table = false WHERE parent_table = 'public.history';
UPDATE partman.part_config SET retention = '60 days', retention_keep_table = false WHERE parent_table = 'public.history_uint';

And here is dump of part_config

COPY partman.part_config  FROM stdin;
public.history_uint clock   native  1 day   \N  30  4   30  seconds t   60 days \N  f   t   t   YYYY_MM_DD  on  t   f   f   f       t   partman.template_public_history_uint    \N  f   t
public.history  clock   native  1 day   \N  30  4   30  seconds t   60 days \N  f   t   t   YYYY_MM_DD  on  t   f   f   f       t   partman.template_public_history \N  f   t
\.

keithf4 commented 4 years ago

Ok. I am seeing it try to create all the new partitions based on the "bad" data that was inserted. So I was mistaken on inifinite_time_partitions being a work around. Will have to look into why, but honestly, I think that's likely going to be the expected behavior.

However, unless you've actually run the partition_data*() function to move that bad data out of the default, it should be throwing errors as seen below. Because you cannot create a child table with a range of any data that exists in the default. The partition_data*() functions handle this for native partitioning by first moving it from the default to a temporary table, then creating the child table, then moving it back to the proper child table.

So, yes if you get data that far ahead in time, it's going to cause a long lock as it's trying to create all the intervening child tables. But it will likely fail at the end of that.

So again, the proper fix for this is to try and stop whatever is causing the bad data. Another thing that could possibly be a workable fix for your situation would be to drop the default partition. If you know you don't want any data to be inserted outside of the existing child tables, not having the default will cause any query that tries to do that to throw an error.

Otherwise, I'd suggest setting up a monitor with check_default() to try and catch whenever bad data gets inserted.


keith=# update partman.part_config set infinite_time_partitions = 'true';
UPDATE 2

keith=# insert into history_uint values (12, extract('epoch' from '2022-12-14 12:12'::timestamptz), 12, 12);
INSERT 0 1
keith=# select * from history_uint;
 itemid |   clock    | value | ns 
--------+------------+-------+----
     12 | 1671037920 |    12 | 12
(1 row)

keith=# select * from history_uint_default;
 itemid |   clock    | value | ns 
--------+------------+-------+----
     12 | 1671037920 |    12 | 12
(1 row)

keith=# select partman.run_maintenance();
ERROR:  updated partition constraint for default partition would be violated by some row
CONTEXT: SQL statement "ALTER TABLE public.history_uint ATTACH PARTITION public.history_uint_p2022_12_14 FOR VALUES FROM ('1670994000') TO ('1671080400')"
PL/pgSQL function create_partition_time(text,timestamp with time zone[],boolean,boolean) line 243 at EXECUTE
PL/pgSQL function run_maintenance(text,boolean,boolean,boolean) line 280 at assignment
DETAIL: 
HINT: 
CONTEXT: PL/pgSQL function create_partition_time(text,timestamp with time zone[],boolean,boolean) line 459 at RAISE
PL/pgSQL function run_maintenance(text,boolean,boolean,boolean) line 280 at assignment
DETAIL: 
HINT: 
CONTEXT:  PL/pgSQL function run_maintenance(text,boolean,boolean,boolean) line 398 at RAISE```

gyrter commented 4 years ago

Yes, I'm totally agree with you - fixing bad data source is the most valuable way. But I can not do it - bad data came from external programs over Zabbix trapper protocol. Database should store this data, because insert errors cause Zabbix servers unexpected behavior.

I created work-around for this problem. It is scheduled cron task. It is creating reports with item description and stored data. Also, it is removing bad data.

I'm thinking about some insert triggers for default table now. Trigger based workaround looks greater than cron task.

keithf4 commented 4 years ago

Would be curious to see your trigger-based solution when you have that done. Could be an interesting suggestion to be able to have a record of in the issues here on this project.

gyrter commented 4 years ago

I wrote little function to fix this behavior. So, here is my solution.

CREATE TABLE public.history_bad (
    itemid bigint NOT NULL,
    clock integer DEFAULT 0 NOT NULL,
    value numeric(16,4) DEFAULT 0.0000 NOT NULL,
    ns integer DEFAULT 0 NOT NULL
);
ALTER TABLE public.history OWNER TO zabbix;
CREATE INDEX history_bad_1 ON ONLY public.history_bad USING btree (itemid, clock);
CREATE INDEX history_bad_timestamp_idx ON ONLY public.history_bad USING btree (to_timestamp((clock)::double precision));

CREATE TABLE public.history_uint_bad (
    itemid bigint NOT NULL,
    clock integer DEFAULT 0 NOT NULL,
    value numeric(20,0) DEFAULT (0)::numeric NOT NULL,
    ns integer DEFAULT 0 NOT NULL
);

ALTER TABLE public.history_uint_bad OWNER TO zabbix;
CREATE INDEX history_uint_bad_1 ON ONLY public.history_uint_bad USING btree (itemid, clock);
CREATE INDEX history_uint_bad_timestamp_idx ON ONLY public.history_uint_bad USING btree (to_timestamp((clock)::double precision));

CREATE OR REPLACE FUNCTION history_clock_checker() RETURNS trigger AS $history_clock_checker$
    BEGIN
        IF to_timestamp(NEW.clock) < (NOW() - '60 day'::interval) THEN
            RAISE NOTICE 'too old row';
            INSERT INTO history_bad VALUES(NEW.*);
            RETURN NULL;
        END IF;
        IF to_timestamp(NEW.clock) > (NOW() + '30 day'::interval) THEN
            RAISE NOTICE 'too new row';
            INSERT INTO history_bad VALUES(NEW.*);
            RETURN NULL;
        END IF;
        RETURN NEW;
    END;
$history_clock_checker$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION history_uint_clock_checker() RETURNS trigger AS $history_uint_clock_checker$
    BEGIN
        IF to_timestamp(NEW.clock) < (NOW() - '60 day'::interval) THEN
            RAISE NOTICE 'too old row';
            INSERT INTO history_uint_bad VALUES(NEW.*);
            RETURN NULL;
        END IF;
        IF to_timestamp(NEW.clock) > (NOW() + '30 day'::interval) THEN
            RAISE NOTICE 'too new row';
            INSERT INTO history_uint_bad VALUES(NEW.*);
            RETURN NULL;
        END IF;
        RETURN NEW;
    END;
$history_uint_clock_checker$ LANGUAGE plpgsql;

CREATE TRIGGER history_clock_checker BEFORE INSERT ON history_default
    FOR EACH ROW EXECUTE PROCEDURE history_clock_checker();

CREATE TRIGGER history_clock_checker BEFORE INSERT ON history_uint_default
    FOR EACH ROW EXECUTE PROCEDURE history_uint_clock_checker();

May be it will help someone.

keithf4 commented 4 years ago

Thanks very much for sharing that! Seems like a great way to handle that issue cleanly.

keithf4 commented 4 years ago

Actually, I did think of one issue with doing this. The now() function is not very performant because it must be re-evaluated every time it is called and will not allow caching related statement's internal query plan. Since this is part of a trigger that is evaluated on every insert, it's preventing the insert plan from getting cached. Therefore if you had a high insert rate, you may see a loss in performance. This is the reason the old trigger-based method of pg_partman used static time values in its trigger statements and would update the trigger with new values during maintenance periods vs just using now() like you did here. The trigger did have a fallback to using now() but that was only for data outside of the expected normal window of data being inserted.

So if you're seeing any performance issues with this method, you may want to try replacing the dynamic determination of "old" and "new" data with the static date values of 60 days old and 30 days in the future compared to your "current" data insertion time window. Then have some other maintenance process that runs along with the pg_partman maintenance to update the contents of the trigger functions.

gyrter commented 4 years ago

Yes, it can cause performance issues. But this function will be called only if data will try to lay into default table. Traffic to this tables is very low.

zabbix=# explain analyze INSERT INTO history VALUES (85323, 1, 0, 0);
NOTICE:  too old row
                                           QUERY PLAN                                           
------------------------------------------------------------------------------------------------
 Insert on history  (cost=0.00..0.01 rows=1 width=34) (actual time=2.937..2.937 rows=0 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=34) (actual time=0.001..0.002 rows=1 loops=1)
 Planning Time: 0.036 ms
 Trigger history_clock_checker on history_default: time=0.803 calls=1
 Execution Time: 3.412 ms
(5 rows)

zabbix=# ROLLBACK ;
ROLLBACK
zabbix=# BEGIN;
BEGIN
zabbix=# explain analyze INSERT INTO history VALUES (85323, 1578643823, 0, 0);
                                           QUERY PLAN                                           
------------------------------------------------------------------------------------------------
 Insert on history  (cost=0.00..0.01 rows=1 width=34) (actual time=0.664..0.664 rows=0 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=34) (actual time=0.002..0.002 rows=1 loops=1)
 Planning Time: 0.028 ms
 Execution Time: 1.994 ms
(4 rows)

zabbix=# ROLLBACK ;
ROLLBACK

keithf4 commented 4 years ago

Going to close this issue for now, but this is a great idea to be able to refer people back to if they come across the same problem. Thanks!

pgpartman / pg_partman

Partman generate garbage partitions with Zabbix #289