2ndQuadrant / pglogical

Logical Replication extension for PostgreSQL 17, 16, 15, 14, 13, 12, 11, 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slony, Bucardo or Londiste, as well as cross-version upgrades.
http://2ndquadrant.com/en/resources/pglogical/
Other
1.01k stars 153 forks source link

Colum null value when replicating #8

Closed jstuhli closed 8 years ago

jstuhli commented 8 years ago

I'm trying to replicate a not-so-large table (500,000 rows) with 30 columns, 6 of which are integer[] not null columns from 9.4 to 9.5. One of those columns usually is fairly large (usually 300-1000 integers). The table in question is live, and changes all the time. After creating the subscription with synchronize_structure := true, the table is copied, logical decoding starts, and then the replication fails with the error "ERROR: null value in column "mylist" violates not-null constraint"

It would appear that the large column is not properly encoded, defaults to null and as such causes an error for the not null column. Everything works fine if I replicate a simple table.

$ lsb_release -a No LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 14.04.4 LTS Release: 14.04 Codename: trusty

$ uname -a Linux db-master 3.13.0-49-generic #83-Ubuntu SMP Fri Apr 10 20:11:33 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

$ pg_lsclusters Ver Cluster Port Status Owner Data directory Log file 9.4 main 5432 online postgres /var/lib/postgresql/9.4/main /var/log/postgresql/postgresql-9.4-main.log 9.5 main 5433 online postgres /var/lib/postgresql/9.5/main /var/log/postgresql/postgresql-9.5-main.log

On the provider db: $ sudo -u postgres psql mydb create extension pglogical; SELECT pglogical.create_node(node_name := 'provider1', dsn := 'host=127.0.0.1 port=5432 dbname=mydb'); SELECT pglogical.replication_set_add_table('default', 'mytable');

On the subscriber db (clean database with no tables): $ sudo -u postgres psql -p 5433 mydb create extension pglogical; SELECT pglogical.create_node(node_name := 'provider1', dsn := 'host=127.0.0.1 port=5432 dbname=mydb'); SELECT pglogical.create_subscription(subscription_name := 'subscription1', provider_dsn := 'host=127.0.0.1 port=5432 dbname=mydb', synchronize_structure := true);

Log output for provider: 2016-06-01 08:56:25 UTC LOG: logical decoding found consistent point at 650/430C4050 2016-06-01 08:56:25 UTC DETAIL: There are no running transactions. 2016-06-01 08:56:25 UTC LOG: exported logical decoding snapshot: "255AA3A4-1" with 0 transaction IDs 2016-06-01 08:56:54 UTC LOG: duration: 28623.776 ms statement: COPY "public"."mytable" TO stdout 2016-06-01 08:56:55 UTC LOG: starting logical decoding for slot "pgl_mydb_provider1_subscription1" 2016-06-01 08:56:55 UTC DETAIL: streaming transactions committing after 650/430C4088, reading WAL from 650/430C4050 2016-06-01 08:56:55 UTC LOG: logical decoding found consistent point at 650/430C4050 2016-06-01 08:56:55 UTC DETAIL: There are no running transactions. 2016-06-01 08:56:55 UTC LOG: could not receive data from client: Connection reset by peer 2016-06-01 08:56:55 UTC LOG: unexpected EOF on standby connection

Log output for subscriber: 2016-06-01 10:56:25 CEST [22723-1] LOG: starting apply for subscription subscription1 2016-06-01 10:56:55 CEST [22727-1] postgres@mydb LOG: duration: 29860.347 ms statement: COPY "public"."mytable" FROM stdin 2016-06-01 10:56:55 CEST [22723-2] ERROR: null value in column "mylist" violates not-null constraint 2016-06-01 10:56:55 CEST [22723-3] DETAIL: Failing row contains removed. 2016-06-01 10:56:55 CEST [16095-8] LOG: worker process: pglogical apply 84094:1763399739 (PID 22723) exited with exit code 1 2016-06-01 10:57:00 CEST [23097-1] LOG: starting apply for subscription subscription1 2016-06-01 10:57:00 CEST [23097-2] ERROR: null value in column "mylist" violates not-null constraint 2016-06-01 10:57:00 CEST [23097-3] DETAIL: Failing row contains removed. 2016-06-01 10:57:00 CEST [16095-9] LOG: worker process: pglogical apply 84094:1763399739 (PID 23097) exited with exit code 1

The failing row (in the subscriber log) indeed has the column mylist as null, but when I check the original database, the column is, of course, not null, since it has the "not null" constraint. It always fails with the error for the same large column (mylist), but different rows each time I initiate the create_subscription (depends on which rows are changed during the initial table copy).

jstuhli commented 8 years ago

Here's a sample test case:

On provider: CREATE TABLE t1 (a INTEGER not null, b integer not null, c integer[] not null, PRIMARY KEY (a)); SELECT pglogical.replication_set_add_table('default', 't1'); insert into t1 values (1, 1, '{25496,6076,38628,6115,2931,4513,4194,21780,2383,2312,7246,38546,11111,10910,4043,28509,15420,42,52,5885,5285,2930,15884,11109,9154,1039,38154,6077,19,10185,9602,16484,161,22435,6159,11565,14298,22439,6165,22197,15969,40408,689,2137,7071,3328,34884,12944,6206,25508,2934,2467,32493,1060,230,2152,23609,22429,39257,38922,13752,11564,9362,38158,14812,10172,10914,6456,4417,6806,32558,5693,3801,5896,77,6208,11041,7273,4361,6911,5883,30716,7157,4193,4045,10918,7788,818,47098,13316,10909,5799,2935,42475,5801,3957,11566,6805,38006,38189,11559,7952,1688,37,18163,7275,21398,1763,13108,2248,5397,27150,38547,17,22436,16416,5286,3092,40409,15883,24177,7282,7250,11488,11114,2755,40174,11492,9970,19635,4196,1760,3,42049,6117,14732,9431,29004,21000,7239,6013,19107,31103,51230,11880,966,21400,15882,10236,13911,6113,1085,23608,13021,27793,13099,1339,1467,32503,1,20056,5804,10921,207,31985,30027,5348,21779,3263,30032,21776,21762,2780,22196,35502,11558,7514,20206,39099,27219,9197,32557,32490,7249,1506,10184,12657,2552,22194,51305,12681,11560,9895,4195,10772,15419,7244,6795,35,7953,10916,12685,10560,38187,11037,4594,13747,38924,7256,11947,6078,6207,40159,6238,1185,35489,32494,2386,22198,7280,13755,1758,50060,12456,3770,2,50061,2551,29,2105,15545,7238,3091,5347,29797,7245,7064,6119,50556,11936,7724,7281,3284,11881,309,1010,12455,11042,1434,7253,38,23610,11392,29019,22430,15881,66,2951,9996,5842,33,38008,6116,884,11944,7252,7242,30033,14062,13748,681,35490,8519,62,11113,21399,6205,6573,15207,2928,4508,11036,40175,31987,11561,5284,4848,6164,69,6438,3293,2950,50059,31986,14740,2307,11943,3264,34363,2468,11487,1820,32609,9291,23667,282,9364,38190,4106,6160,32491,4584,12683,29643,9323,14252,5698,54,11563,30030,6114,11879,55,47109,22735,1240,5884,14299,6804,1469,22432,14736,6997,7725,142,11134,280,86,27292,9428,9363,38007,1045,666,10911,249,63,38621,36298,32489,23670,34954,38188,6230,7255,4837,38157,32511,4509,15,13912,7723,32610,1307,39259,5696,7276,16096,217,2781,1504,12945,11883,3799,2348,9997,4044,4366,3802,838,10917,4600,6210,2553,1757,21778,4452,40407,13754,10913,38156,7279,36,41175,8478,5695,20554,6209,11490,2310,38159,37876,11934,2929,9513,8514,42150,6922,43,40157,1761,24178,48,1489,35000,10561,7066,13753,6833,31575,37706,9961,15114,6162,16097,1686,14735,47389,23668,30031,4087,14734,21777,4512,5886,11878,38153,2183,7065,11946,10743,885,4511,1435,32492,22431,7643,12680,1462,16255,10186,3955,2313,6797,3330,40147,3973,11945,6796,12678,7240,6692,6808,73,22195,7254,11562,14297,2349,37875,231,8479,27149,5213,9869,5283,4088,7272,5214,8477,6163,23,3960,10919,2783,3800,16166,3959,1465,84,22428,38548,40302,30029,11937,1764,2311,22438,15326,7247,6799,4089,13106,11491,25497,15635,13910,30509,6016,40405,38545,38012,2308,6289,3956,1187,7274,16098,1186,20207,6800,11112,11396,6798,14739,22261,11933,38627,11493,8273,6245,4105,2305,11115,9891,5794,3870,38622,20055,1685,11940,11877,40154,7243,35503,22184,23607,3739,20556,32556,3888,14300,10920,10235,5,38623,8517,12679,34,39098,38625,11939,7251,38160,18164,6079,12684,39,30557,150,2933,11941,72,40155,29795,20054,2465,22434,7793,38009,10066,2309,6161,21763,82,3954,12656,2932,34883,15878,7277,11949,11882,83,6801,18922,1507,10199,5712,1463,8528,40303,25426,7241,12,40410,10895,21019,1687,9361,38624,4042,11935,6169,6118,34364,3961,2782,51350,38923,11948,4602,30034,3958,5805,2949,2396,1505,5694,12682,7048,16,5699,6803,121,6007,42476,30077,14737,15327,4764,10912,1464,23533,13107,10908,31104,22437,18,10065,32005,6168,14733}');

On subscriber: CREATE TABLE t1 (a INTEGER not null, b integer not null, c integer[] not null, PRIMARY KEY (a)); SELECT pglogical.create_subscription(subscription_name := 'subscription1', provider_dsn := 'host=127.0.0.1 port=5432 dbname=mydb');

Everything works fine, the table is copied.

If we now do (on the provider) "UPDATE t1 SET b=2 WHERE a=1;"

this will show up in the subscriber log: 2016-06-01 13:24:06 CEST [6258-2] ERROR: null value in column "c" violates not-null constraint 2016-06-01 13:24:06 CEST [6258-3] DETAIL: Failing row contains (1, 2, null). 2016-06-01 13:24:06 CEST [16095-41] LOG: worker process: pglogical apply 95450:1763399739 (PID 6258) exited with exit code 1

I'm using the latest pglogical from the APT repo: $ dpkg -l | grep pglogical ii postgresql-9.4-pglogical 1.1.0-1trusty amd64 PGLogical plugin for PostgreSQL 9.4 ii postgresql-9.4-pglogical-output 1.1.0-2trusty amd64 PGLogical Output plugin for PostgreSQL 9.4 ii postgresql-9.5-pglogical 1.1.0-1trusty amd64 PGLogical plugin for PostgreSQL 9.5 ii postgresql-9.5-pglogical-output 1.1.0-2trusty amd64 PGLogical Output plugin for PostgreSQL 9.5

PJMODOS commented 8 years ago

Looks like issue with replicating TOASTed values, the e860571cf9b65f27068f3175bd5484f9e6720f05 should fix it. I think this warrants new bugfix release so I will coordinate with rest of the team to get it out soon.

PJMODOS commented 8 years ago

Okay so 1.1.2 was released with fix for this, please update your installation.