pglogical.synchronize_sequence syncs incorrect value to subscriber (source value + 1000)

dba-brewer commented 6 years ago

I am trying to use pglogical to migrate a Postgres 9.5 cluster running on FC24 to a new server running Postgres 9.6 on CentOS 7. It seems to be working “ok”, but there is an odd thing happening where the destination sequence’s last_value is the source’s last_value+1000. (e.g. source has last_value of 3, destination has last_value of 1003.)

Searching online I found someone else running into this issue ( https://groups.google.com/a/2ndquadrant.com/forum/#!topic/bdr-list/7y4zY09bLBA) and a developer at 2nd Quadrant asked for an issue to be opened in github, so here I am.

My process thus far is:

Source/Provider System

-bash-4.3$ dnf list installed | grep postg postgresql.x86_64 9.5.4-1.fc24 @updates postgresql-contrib.x86_64 9.5.4-1.fc24 @updates postgresql-libs.x86_64 9.5.4-1.fc24 @updates postgresql-pgpool-II.x86_64 3.5.2-1.fc24 @@commandline postgresql-pgpool-II-extensions.x86_64 postgresql-pgpoolAdmin.noarch 3.5.3-3.fc24 @@commandline postgresql-server.x86_64 9.5.4-1.fc24 @updates postgresql-upgrade.x86_64 9.5.4-1.fc24 @updates

Ran initdb to create the cluster and made the required changes to the postgresql.conf file:

shared_preload_libraries = 'pglogical'
wal_level = logical
max_wal_senders = 10
wal_keep_segments = 64
max_worker_processes = 10
max_replication_slots = 10

Add entries for the destination IP for both local and replication connections to the pg_hba.conf file on source server:

host    test        postgres        x.x.x.x/32        trust

host     replication    postgres        x.x.x.x/32        trust

Restart the postgres cluster on the source and destination servers.

Create the pglogical extension in the template1 database (so pglogical extension will automatically be created in all subsequent databases created):

CREATE EXTENSION pglogical;

On the source server, create the provider node. For example:

select pglogical.create_node(
   node_name := 'test_provider',
   dsn := 'host=host1 port=5432 dbname=test'
);

All tables being migrated are in the public schema, so add them to the default replication set:

SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);

Add all sequences to the default replication set as well:

select pglogical.replication_set_add_all_sequences('default', ARRAY['public'], true);

Destination/Subscriber System

-bash-4.2$ yum list installed | grep postg postgresql.x86_64 9.2.18-1.el7 @base postgresql-contrib.x86_64 9.2.18-1.el7 @base postgresql-libs.x86_64 9.2.18-1.el7 @base postgresql95.x86_64 9.5.10-1PGDG.rhel7 @pgdg95 postgresql95-contrib.x86_64 9.5.10-1PGDG.rhel7 @pgdg95 postgresql95-libs.x86_64 9.5.10-1PGDG.rhel7 @pgdg95 postgresql95-pglogical.x86_64 2.1.1-1.el7 @pglogical postgresql95-server.x86_64 9.5.10-1PGDG.rhel7 @pgdg95

Ran initdb to create the cluster and made the required changes to the postgresql.conf file:

shared_preload_libraries = 'pglogical'
wal_level = logical
max_wal_senders = 10
wal_keep_segments = 64
max_worker_processes = 10
max_replication_slots = 10

Add an entry for the destination IP (i.e. its own IP) to the pg_hba.conf file on destination server:

host    test        postgres        x.x.x.x/32        trust

Restart the postgres cluster on the source and destination servers. Create the pglogical extension in the template1 database:

CREATE EXTENSION pglogical;

On the destination server, create the database you are migrating:

CREATE DATABASE test;

Create the subscriber node:

select pglogical.create_node(
   node_name := 'test_subscriber',
   dsn := 'host=host2 port=5432 dbname=test'
);

Create the subscription:

SELECT pglogical.create_subscription(
   subscription_name := 'test_subscription',
   provider_dsn := 'host=host1 port=5432 dbname=test',
   synchronize_structure := true
);

After executing this last step (create_subscription), the test database on the source server is replicated to the destination server. Mostly. If I look at the objects, the tables, views, indexes, sequences are there. However, users are not. This is expected since users are created globally and pglogical operates on individual databases. However, looking at the sequences, I see they exist, but their last_value differs from source to destination. For example:

Source test=# select * from users_id_seq; sequence_name | last_value | start_value | increment_by | max_value | min_value | cache_value | log_cnt | is_cycled | is_called ---------------+------------+-------------+--------------+---------------------+-----------+-------------+---------+-----------+----------- users_id_seq | 3 | 1 | 1 | 9223372036854775807 | 1 | 1 | 32 | f | t (1 row)

Destination

test=# select * from users_id_seq; sequence_name | last_value | start_value | increment_by | max_value | min_value | cache_value | log_cnt | is_cycled | is_called ---------------+------------+-------------+--------------+---------------------+-----------+-------------+---------+-----------+----------- users_id_seq | 1 | 1 | 1 | 9223372036854775807 | 1 | 1 | 0 | f | t (1 row)

This happened even though I ran the replication_set_add_all_sequences on the source prior to creating the subscription. Also note, I set synchronize_data to true so the sequence values should have been synchronized immediately.

Ok, so try to manually synchronize them again. On the provider node, I ran the following:

SELECT pglogical.synchronize_sequence('users_id_seq');

Now when I check the last_value:

test=# select * from users_id_seq; sequence_name | last_value | start_value | increment_by | max_value | min_value | cache_value | log_cnt | is_cycled | is_called ---------------+------------+-------------+--------------+---------------------+-----------+-------------+---------+-----------+----------- users_id_seq | 1003 | 1 | 1 | 9223372036854775807 | 1 | 1 | 0 | f | t (1 row)

For some reason, it syncs the source value and then adds 1000 to it. I tried this on numerous sequences and the same thing happened. The destination’s last_value is set to source’s last_value+1000.

dba-brewer commented 6 years ago

If anyone else runs into this and needs to mass-resync the sequences, I ran the following to get the sync statements:

select 'SELECT pglogical.synchronize_sequence(''' || relname || ''');' as sql 
from pg_catalog.pg_statio_user_sequences 
order by sql;

Granted, the last_values will all be larger than the original by 1000, but I guess it's better than being smaller by 1000. Nobody wants to angry up a PK constraint, least of all on a Friday afternoon. :)

huayuanzzb commented 5 years ago

I am testing PG upgrade from 9.5 to 10.6 using pglogical 2.2.0 and running into the same issue.

saurabhnanda commented 5 years ago

Ran into the same issue. Is the README trying to say that this is expected behaviour and has been done deliberately?

The state of sequences added to replication sets is replicated periodically and not in real-time. Dynamic buffer is used for the value being replicated so that the subscribers actually receive future state of the sequence. This minimizes the chance of subscriber's notion of sequence's last_value falling behind but does not completely eliminate the possibility.

lknaubkonrad commented 4 years ago

I'm encountering this as well. Renders the replicate sequences useless.

PJMODOS commented 4 years ago

Ran into the same issue. Is the README trying to say that this is expected behaviour and has been done deliberately?

The state of sequences added to replication sets is replicated periodically and not in real-time. Dynamic buffer is used for the value being replicated so that the subscribers actually receive future state of the sequence. This minimizes the chance of subscriber's notion of sequence's last_value falling behind but does not completely eliminate the possibility.

Yes this is by design. Same like with physical replication (there you get value +32 though so it's harder to spot).

lknaubkonrad commented 4 years ago

The tables that are being replicated are slowly changing, low volume. The sequence's increment and cache are set to 1. After inserting 5 records between 2 nodes, the sequence numbers are jumping from 1 to 1003 to 10003 to 151865. I can understand there may be a buffer but reaching a sequence number of 151,865 within inserting 4 or 5 records seems irregular and makes it difficult estimate whether an adequate number of sequences will be available. When the increments are so random how does one estimate that you won't run out of sequence numbers?

jjb commented 4 years ago

@dba-brewer thanks for your snippet - why would it add 1000?

jjb commented 4 years ago

@dba-brewer this code sync all sequences:

select pglogical.synchronize_sequence( seqoid ) from pglogical.sequence_state;

lknaubkonrad commented 4 years ago

I have been using 'select pglogical.synchronize_sequence(<'name of sequence'>);. Synching the sequences has not been an issue but rather the values being generated. I can insert records on both Masters at 4pm one day, the sequences sync and the next morning when I check the sequence last value it has jumped to a value of 1million and there's been no CRUD activity since 4pm the prior day.

jstaf commented 4 years ago

I did some digging and it looks like the synchronize_sequence bug is by design. It would be nice if we could change this.

When I look at the sequence_state table, I can see that cache_size is 1000 for every sequence:

SELECT * FROM pglogical.sequence_state;

  seqoid  | cache_size | last_value 
----------+------------+------------
 14860330 |       1000 |      32695
  5995883 |       1000 |      30310
 14860084 |       1000 |       6401
   168877 |       1000 |     407736

I dug through what's happening in the source code, and it adds the value of cache_size to whatever the last_value was when synchronizing sequences. https://github.com/2ndQuadrant/pglogical/blob/19d6f4f93b0985bb167f00953689f0ed489647fb/pglogical_sequences.c#L233

What's more, it looks like the cache_size is set as a constant here: https://github.com/2ndQuadrant/pglogical/blob/19d6f4f93b0985bb167f00953689f0ed489647fb/pglogical_sequences.c#L41

lknaubkonrad commented 4 years ago

I agree with your finding above. When a record can be inserted with a sequence number of 1 at 4pm on Day 1 and the next morning the last sequence value is 1000000 (there's been no transactions since 4pm the prior day) would seem to make it difficult to guarantee one will not run out of sequence numbers.

jstaf commented 4 years ago

It might be worth seeing what happens when you manually change cache_size to 0 in the pglogical.sequence_state table (as far as I know that might theoretically fix things), but I wasn't quite brave enough to try that out in production.

pratmeht commented 4 years ago

Hi, after reading all the comments above I see that I am not the only one experiencing such weird behavior of Sequences. Good diagnosis "@jstaf". Any recommendation for fix ? or shall I try above suggestion of setting cache_size to 0 ? Has anybody tried ? or SEQUENCE replication continues to misbehave on pglogical ?

rootwyrm commented 3 years ago

On 2.3.3 with Pg 13, I'm running into a more severe form of this problem. cache_size is not 1000, it's 1000000. This means an int can run out in short order. This is what the sequences look after just 20 minutes, with NO activity on the database:

demo4=# SELECT * FROM pglogical.sequence_state;
 seqoid | cache_size | last_value
--------+------------+------------
  16605 |    1000000 |  556045002
  16631 |    1000000 |  556045001
  16659 |    1000000 |  556045001
  16672 |    1000000 |  556045001
(4 rows)

Since this is throwaway, I altered the cache_size to 1. This does not actually fix it; pglogical is internally changing it and forcing it to creep back upwards.

demo4=# SELECT * FROM pglogical.sequence_state ;
 seqoid | cache_size | last_value
--------+------------+------------
  16605 |          1 |  600045002
  16631 |          1 |  600045001
  16659 |          1 |  600045001
  16672 |          1 |  600045001
(4 rows)
--
-- Wait about 5 minutes
-- 
demo4=# SELECT * FROM pglogical.sequence_state ;
 seqoid | cache_size | last_value
--------+------------+------------
  16605 |          4 |  602045008
  16631 |          4 |  602045007
  16659 |          4 |  602045007
  16672 |          4 |  602045007
(4 rows)
--
-- Wait another 5 minutes...
--
demo4=# SELECT * FROM pglogical.sequence_state ;
 seqoid | cache_size | last_value
--------+------------+------------
  16605 |         16 |  604045032
  16631 |         16 |  604045031
  16659 |         16 |  604045031
  16672 |         16 |  604045031
(4 rows)

This seems... not right. @PJMODOS is the automagic altering of cache_size upward intended behavior? This sequence behavior causes some frankly very nasty regressions for migrating from PUB/SUB.

lpossamai commented 1 year ago

I'm facing the same issue.

The sequences last_value are not replicated. But when I run synchronize_sequence, I get the correct value from source.

By using replication_set_add_all_sequences with synchronize_data = true, shouldn't it automatically sync that for me?

I see @rootwyrm 's comment on the cache_size and I agree this should be a feature request.

lpossamai commented 1 year ago

EnterpriseDB mention that the sequences are not replicated in real-time here.

2ndQuadrant / pglogical

pglogical.synchronize_sequence syncs incorrect value to subscriber (source value + 1000) #163