toluaina / pgsync

Postgres to Elasticsearch/OpenSearch sync
https://pgsync.com
MIT License
1.11k stars 174 forks

Bootstrap command breaks when using a newer version of PGSync for an existing PGSync setup #433

Open ZeyadYasser opened 1 year ago

ZeyadYasser commented 1 year ago

PGSync version: 2.5.0

Postgres version:

Elasticsearch version:

Redis version:

Python version: 3.8

Problem Description: The bootstrap command fails when a newer version of PGSync is run against an existing PGSync setup.

The problem was introduced in a newer version of PGSync, specifically in this commit. It forces a check of the columns of public._view, the internal materialized view that PGSync creates.

Older versions of PGSync didn't have the indices column in the materialized view. The indices column was introduced on 13/12/2022, in this commit.

PGSync does not account for existing materialized views that it created without the indices column (i.e. the check is not backward compatible). This is a bug because the check runs and raises the exception (in the Sync object's __init__ function) before the bootstrap process gets a chance to drop and recreate the materialized view with the correct columns.
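To illustrate the ordering problem, here is a minimal sketch of one way it could be avoided: skip the column check when the object is built for bootstrapping, since bootstrap recreates the view anyway. This is not PGSync's actual code. SyncSketch, missing_view_columns, the bootstrapping flag and every column name other than indices are hypothetical names used only for illustration.

```python
from typing import Iterable, List, Set

# Hypothetical required columns; only "indices" is named in this issue.
REQUIRED_COLUMNS: Set[str] = {"primary_keys", "foreign_keys", "indices"}


def missing_view_columns(existing: Iterable[str]) -> Set[str]:
    """Return the required columns that the existing view lacks."""
    return REQUIRED_COLUMNS - set(existing)


class SyncSketch:
    """Toy stand-in for the Sync object; not PGSync's real class."""

    def __init__(
        self, view_columns: Iterable[str], bootstrapping: bool = False
    ) -> None:
        self.view_columns: List[str] = list(view_columns)
        # Skip the column check while bootstrapping, because bootstrap
        # is about to drop and recreate the view anyway.
        if not bootstrapping:
            self.validate()

    def validate(self) -> None:
        """Raise if the view is missing any required column."""
        missing = missing_view_columns(self.view_columns)
        if missing:
            raise RuntimeError(
                "Required materialized view columns not present: "
                f"{sorted(missing)}"
            )

    def bootstrap(self) -> None:
        # Stands in for dropping and recreating the materialized view.
        self.view_columns = sorted(REQUIRED_COLUMNS)
```

With a view created by an older version (no indices column), SyncSketch(old_columns) raises as in the traceback below, while SyncSketch(old_columns, bootstrapping=True).bootstrap() completes and a later validate() call passes.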

Error Message (if any):

PGSync bootstrap error: Traceback (most recent call last):
  File "/usr/local/bin/bootstrap", line 69, in <module>
    main()
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/bin/bootstrap", line 58, in main
    sync: Sync = Sync(
  File "/usr/local/lib/python3.8/site-packages/pgsync/singleton.py", line 17, in __call__
    cls._instances[key] = super(Singleton, cls).__call__(
  File "/usr/local/lib/python3.8/site-packages/pgsync/sync.py", line 99, in __init__
    self.validate(repl_slots=repl_slots)
  File "/usr/local/lib/python3.8/site-packages/pgsync/sync.py", line 178, in validate
    raise RuntimeError(
RuntimeError: Required materialized view columns not present on _view. Please re-run bootstrap.
Unable to start PGSync: undefined

A quick fix for me, and for anyone else facing the same problem, is to pin PGSync to version 2.3.3 (i.e. a version from before the indices column was introduced).

Thank you!

toluaina commented 1 year ago

This has now been fixed in the main branch. Sorry about this.

ZeyadYasser commented 1 year ago

@toluaina Thank you!

bkleef commented 9 months ago

@toluaina We have exactly the same issue. Is it possible to release a new version so we can install it with pip in our Dockerfile?

bkleef commented 9 months ago

In any case, the command below works in the meantime:

pip install https://github.com/toluaina/pgsync/archive/main.zip