toluaina / pgsync

Postgres to Elasticsearch/OpenSearch sync
https://pgsync.com
MIT License

"TypeError: _process_bulk_chunk() got multiple values for argument 'raise_on_exception'" on dockerized PGSync initial execution #506

Open rjmAmaro opened 7 months ago

rjmAmaro commented 7 months ago

PGSync version: latest

Postgres version: (dockerized) 14

OpenSearch version: (dockerized) 2.11

Redis version: (dockerized) alpine:latest (7.2, probably)

Python version: 3.8

Problem Description:

PGSync started throwing this error on startup when run in its Docker container:

TypeError: _process_bulk_chunk() got multiple values for argument 'raise_on_exception'

After this, no data is indexed and the container stops.

It was working correctly previously.

I also tried to use a simple schema, with a single index for a single table, mapping just one column. The error persisted.

PGSync is Dockerized with the suggested Dockerfile, using the base image python:3.8. I also tried with the base images python:3 and python:3.9. The error persisted.

Error Message (if any):

pgsync-1  | 2023-11-23 17:16:03.543:ERROR:pgsync.search_client: Exception _process_bulk_chunk() got multiple values for argument 'raise_on_exception'
pgsync-1  | Traceback (most recent call last):
pgsync-1  |   File "/usr/local/lib/python3.12/site-packages/pgsync/search_client.py", line 133, in bulk
pgsync-1  |     self._bulk(
pgsync-1  |   File "/usr/local/lib/python3.12/site-packages/pgsync/search_client.py", line 188, in _bulk
pgsync-1  |     for _ in self.parallel_bulk(
pgsync-1  |   File "/usr/local/lib/python3.12/site-packages/opensearchpy/helpers/actions.py", line 485, in parallel_bulk
pgsync-1  |     for result in pool.imap(
pgsync-1  |   File "/usr/local/lib/python3.12/multiprocessing/pool.py", line 873, in next
pgsync-1  |     raise value
pgsync-1  |   File "/usr/local/lib/python3.12/multiprocessing/pool.py", line 125, in worker
pgsync-1  |     result = (True, func(*args, **kwds))
pgsync-1  |                     ^^^^^^^^^^^^^^^^^^^
pgsync-1  |   File "/usr/local/lib/python3.12/site-packages/opensearchpy/helpers/actions.py", line 487, in <lambda>
pgsync-1  |     _process_bulk_chunk(
pgsync-1  | TypeError: _process_bulk_chunk() got multiple values for argument 'raise_on_exception'
pgsync-1  | Traceback (most recent call last):
pgsync-1  |   File "/usr/local/bin/pgsync", line 7, in <module>
pgsync-1  |  0:00:01.970697 (1.97 sec)
pgsync-1  |     sync.main()
pgsync-1  |   File "/usr/local/lib/python3.12/site-packages/click/core.py", line 1157, in __call__
pgsync-1  |     return self.main(*args, **kwargs)
pgsync-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^
pgsync-1  |   File "/usr/local/lib/python3.12/site-packages/click/core.py", line 1078, in main
pgsync-1  |     rv = self.invoke(ctx)
pgsync-1  |          ^^^^^^^^^^^^^^^^
pgsync-1  |   File "/usr/local/lib/python3.12/site-packages/click/core.py", line 1434, in invoke
pgsync-1  |     return ctx.invoke(self.callback, **ctx.params)
pgsync-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pgsync-1  |   File "/usr/local/lib/python3.12/site-packages/click/core.py", line 783, in invoke
pgsync-1  |     return __callback(*args, **kwargs)
pgsync-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
pgsync-1  |   File "/usr/local/lib/python3.12/site-packages/pgsync/sync.py", line 1449, in main
pgsync-1  |     sync.pull()
pgsync-1  |   File "/usr/local/lib/python3.12/site-packages/pgsync/sync.py", line 1224, in pull
pgsync-1  |     self.search_client.bulk(
pgsync-1  |   File "/usr/local/lib/python3.12/site-packages/pgsync/search_client.py", line 133, in bulk
pgsync-1  |     self._bulk(
pgsync-1  |   File "/usr/local/lib/python3.12/site-packages/pgsync/search_client.py", line 188, in _bulk
pgsync-1  |     for _ in self.parallel_bulk(
pgsync-1  |   File "/usr/local/lib/python3.12/site-packages/opensearchpy/helpers/actions.py", line 485, in parallel_bulk
pgsync-1  |     for result in pool.imap(
pgsync-1  |   File "/usr/local/lib/python3.12/multiprocessing/pool.py", line 873, in next
pgsync-1  |     raise value
pgsync-1  |   File "/usr/local/lib/python3.12/multiprocessing/pool.py", line 125, in worker
pgsync-1  |     result = (True, func(*args, **kwds))
pgsync-1  |                     ^^^^^^^^^^^^^^^^^^^
pgsync-1  |   File "/usr/local/lib/python3.12/site-packages/opensearchpy/helpers/actions.py", line 487, in <lambda>
pgsync-1  |     _process_bulk_chunk(
pgsync-1  | TypeError: _process_bulk_chunk() got multiple values for argument 'raise_on_exception'
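
For readers unfamiliar with this class of error: Python raises it when one parameter is bound twice, once positionally and once by keyword. The sketch below reproduces the same message with a hypothetical stand-in function. `process_chunk` and its signature are assumptions made for illustration, not the real `_process_bulk_chunk` from opensearch-py, and the thread does not confirm which call site produces the clash.

```python
# Hypothetical stand-in; NOT the real opensearch-py function or signature.
def process_chunk(client, actions, raise_on_exception=True, raise_on_error=True):
    return (client, actions, raise_on_exception, raise_on_error)


def call_with_clash():
    # The third positional argument already binds raise_on_exception,
    # so supplying it again by keyword makes the call fail.
    extra_kwargs = {"raise_on_exception": False}
    return process_chunk("client", ["action"], True, **extra_kwargs)


def describe_clash():
    """Return the TypeError message from the clashing call, or None."""
    try:
        call_with_clash()
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # The message ends with: got multiple values for argument 'raise_on_exception'
    print(describe_clash())
```

A mismatch of this kind typically appears when a caller and a callee come from versions that disagree on a function's parameter list, which would be consistent with the reports in this thread that pinning an earlier commit avoids the error.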

rjmAmaro commented 7 months ago

Changing the target commit to a previous one solves the issue:

(@ Dockerfile)

RUN pip install git+https://github.com/toluaina/pgsync.git@95116702c4b314d8b97696ef857cfe116241e236

asturm-fe commented 6 months ago

Having the same issue - is there a fix coming soon?

toluaina commented 6 months ago

Can you please send me your docker-compose.yaml? In the meantime, I'll try to reproduce this.

toluaina commented 6 months ago

Are you by any chance installing pgsync from PyPI in your Dockerfile?

nowfred commented 6 months ago

are you by any chance installing pgsync from pypi in your Dockerfile?

I experienced the same error as above. I installed from PyPI rather than using the Dockerfile approach. Pinning to the specific git commit solved the issue.

asturm-fe commented 6 months ago

I'm currently running pgsync outside of Docker as I'm just testing things out. I installed it via pip (Python 3.x) and had to remove it and reinstall with the specific commit hash provided by @rjmAmaro to get rid of the error.
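
Since the reports here differ by install method (PyPI versus a pinned git commit), it may help to record exactly which package versions are present when comparing a working environment with a failing one. The following is a minimal sketch using only the standard library. The helper name `installed_versions` is hypothetical, and the default list of distributions to check is an assumption; the thread does not say which version combination triggers the error.

```python
from importlib import metadata


def installed_versions(packages=("pgsync", "opensearch-py", "elasticsearch")):
    """Map each distribution name to its installed version, or None if absent."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this inside the container (or virtualenv) before and after reinstalling gives a concrete version list to attach to the issue.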

rjmAmaro commented 6 months ago

Not that it helps much, but I'm currently able to run the latest version with the Dockerfile, using python:3.8. No config changes that I'm aware of.