long2ice / meilisync

Realtime sync data from MySQL/PostgreSQL/MongoDB to Meilisearch
https://github.com/long2ice/meilisync
Apache License 2.0
287 stars, 43 forks

Postgres `get_full_data` is slow for large datasets #105

Open MattExact opened 6 months ago

MattExact commented 6 months ago

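I found that `get_full_data` was unusably slow for datasets with more than 1 million rows. I believe the culprit is the use of OFFSET/LIMIT pagination, which performs poorly at large offsets: Postgres still has to read and discard every skipped row, so each page takes longer than the one before it.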

I think a better implementation would be to use a server-side cursor and call `fetchmany` to fetch `size` rows at a time. See the Psycopg docs for more.
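A minimal sketch of the idea, not the actual meilisync code. It assumes psycopg2, where passing `name=` to `conn.cursor()` creates a server-side cursor. The names `iter_batches`, `full_data_postgres`, `dsn`, and the cursor name `meilisync_full_sync` are made up for illustration.

```python
from typing import Any, Iterator, List, Sequence


def iter_batches(cursor: Any, size: int) -> Iterator[List[Sequence[Any]]]:
    """Yield lists of up to `size` rows from an already-executed DB-API cursor."""
    while True:
        rows = cursor.fetchmany(size)
        if not rows:
            # An empty result means the cursor is exhausted.
            break
        yield rows


def full_data_postgres(
    dsn: str, table: str, size: int
) -> Iterator[List[Sequence[Any]]]:
    """Hypothetical replacement for offset/limit paging (assumes psycopg2)."""
    # Imported lazily so the helper above works without psycopg2 installed.
    import psycopg2
    from psycopg2 import sql

    with psycopg2.connect(dsn) as conn:
        # A named cursor is server-side in psycopg2: rows stay on the server
        # and are transferred only as fetchmany() asks for them.
        with conn.cursor(name="meilisync_full_sync") as cur:
            cur.execute(
                sql.SQL("SELECT * FROM {}").format(sql.Identifier(table))
            )
            yield from iter_batches(cur, size)
```

The query runs once and each `fetchmany(size)` call continues from where the previous one stopped, so the cost per batch should stay roughly constant instead of growing with the offset.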

long2ice commented 6 months ago

OK, could you please make a PR?