ckan / datapusher

A standalone web service that pushes data files from a CKAN site's resources into its DataStore
GNU Affero General Public License v3.0

Make datapusher uwsgi concurrent #201

Closed: jqnatividad closed this 4 years ago

jqnatividad commented 4 years ago

Fixes #200

By default, uwsgi runs with a single process and a single thread.

This changes it to 6 processes/workers, each with 15 threads, reflecting the same tweak @metaodi made in https://github.com/ckan/datapusher/issues/147#issuecomment-329413982 for the old Apache-based deployment.
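As a rough illustration of what these settings amount to, the sketch below renders a `[uwsgi]` section with the two values mentioned above and works out the resulting upper bound on concurrent requests. `processes` and `threads` are standard uWSGI options; the variable names are made up for the example, and this is not the actual file from the PR (a later comment mentions more conservative values).

```python
import configparser
import io

# Values taken from the PR description above (illustrative only).
settings = {"processes": "6", "threads": "15"}

config = configparser.ConfigParser()
config["uwsgi"] = settings

# Render the section as ini text, the format uWSGI reads.
buffer = io.StringIO()
config.write(buffer)
ini_text = buffer.getvalue()

# Each worker process runs its own threads, so the upper bound on
# requests handled at the same time is processes * threads.
max_concurrent = int(settings["processes"]) * int(settings["threads"])

print(ini_text)
print("max concurrent requests:", max_concurrent)
```

With uWSGI's default of one process and one thread, the same calculation gives a single request at a time.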

Along with using PostgreSQL instead of sqlite (PR #199), this has the added benefit of making datapusher throughput much higher, as it's now fully concurrent!

jqnatividad commented 4 years ago

After extensive testing under heavy load, I added the "lazy-apps" option.

Even though psycopg2 is thread-safe, it's not fully "fork-safe", and I was getting this error:

(psycopg2.OperationalError) SSL error: decryption failed or bad record mac

This happens because the connection was created BEFORE the fork, which is a problem according to the psycopg2 docs:

https://www.psycopg.org/docs/usage.html#thread-safety

The lazy-apps option took care of the problem: the app is loaded in each worker after the fork rather than once in the master before it, so each worker gets its own connection pool.

For more info on 'lazy-apps', see https://uwsgi-docs.readthedocs.io/en/latest/articles/TheArtOfGracefulReloading.html
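The underlying idea (never reuse a connection that was opened in the parent process) can be sketched in plain Python. This is only an illustration, not what datapusher or uWSGI does internally: `PerProcessResource` and its `factory` argument are hypothetical names, and the factory stands in for whatever opens a database connection or pool.

```python
import os
import threading


class PerProcessResource:
    """Lazily create a resource and re-create it when the PID changes.

    Hypothetical helper: a resource opened before a fork is never
    handed out in the child process; the child builds its own.
    """

    def __init__(self, factory):
        self._factory = factory      # callable that opens the resource
        self._lock = threading.Lock()
        self._pid = None             # PID that created the current resource
        self._resource = None

    def get(self):
        pid = os.getpid()
        with self._lock:
            # First use, or we are in a forked child: build a fresh one.
            if self._resource is None or self._pid != pid:
                self._resource = self._factory()
                self._pid = pid
            return self._resource


# Usage: object() stands in for a real connection or pool.
pool = PerProcessResource(factory=object)
first = pool.get()
second = pool.get()   # same process, so the same object is returned
```

lazy-apps gets the same result at the server level, without any such bookkeeping in the application, because the app and everything it opens at import time are created inside each worker.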

I also used the "workers" synonym for "processes" and dialed down the number of workers and threads to more conservative values.

mbocevski commented 4 years ago

@jqnatividad This looks good. As discussed in #199, we should document that a single thread should be used with sqlite, and that this config is meant for other backends such as postgres. One idea would be to make single-threaded the default and leave the multi-thread options commented out, documented in the readme for people who want to enable them when using the postgres backend.

amercader commented 4 years ago

@jqnatividad @mbocevski please check #207, particularly the new section that describes this setup