ckan / datapusher

A standalone web service that pushes data files from a CKAN site resources into its DataStore
GNU Affero General Public License v3.0
77 stars 154 forks source link

Use PostgreSQL instead of sqlite #199

Closed jqnatividad closed 4 years ago

jqnatividad commented 4 years ago

Implements #198 and fixes #129.

mbocevski commented 4 years ago

@jqnatividad I think that for most users of datapusher, sqlite is good enough and should be default. Considering SQLAlchemy is used, different backend are supported. So I would suggest to just update the documentation about using Postgres as a backend for datapusher and not change the default sqlite configuration.

jqnatividad commented 4 years ago

@mbocevski thanks for your feedback. However, sqlite won't work for a truly asynchronous, concurrent datapusher because you will get a "sqlite Operational Error: Database is Locked error" as it cannot support more than one thread writing to the database.

https://stackoverflow.com/questions/3172929/operationalerror-database-is-locked/3172950

You can make it work by increasing the timeout, but then doing so effectively makes datapusher single-threaded again, even if you configure uwsgi to make it concurrent (https://github.com/ckan/datapusher/pull/201).

mbocevski commented 4 years ago

Yes that is clear, however I'm suggesting that a lot of users might not need datapusher to be concurrent and I'm suggesting to document how to make datapusher concurrent and provide configuration for those that need it to be concurrent. WDYT?

jqnatividad commented 4 years ago

@mbocevski I'll amend the PR so that the user can be given both options and document the implications of each.

BTW, I still think the sqlite db should not be stored in /tmp as you loose the job history between reboots as documented in #129.

mbocevski commented 4 years ago

@jqnatividad Yeah I agree with that. Considering lots of folks would be deploying datapusher on the same box as CKAN. maybe we should store it by default in /var/lib/ckan?

amercader commented 4 years ago

@jqnatividad @mbocevski please check #207, particularly the new section that describes this setup