Closed jqnatividad closed 4 years ago
@jqnatividad I think that for most users of datapusher, sqlite is good enough and should be default. Considering SQLAlchemy is used, different backend are supported. So I would suggest to just update the documentation about using Postgres as a backend for datapusher and not change the default sqlite configuration.
@mbocevski thanks for your feedback. However, sqlite won't work for a truly asynchronous, concurrent datapusher because you will get a "sqlite Operational Error: Database is Locked error" as it cannot support more than one thread writing to the database.
https://stackoverflow.com/questions/3172929/operationalerror-database-is-locked/3172950
You can make it work by increasing the timeout, but then doing so effectively makes datapusher single-threaded again, even if you configure uwsgi to make it concurrent (https://github.com/ckan/datapusher/pull/201).
Yes that is clear, however I'm suggesting that a lot of users might not need datapusher to be concurrent and I'm suggesting to document how to make datapusher concurrent and provide configuration for those that need it to be concurrent. WDYT?
@mbocevski I'll amend the PR so that the user can be given both options and document the implications of each.
BTW, I still think the sqlite db should not be stored in /tmp
as you loose the job history between reboots as documented in #129.
@jqnatividad Yeah I agree with that. Considering lots of folks would be deploying datapusher on the same box as CKAN. maybe we should store it by default in /var/lib/ckan
?
@jqnatividad @mbocevski please check #207, particularly the new section that describes this setup
Implements #198 and fixes #129.