spacepy / dbprocessing

Automated processing controller for heliophysics data

Use of postgresql instead of sqlite #60

Open dnadeau-lanl opened 3 years ago

dnadeau-lanl commented 3 years ago

My suggested enhancement is ...

Allow users to use PostgreSQL, which is supported by SQLAlchemy.

Relation to an issue

When many machines are running dbprocessing, using PostgreSQL makes it easier to centralize the database on a main server that every process can communicate with.

Proposed enhancement

Use PostgreSQL and allow dbprocessing to access it.

Use/detect the Postgres environment variables to access a Postgres database:

PGPORT=5432
PGUSER=use
PGDATABASE=dbprocessing
PGPASSWORD=*******
PGHOST=localhost
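
As a rough sketch of what "detecting" these variables could look like: the five names above match the connection environment variables documented for PostgreSQL's libpq client library. The helper below is hypothetical (it is not part of dbprocessing); it uses only the standard library to assemble a URL string of the form SQLAlchemy accepts, and it assumes that an unset PGDATABASE means "no Postgres requested".

```python
import os
from urllib.parse import quote_plus


def postgres_url_from_env(env=None):
    """Hypothetical helper: build a postgresql:// URL from PG* variables.

    Returns None when PGDATABASE is not set, so a caller could fall back
    to an SQLite file. The defaults for host and port are assumptions.
    """
    env = os.environ if env is None else env
    database = env.get("PGDATABASE")
    if not database:
        return None
    host = env.get("PGHOST", "localhost")
    port = env.get("PGPORT", "5432")
    user = env.get("PGUSER", "")
    password = env.get("PGPASSWORD", "")
    auth = ""
    if user:
        # Percent-encode so characters such as "@" or ":" do not break the URL.
        auth = quote_plus(user)
        if password:
            auth += ":" + quote_plus(password)
        auth += "@"
    return "postgresql://{}{}:{}/{}".format(auth, host, port, database)
```

The resulting string could then be handed to `sqlalchemy.create_engine()`; whether dbprocessing should read the environment at all is a separate design question.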

Alternatives

Use a different SQLite database on each processing machine. This means that load balancing has to be done manually and each instrument can be processed by only one machine.

Version of dbprocessing

LANL version

Closure condition

dbprocessing can work with PostgreSQL without error.

jtniehof commented 3 years ago

#14 is the first step of this.

Are PGPORT, etc. a standard for Postgres? (Documented anywhere?) As discussed in #14, it would be good to avoid magic environment variables that are specific to us, and better to have a robust set of arguments (both Python-level and command line) that can fully specify the database and make it easy to build a SQLAlchemy URL. That would also hopefully make support for other databases easier down the road.

I also have on my plate (was hoping to start this week but got into firefighting mode) to first cut out any scripts that aren't being used, and then do some consolidation of argument handling across all scripts, so that the code to parse out database-specification command line arguments lives in one place.
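
For illustration only, a single shared place for database arguments might look roughly like the sketch below. The flag names (`--dialect`, `--database`, `--host`, `--port`, `--user`) and both function names are assumptions made up for this example, not existing dbprocessing options. The sketch uses only `argparse` and produces a URL string in SQLAlchemy's format.

```python
import argparse


def add_database_arguments(parser):
    """Attach a (hypothetical) common set of database options to a parser."""
    group = parser.add_argument_group("database")
    group.add_argument("--dialect", default="sqlite",
                       choices=["sqlite", "postgresql"])
    group.add_argument("--database", required=True,
                       help="SQLite file path, or database name on the server")
    group.add_argument("--host", default="localhost")
    group.add_argument("--port", type=int, default=None)
    group.add_argument("--user", default=None)
    return parser


def database_url(args):
    """Turn parsed arguments into a SQLAlchemy-style URL string."""
    if args.dialect == "sqlite":
        # sqlite:/// followed by the file path (absolute paths give four slashes).
        return "sqlite:///" + args.database
    netloc = args.host
    if args.port is not None:
        netloc += ":{}".format(args.port)
    if args.user:
        netloc = args.user + "@" + netloc
    return "{}://{}/{}".format(args.dialect, netloc, args.database)
```

Each script would call `add_database_arguments()` on its own parser and pass the result of `database_url()` to whatever opens the database. Adding another backend would then mostly mean extending `choices`.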