go-spatial / tegola-osm

Various scripts for importing and running a mirror of OSM with tegola
https://demo.tegola.io
MIT License

Rethink dbcredentials.sh and related data. #25

Closed: erictheise closed this issue 5 years ago

erictheise commented 6 years ago

osm_land.sh inserts data into the tegola-owned database containing the osm.pbf import.

natural_earth.sh creates a database outside of the osm.pbf import data. I'll argue that this data may well be of use to the analyst outside of the tegola context and could or should be owned by a different user. It seems sensible that a PostgreSQL cluster needs only one copy of the Natural Earth data.

Isn't it the same for the osm land polygon data? One copy could serve multiple metro extracts? One copy could be used in contexts other than tegola?

ARolek commented 6 years ago

@erictheise I agree. The dbcredentials.sh file is just a way to pass credentials around to the different scripts, but the end user will probably want to use a different database for each dataset. Maybe we add explicit vars to dbcredentials.sh for each dataset / database pair so the user can decide where the data lives? That opens up a follow-on question: if there are different databases for each dataset, we should probably expect different credentials per database too. Hmmm.
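
For example, dbcredentials.sh could carry an explicit database (and, if needed, credential) setting per dataset; a sketch along these lines, with illustrative variable names rather than the repo's actual ones:

# shared connection settings
DB_HOST="localhost"
DB_PORT="5432"

# per-dataset database / credential pairs
OSM_DB="osm"
OSM_USER="tegola"
OSM_PASSWORD="changeme"

NATURAL_EARTH_DB="natural_earth"
NATURAL_EARTH_USER="postgres"
NATURAL_EARTH_PASSWORD="changeme"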

erictheise commented 6 years ago

Some reference to Twelve-Factor Apps flew by on the Slack channel or in a commit message, @ARolek, and I think that should shape the discussion.

My personal setup will have a separate pg database for each project, owned by the tegola role, with data based on one or more metro extracts; each tegola.conf will live in the project's repo. I'll want one tegola executable, one copy of the Natural Earth Data owned by postgres or other broader role in my cluster, and one copy of the OSM polygon data; at the moment, scripts and docs suggest that there'd be polygon data for each project, which should not be necessary.
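
Concretely, that layout might look something like the following with PostgreSQL's command-line tools (role and database names are only illustrative):

# one role that owns the per-project databases
createuser tegola
# one database per project / metro extract, owned by that role
createdb -O tegola project_sf_bay
createdb -O tegola project_portland
# shared reference data (Natural Earth, OSM land polygons) owned by a broader role
createdb -O postgres natural_earth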

I've tended to use the Foreman approach. It's been ported to Python and Node and a casual search reveals several Go implementations.
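
For what it's worth, the Foreman-style workflow here would amount to a git-ignored .env file holding the local settings, which the tool injects when running a command, e.g. (file name and contents are hypothetical):

# .env (git-ignored), e.g.:
#   OSM_DB=osm
#   NATURAL_EARTH_DB=natural_earth

# run an import script with those variables exported into its environment
foreman run ./osm_land.sh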

ARolek commented 6 years ago

So maybe we just need to update the README. The way the scripts are set up, you can set the database name for each dataset (i.e. osm, natural_earth). The osm land polygons also use the osm database, but that could easily be changed.

So if we want to go with the twelve-factor app approach, all the database credentials would just need to be derived from environment variables; the dbcredentials.sh file would then just set those environment variables. We would need to move the following chunk of code above the credential declarations:

# source the local credentials file if it exists and is readable
if [ -r dbcredentials.sh ]
then
    source dbcredentials.sh
fi
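
With that chunk moved up, the credential declarations that follow it could fall back to the environment for anything dbcredentials.sh doesn't set. A minimal sketch, with illustrative variable names rather than whatever the scripts actually use:

# values already exported by dbcredentials.sh or the shell environment win;
# ${VAR:-default} only fills in what is still unset
DB_HOST="${DB_HOST:-localhost}"
DB_PORT="${DB_PORT:-5432}"
OSM_DB="${OSM_DB:-osm}"
NATURAL_EARTH_DB="${NATURAL_EARTH_DB:-natural_earth}"
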
erictheise commented 6 years ago

Forgot to say that one simpler convention I like is to have example files in the repo, e.g., tegola.conf-example, with placeholders for any local or sensitive data. .gitignore would, out of the box, include the names of the real config files to help prevent sensitive data from slipping into the repo. The README should explain (and the scripts should fail to execute with a message) that people need to copy and rename the -example files, then add locally relevant configs.
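
A minimal sketch of that fail-fast check, assuming a hypothetical dbcredentials.sh-example following the same convention:

# refuse to run until the user has created a real config from the example file
if [ ! -r dbcredentials.sh ]
then
    echo "dbcredentials.sh not found; copy dbcredentials.sh-example and fill in your local settings" >&2
    exit 1
fi
source dbcredentials.sh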

This approach sidesteps environment variables, eliminates the need for dbcredentials.sh, and avoids the overhead of a Foreman clone.

I don't have a strong opinion one way or the other but it'd be great to have a single approach to specifying local info for the config files and scripts.

gdey commented 5 years ago

@erictheise Just going through all the issues... Now that Tegola supports environment variables directly in the config, is this still an issue?
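
If I'm reading that support correctly, the connection details could then come entirely from the environment, e.g. (variable names are illustrative; the config would reference them with the ${VAR} syntax):

# export the credentials the config expects...
export POSTGIS_HOST=localhost
export POSTGIS_PASSWORD=changeme
# ...then reference "${POSTGIS_HOST}" / "${POSTGIS_PASSWORD}" in the provider
# section of the config and start the server
tegola serve --config tegola.conf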

erictheise commented 5 years ago

Oh dear, @gdey, such an old issue. ISTR it had as much to do with the default import procedure creating a complete copy of the OSM polygon & Natural Earth data for every imported OSM extract as it did with specifying credentials, but I've not been keeping up with Tegola's evolution.