go-spatial / tegola

Tegola is a Mapbox Vector Tile server written in Go
http://tegola.io/
MIT License
1.25k stars 193 forks source link

Layer config using information_schema from Postgres #613

Open stdmn opened 5 years ago

stdmn commented 5 years ago

Moving this from #58 as it's a slightly different request:

@ARolek Circling back on this thread, I was wondering if it would be possible to dynamically pull layers from a Database's information_schema for the tegola_lambda build.

For instance, what if we could pass a SQL query to the config under provider, define a common geometry and index field and it would automatically grab all layers from the information_schema for that provider?

My specific use case for this is that I have a Lambda function to take user uploads and push them to an AWS Postgres DB. I would then like to pull those tables back into my app as a Mapbox layer using Tegola-served tiles. I've found that when I try to hot-swap the config.toml files, there is a delay in the app registering this so my thought is that if I can register the layers in realtime, the app will more quickly pick up on changes to the DB.

Here's my thought on what the config would look like:

[[providers]]
name = <NAME>
type = "postgis"
host = <HOST>
port = 5432
database = <DB>
user = <USER>
password = <PW>
srid = 4326
max_connections = 50
layer_sql = "select * from information_schema.tables where table_type = 'BASE TABLE' and table_schema = 'public' and table_name != 'spatial_ref_sys'"
layer_geometry_fieldname = "geom"
layer_id_fieldname = "objectid"

Or even better, the loop could actually find the id and geometry fields from the information_schema:

select column_name from information_schema.columns 
where table_name in (provider.layer_sql)
and udt_name = 'geometry'
select column_name from information_schema.columns 
where table_name in (provider.layer_sql)
and is_identity = 'YES'

Any pointers on which files/function handle the registering of layers so that I can add this?

ARolek commented 5 years ago

@stdmn apologies for the delayed response, I had a family emergency. I'm back now and catching up.

This idea has been discussed a few times and I think it would be incredible to implement. One other relevant issue for reference: Automated way to generate a config.toml.

The initConfig method is a great place to review as the routine we're going to need to implement is similar. I have a few ideas on how to best tackle this (incoming brain dump):

The first step would be establishing a new provider interface that is responsible for returning config data structures. We could maybe call the interface AutoConfiger or Configurator. I go back and forth if the return type should use the toml package since that's what we're using for the config format, or if we should have a more general structure like a map[string]interface{}{} and massage the data into the toml format at a higher level. This would open up support for other config file formats if we ever choose to go that route.

In the case of PostGIS, the next step is getting the connection string from the user and passing it to the provider so the info schema can be inspected like you suggested. We might need to revisit the NewTileProvider routine for the provider or plumb through another method specifically for this task. I lean towards the latter since the NewTileProvider method is already responsible for a lot. I would just want to make sure we're not duplicating initialization code.

The next step is to either instantiate the atlas package with the config data that's been generated or write the data to a file so the user could refine it as they wish.

Now for the last detail. If I understand you correctly, you're looking for this process to run on every request so the layer data is always fresh. I think this could be accomplished with a command-line flag (i.e. --force-refresh-config, bad name, but you get the idea) plumbed through the right places. This would essentially be a "config refresh" on every request. I'm not quite sure what the consequence of this will be but it's worth testing.

This is a rough overview as there are still some implementation details that will need to be figured out. I think this would be a really useful feature and would make getting started with tegola much easier. Note that all table attributes are going to be encoded with the features so more data than necessary will likely be encoded, though this can always be refined by the user.

I'm happy to help with this feature if you would like to take the lead. It might be easier to handle this conversation in our slack channel. Thanks for the great writeup and discussion.

stdmn commented 5 years ago

Sorry to hear that. Hope things are well.

Thanks for the info! Will start digging into this. One thought on --force-refresh-config: I think that the best case scenario would be if it would only force-refresh when the data has changed. This would probably require a timestamp showing last modified and then the browser would check whether the cache is newer or older than the last modified date for a given table.

Looks like we could use Postgres' commit timestamps (post 9.5) to get this info.

ingenieroariel commented 4 years ago

Does trex allow this? It would be good to take a look at their implementation if the license allows it.

On Fri., 19 Jul. 2019, 9:24 am stdmn, notifications@github.com wrote:

Sorry to hear that. Hope things are well.

Thanks for the info! Will start digging into this.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/go-spatial/tegola/issues/613?email_source=notifications&email_token=AAANNVYJQTPRMTA7VXME42DQAHFA7A5CNFSM4IC5PBZ2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD2LY3AI#issuecomment-513248641, or mute the thread https://github.com/notifications/unsubscribe-auth/AAANNV6QV72FPJERLCWGRUDQAHFA7ANCNFSM4IC5PBZQ .

ARolek commented 4 years ago

@ingenieroariel I believe t-rex does but I have not checked their implementation.

@stdmn hmm, it would be ideal if it refreshed only when the data changed but I'm not quite sure how we can pull this off and be provider agnostic. I will need to think on that piece.

ingenieroariel commented 4 years ago

At least for gpkg you can sha256 the file?

For Postgres it may be good to have a PG_READ_ONLY flag that can assume the tegola process will be reloaded if the data changed

-a

On Fri., 19 Jul. 2019, 5:35 pm Alexander Rolek, notifications@github.com wrote:

@ingenieroariel https://github.com/ingenieroariel I believe t-rex does but I have not checked their implementation.

@stdmn https://github.com/stdmn hmm, it would be ideal if it refreshed only when the data changed but I'm not quite sure how we can pull this off and be provider agnostic. I will need to think on that piece.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/go-spatial/tegola/issues/613?email_source=notifications&email_token=AAANNV53SMNIVTNQ4KRI6MDQAI6THA5CNFSM4IC5PBZ2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD2M57IY#issuecomment-513400739, or mute the thread https://github.com/notifications/unsubscribe-auth/AAANNV4H4WTZM5KIF5CF3YDQAI6THANCNFSM4IC5PBZQ .