maplibre / martin

Blazing fast and lightweight PostGIS, MBTiles and PMTiles tile server, tile generation, and MBTiles tooling.
https://martin.maplibre.org
Apache License 2.0

WATCH_MODE info #288

Open iferencik opened 2 years ago

iferencik commented 2 years ago

Hello there

I'd like to know if setting WATCH_MODE=true makes Martin reload the config file. I am publishing many PostGIS layers with even more properties/columns, and in order to limit these I need to use a configuration file. I am thinking of creating a Python script that will generate the config YAML, and it would be really nice if Martin could re-read it in WATCH_MODE.

Thank you for this great piece of software

sharkAndshark commented 2 years ago

If this is on the roadmap, can I try this?

stepankuzmin commented 2 years ago

Hi @sharkAndshark,

Sure!

sharkAndshark commented 2 years ago

Based on the discussion in https://github.com/maplibre/martin/issues/349, there will be no watch mode? @stepankuzmin

nyurik commented 2 years ago

@sharkAndshark I think we may need to rethink the watch mode, and possibly add it back before the next release.

That said, I really feel like the more substantial changes should be done with Kubernetes and Helm deployments; the watch mode is a development hack that makes infrastructure too fragile. A better setup would be:

sharkAndshark commented 2 years ago

@nyurik I will give it a shot and then we can discuss more

nyurik commented 2 years ago

Sounds good! @stepankuzmin any thoughts?

stepankuzmin commented 2 years ago

@sharkAndshark sure, that would be great 👍

pascal-codetaal commented 2 years ago

@nyurik We are currently using the latest version of Martin, and there are some real benefits to this version. The only remaining downside of the new version is the removal of watch mode, or any option to reload the index. We are running on a Kubernetes cluster and reload the index by creating a new pod and killing the old one, but that process can take up to 50 seconds, which is how long users have to wait before they can see the updated data. So is there a possibility of adding another way to refresh the data? Perhaps we can try to help with implementing this feature?

nyurik commented 2 years ago

@pascal-codetaal I'm beginning to think we really do need to reimplement it, and some help would be great. So let's have a dedicated POST /reload endpoint.

My only concern is that we would want to avoid a global lock on each request. I'm not yet certain how to do that best.
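One common way to avoid a per-request global lock is the snapshot-swap pattern (a sketch for discussion, not Martin's actual design; all names below are illustrative): the catalog lives in an immutable snapshot behind an `Arc`, handlers clone the `Arc` under a lock held only for the pointer copy and then read lock-free, and a reload builds a fresh snapshot off to the side before swapping it in.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Stand-in for Martin's source catalog (hypothetical, for illustration).
type Sources = HashMap<String, String>;

// Holds the current immutable snapshot. Crates like arc-swap do this
// without any lock at all; a Mutex held only for the Arc copy is close.
struct Snapshot {
    current: Mutex<Arc<Sources>>,
}

impl Snapshot {
    fn new(sources: Sources) -> Self {
        Self {
            current: Mutex::new(Arc::new(sources)),
        }
    }

    // Handlers call this once per request: the lock is held only long
    // enough to clone the Arc, never while the tile is being served.
    fn load(&self) -> Arc<Sources> {
        self.current.lock().unwrap().clone()
    }

    // A POST /reload would call this after rebuilding the catalog;
    // in-flight requests keep using the old snapshot they already hold.
    fn swap(&self, fresh: Sources) {
        *self.current.lock().unwrap() = Arc::new(fresh);
    }
}
```

The key property is that the expensive rebuild and the request handling never contend: the only shared critical section is copying one pointer.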

iferencik commented 2 years ago

> @nyurik We are currently using the latest version of martin, and there are some real benefits in using this version. But the only thing that is still a downside in the new version, is the remove of the watch or the option to reload the index. We are running on a Kubernetes cluster and we reload the index by creating a new pod, and killing the other one. But this process can take up to 50 seconds or so, that's the time the user has to wait before, before being able to see the updated data. So is there a possibility to add another way to refresh the data? Perhaps we can try to help with implementing this feature?

@pascal-codetaal if you do a

    kubectl rollout restart deployment <depl_name> 

there is no downtime. kube will ensure the old pod(s) is/are working while the new one gets deployed.

Just make sure you have enough capacity on the node to spin a new deployment.

Also, you should ideally have more(>1) replicas.

I have the same issue and, being under time pressure, I resorted to something else: I created a shell [script](https://github.com/UNDP-Data/martin-docker/blob/main/reactive_entrypoint.sh) that checks the local config file against a remote one located at a web-accessible URL (Azure file share).

Env vars specify how often the config is pulled and the URL it is pulled from.

This might not be very elegant, but it works like a charm, especially because I use a tool to create (and optionally upload to Azure) the config files. We actually update the layers on a daily, sometimes hourly, basis. In the future I will poll the remote config once a day.

The tool has a quirk: it creates configs only for tables that have "publish=True" in the table description. The same is valid for columns; if you mark a column with publish=False, the tool will skip it. This is not a very good idea, and I think I will scrap it in favor of a more robust technique, like revoking read rights for the user under which Martin connects to the database.

pascal-codetaal commented 2 years ago

We don't have downtime; it just can take up to 50 seconds before the new pod is up and running... We let users upload data to our DB, and they don't like waiting an extra 50 seconds before they can see the updated data. I'm going to check your script, thx

saosebastiao commented 1 year ago

I would love to be able to reload the schema config without a full restart. The PostgREST project has two cool ways to do this:

https://postgrest.org/en/stable/schema_cache.html#schema-cache-reloading

The second method in particular is interesting because it allows the DB admin to set up custom triggers that would send notifications to the custom channel any time a geospatial table changes.
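That notification-driven design could look roughly like the following Rust sketch. A real implementation would read notifications from the Postgres connection itself (LISTEN/NOTIFY); here a plain std channel stands in for the notification stream, and all names are hypothetical:

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::sync::{Arc, RwLock};

// Stand-in for the table catalog Martin builds at startup (illustrative).
type Catalog = HashMap<String, String>;

// Hypothetical reload loop: each message on `notifications` (standing in
// for a Postgres NOTIFY on a dedicated channel, e.g. fired by a trigger
// when a geospatial table changes) triggers a catalog rebuild, which is
// then swapped in behind the shared lock.
fn reload_on_notify(
    notifications: mpsc::Receiver<String>,
    shared: Arc<RwLock<Catalog>>,
    rebuild: impl Fn() -> Catalog,
) {
    for _payload in notifications {
        // The expensive work (re-querying the database) happens
        // entirely outside the lock...
        let fresh = rebuild();
        // ...so the write lock is held only for the cheap replacement.
        *shared.write().unwrap() = fresh;
    }
}
```

Request handlers would keep serving from the old catalog until the swap, so a slow rebuild never blocks tile requests.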

nyurik commented 1 year ago

There are currently two issues I see (I'm sure there are more):

tordans commented 9 months ago

We just ran into this issue after switching from pg_tileserv to Martin. With pg_tileserv, all new tables were discovered right away. Now we have to restart our Martin Docker container to get an updated table list.

Our Docker setup https://github.com/FixMyBerlin/atlas-geo/blob/main/docker-compose.yml starts our backend server with db, processing, and tiles. It looks like we don't have a good way to restart the tiles container after the app image has closed and finished processing our data…

Is there some API like <martinserver.com>/refresh that we could ping to notify Martin to refresh tables?

nyurik commented 9 months ago

@tordans we all want that API to happen. I plan to work on it in the future, but there is no certainty on the timeframe; this is mostly a personal project, so if anyone has time to work on it, I will be happy to guide and assist that effort.

pascal-codetaal commented 9 months ago

Hello @nyurik and the Martin contributors,

We've been closely following the discussion about the need for a watch mode or similar feature to enable dynamic config reloading in Martin. As we understand the significance of this feature for users, and us, who need to see updated data without substantial downtime, we're keen on contributing to its development.

Though we're not currently well-versed in Rust, we're ready to put in the necessary time to become proficient and contribute meaningfully to this project. We're particularly interested in understanding the preferred approach.

Could you provide us with some guidance on where you believe we should start, or if there are any particular areas of the codebase or documentation that we should familiarize ourselves with first?

We're looking forward to collaborating with the community on this feature. Thank you for considering our offer to help.

nyurik commented 9 months ago

@pascal-codetaal I will be happy to help you.

The very first thing I think we should evaluate is the performance aspect outlined in https://github.com/maplibre/martin/issues/288#issuecomment-1494739125 -- create a tiny Actix app using one of the examples they have as a starting point. Inside that app, create two endpoints, e.g. /lock and /nolock to see the impact of the global lock. You can use the oha tool (used in the justfile) to compare the performance of both endpoints.

This should get you familiarized with the Rust ecosystem, making it much easier to make changes to a bigger project.

use std::collections::HashMap;
use std::sync::{Arc, Mutex};

use actix_web::get;
use actix_web::web::Data;

#[derive(Clone)]
struct State {
    pub values: HashMap<String, String>,
    pub values_mutex: Arc<Mutex<HashMap<String, String>>>,
}

impl State {
    pub fn new() -> Self {
        let values = HashMap::from([(String::from("key"), String::from("value"))]);
        Self {
            values_mutex: Arc::new(Mutex::new(values.clone())),
            values,
        }
    }
}

// Every request takes the global lock before reading.
#[get("/lock")]
async fn get_lock(state: Data<State>) -> String {
    let vals = state.values_mutex.lock().unwrap();
    vals.get("key").unwrap().clone()
}

// Lock-free read from the shared state.
#[get("/nolock")]
async fn get_nolock(state: Data<State>) -> String {
    state.values.get("key").unwrap().clone()
}

pascal-codetaal commented 9 months ago

@nyurik the development will be done by @mesudBe-Orbit. He will start with the development of the separate Actix application and will push it to a public repository.

mesudBe-Orbit commented 9 months ago

@nyurik @pascal-codetaal the development is complete, and you can access the tiny application and view the results by following this link: https://github.com/mesudBe-Orbit/actix-lock-stress-tester?tab=readme-ov-file#test-results. In short, under these conditions, there seems to be no significant difference.

nyurik commented 9 months ago

Awesome news, thanks! So this should make it easier.

The current config system process:

The resolve method creates a cache and an IdResolver, and uses resolve_tile_sources to process all tile sources in parallel.

https://github.com/maplibre/martin/blob/a864b9981c9b8cfeaf3b4a90d40ec2635674bd59/martin/src/config.rs#L115

IdResolver maps "unique full source IDs" (like database.namespace.tablename or even longer) to some short name the user may want, like tablename. If the short name already exists, it will append numbers to it, e.g. tablename.1, but if the same full unique name is passed to IdResolver, it will re-return the same short name as before - so my point above about stable resolution is mostly done - same table will get the same short name.
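The suffixing behavior described above can be sketched in a few lines. This is a hypothetical re-implementation for illustration, not Martin's actual IdResolver code:

```rust
use std::collections::HashMap;

// Illustrative sketch of the IdResolver behavior described in the
// comment above: stable mapping from full source IDs to short names.
struct IdResolver {
    // short name -> full ID that claimed it
    names: HashMap<String, String>,
    // full ID -> assigned short name (same input always re-returns
    // the same output, which is what makes resolution stable)
    resolved: HashMap<String, String>,
}

impl IdResolver {
    fn new() -> Self {
        Self {
            names: HashMap::new(),
            resolved: HashMap::new(),
        }
    }

    // Resolve a full ID like "database.namespace.tablename" to the
    // wanted short name, appending ".1", ".2", ... on collision.
    fn resolve(&mut self, full_id: &str, wanted: &str) -> String {
        if let Some(existing) = self.resolved.get(full_id) {
            return existing.clone(); // stable: re-return prior answer
        }
        let mut candidate = wanted.to_string();
        let mut n = 0;
        while self.names.contains_key(&candidate) {
            n += 1;
            candidate = format!("{wanted}.{n}");
        }
        self.names.insert(candidate.clone(), full_id.to_string());
        self.resolved.insert(full_id.to_string(), candidate.clone());
        candidate
    }
}
```

With this, two tables both wanting the short name "tablename" resolve to "tablename" and "tablename.1", and re-resolving either full ID returns the same short name as before.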

PostgreSQL resolution is much more involved than pmtiles/mbtiles. It runs these steps in parallel for all configured postgres connections:

https://github.com/maplibre/martin/blob/a864b9981c9b8cfeaf3b4a90d40ec2635674bd59/martin/src/pg/config.rs#L116

So one path to implement watch mode:

nyurik commented 9 months ago

@mesudBe-Orbit @pascal-codetaal

> the development is complete, and you can access the tiny application and view the results by following this link: https://github.com/mesudBe-Orbit/actix-lock-stress-tester?tab=readme-ov-file#test-results In short, under these conditions, it seems there is no significant difference.

Please run your code with cargo run --release, and you should probably also increase the number of requests, e.g. aim for about a 5-minute run (?)

pascal-codetaal commented 9 months ago

@nyurik the 5-minute run had the same result for both endpoints. We have started implementing the change to the endpoint functions:

> Change all endpoint functions to accept sources: Data<RwLock> instead of Data, and get a read lock on all of them. Note that RwLock might not be good for our use case, but it should get us started.
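In plain std terms (the actix Data wrapper and Martin's real source types are elided; names are illustrative), that suggestion amounts to handlers taking a shared read lock while the reload path holds the exclusive write lock only long enough to swap in a catalog that was rebuilt beforehand:

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

// Stand-in for Martin's source catalog (hypothetical).
type Sources = HashMap<String, String>;

// What a handler would do with the shared state: take a read lock,
// which any number of concurrent requests can hold at once.
fn get_source(sources: &Arc<RwLock<Sources>>, id: &str) -> Option<String> {
    sources.read().unwrap().get(id).cloned()
}

// What POST /reload would do: rebuild the catalog *before* taking the
// lock, then hold the exclusive write lock only for the cheap swap.
// (The RwLock caveat mentioned above: a waiting writer can block new
// readers, so keeping the critical section tiny matters.)
fn reload(sources: &Arc<RwLock<Sources>>, fresh: Sources) {
    *sources.write().unwrap() = fresh;
}
```

This gets the read/reload split working; if writer-blocks-readers behavior turns out to hurt tail latency, the same call sites can later move to an Arc-snapshot scheme without changing the handlers' logic.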

nyurik commented 9 months ago

@pascal-codetaal sounds good. One other thing - I would recommend creating a Martin fork, create a branch there, and create a draft pull request into martin. Make sure to commit and push your code often (even if it does not compile). This way everyone can see the progress, and make suggestions as you go, without waiting for the full completion, and to correct mistakes early on. This way you will also get the benefit of constant CI checks.

mesudBe-Orbit commented 9 months ago

@nyurik we have created a draft PR into martin; you can find it here. We committed and pushed our latest code (the latest commit was about a mistakenly added repetition).

nyurik commented 9 months ago

@mesudBe-Orbit yes, thanks, I saw it and already commented on that PR. Keep adding things to it (by pushing to the same branch), and once it is ready to be merged, click the "Ready for review" button. Make sure to either merge the main branch into it, or rebase on main and force-push.

nyurik commented 6 months ago

pmtiles support for this would be good as well. I outlined how it can be done in https://github.com/stadiamaps/pmtiles-rs/issues/40 - and once done, we can use it in Martin too.