saosebastiao opened this issue 2 years ago
@saosebastiao I must say that my first reaction was - why not just use a reverse proxy, e.g. my personal favorite Varnish, with its grace mode:
When several clients are requesting the same page Varnish will send one request to the backend and place the others on hold while fetching one copy from the backend. In some products this is called request coalescing and Varnish does this automatically.
That said, if the cost of implementing and maintaining it is relatively low, and the performance impact is negligible, I do think we should have this feature available as well. There are several aspects to consider:
P.S. I think you have a race condition in your code - the key could be removed by another thread between if(!cache.contains_key(&key)) and cache.get(&key).unwrap().
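For illustration, a minimal sketch of that window and one way to close it by doing the lookup in a single step while holding the lock (the map and key names here are hypothetical, not the issue's actual code):

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Racy pattern: `contains_key` and `get(...).unwrap()` are two separate lookups,
// so another thread can remove the key in the window between them and the
// unwrap panics.
fn racy_lookup(cache: &Mutex<HashMap<String, String>>, key: &str) -> Option<String> {
    if !cache.lock().unwrap().contains_key(key) {
        return None;
    }
    // Another thread may have removed `key` by now.
    Some(cache.lock().unwrap().get(key).unwrap().clone())
}

// Doing the lookup once, under a single lock acquisition, removes the window entirely.
fn safe_lookup(cache: &Mutex<HashMap<String, String>>, key: &str) -> Option<String> {
    cache.lock().unwrap().get(key).cloned()
}

fn main() {
    let cache = Mutex::new(HashMap::from([("a".to_string(), "1".to_string())]));
    assert_eq!(racy_lookup(&cache, "a"), safe_lookup(&cache, "a"));
}
```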
I agree that my first response would probably be to use a cache. It seems like different caches use different names for this idea; for example, nginx calls it a cache lock, which is what I currently use for some of our sources.
However, there are some use cases where setting up a cache is not what I would want. For example:
I agree that it should be an opt-in feature. It should also be turned off for all function sources that might be marked volatile, which means we should get that information into our function sources at startup.
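As a sketch of where that information could come from (not Martin's actual startup code), the volatility flag lives in the Postgres catalog, so it can be read once at startup and stored alongside each function source:

```rust
// pg_proc.provolatile is 'i' (immutable), 's' (stable), or 'v' (volatile).
const FUNCTION_VOLATILITY_SQL: &str =
    "SELECT p.provolatile
     FROM pg_catalog.pg_proc p
     JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace
     WHERE n.nspname = $1 AND p.proname = $2";

/// Interpret the provolatile flag; anything unrecognized is treated as volatile
/// so that coalescing/caching stays off in the uncertain case.
fn is_volatile(provolatile: char) -> bool {
    !matches!(provolatile, 'i' | 's')
}

fn main() {
    // The query itself would go through whatever Postgres client is in use;
    // here we only show the catalog lookup and how its result would be interpreted.
    println!("{}", FUNCTION_VOLATILITY_SQL);
    assert!(is_volatile('v'));
    assert!(!is_volatile('s'));
}
```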
Since there are significant pending structural changes to the code, I think I'll hold off for a bit. Specifically, since I'm gonna have to implement Hash + Eq for the components that make up a unique request (Source, Xyz, QueryParams), I think I'll wait until the Source implementation is more stable.
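For illustration, a hypothetical key struct with those traits derived (the field names and types are placeholders standing in for Source, Xyz, and QueryParams, not Martin's actual types):

```rust
use std::collections::HashMap;

// Deriving Hash + Eq (plus PartialEq) is what lets the struct act as a
// HashMap key for the in-flight request cache.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct TileRequestKey {
    source_id: String,     // stands in for Source
    z: u8,
    x: u32,
    y: u32,                // z/x/y stand in for Xyz
    query: Option<String>, // stands in for QueryParams, already canonicalized
}

fn main() {
    let mut in_flight: HashMap<TileRequestKey, &'static str> = HashMap::new();
    let key = TileRequestKey {
        source_id: "points".into(),
        z: 0,
        x: 0,
        y: 0,
        query: None,
    };
    in_flight.insert(key.clone(), "pending");
    assert!(in_flight.contains_key(&key));
}
```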
And yes, you were right, there is a race condition. I'll make sure to remedy that.
@saosebastiao if you're still interested, there is now rudimentary LRU cache support in Martin, but a lot more can be done with it.
Hi, I just wanted to open up a discussion about an idea that I have. The problem I've noticed is that when many concurrent users arrive at the same time, a lot of duplicate requests can get issued to the database. For small requests that's usually not a problem, as they have very low latency. However, with large layers at low zoom levels, this can lead to a lot of different postgres processes doing full table scans.
Our current solution (and presumably a common solution) is HTTP caching. But caching opens up a huge amount of potential complexity around cache invalidation for layers that change frequently. I think my proposal could reduce this database burden without actually introducing a tile caching layer. It would do this with a small in-flight db request cache, keyed by a struct containing the Source, xyz, and Query values. Each cache entry would stay active only as long as its request is in flight. Any request that arrives while an identical request is already in flight would receive an async channel future instead of issuing a new query. As soon as the database responds, it would broadcast the result to the channels awaiting it, then clean up its entry in the cache. A minimal hello-world implementation of this idea is below:
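What follows is a sketch of that kind of hello-world, assuming actix-web 4 and a tokio broadcast channel per in-flight key; the /hello/{secs} route just sleeps to stand in for a slow query, and the names are illustrative rather than Martin's actual code:

```rust
// Cargo.toml (assumed): actix-web = "4", tokio = { version = "1", features = ["sync", "time"] }
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::Duration;

use actix_web::{get, web, App, HttpServer, Responder};
use tokio::sync::broadcast;

// In-flight request cache: maps a request key to a broadcast sender whose
// subscribers all receive the result of the single backend call.
type InFlight = Mutex<HashMap<u64, broadcast::Sender<String>>>;

#[get("/hello/{secs}")]
async fn hello(path: web::Path<u64>, inflight: web::Data<InFlight>) -> impl Responder {
    let secs = path.into_inner();

    // Either subscribe to an identical request that is already in flight,
    // or register a new entry and become the request that does the real work.
    let (tx, mut rx, leader) = {
        let mut map = inflight.lock().unwrap();
        match map.get(&secs) {
            Some(tx) => (tx.clone(), tx.subscribe(), false),
            None => {
                let (tx, rx) = broadcast::channel(1);
                map.insert(secs, tx.clone());
                (tx, rx, true)
            }
        }
    };

    if leader {
        // Stand-in for a slow database query.
        tokio::time::sleep(Duration::from_secs(secs)).await;
        let result = format!("hello after {secs} seconds\n");
        // Remove the entry before broadcasting so later arrivals start a new query.
        inflight.lock().unwrap().remove(&secs);
        let _ = tx.send(result.clone());
        result
    } else {
        rx.recv()
            .await
            .unwrap_or_else(|_| "in-flight request failed\n".to_string())
    }
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    let inflight = web::Data::new(InFlight::default());
    HttpServer::new(move || App::new().app_data(inflight.clone()).service(hello))
        .bind(("127.0.0.1", 8080))?
        .run()
        .await
}
```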
You can run the above code, and then in 2 (or more) different terminals call curl http://localhost:8080/hello/10. The first request you send will take a full 10 seconds to fulfill, but if you send another within that 10 second window, it will be fulfilled at the same exact time as the first request. Once the in-flight requests are fulfilled, the next request would take the full 10 seconds again.

This has the drawback of a little additional complexity for requests, but it could dramatically reduce database load for uncached large tiles. If this is something the maintainers would want, I would be willing to work on a pull request to implement it. But if it is not, I'd rather not spend the time, so I'd much prefer a little discussion about it first.