cloudflare / pingora

A library for building fast, reliable and evolvable network services.
Apache License 2.0

[Doc] using tracers to track connections #295

Open JosiahParry opened 1 week ago

JosiahParry commented 1 week ago

What is the problem your feature solves, or the need it fulfills?

I would like to use the CTX to track information about my proxied service. An important piece of this is knowing how many active connections my service has, so that I can spin instances up or down to match demand.

The way I was thinking of doing this is storing a counter alongside my service. When a new CTX is made, I increment it. When a CTX is dropped, it is decremented.

Using a const won't work because I don't know how many services I'll be supporting at compile time.

Describe the solution you'd like

Add a method to ProxyHttp that is called on disconnect and has access to both CTX and self.

Describe alternatives you've considered

Running netstat in a loop to check connections to each port. That is messy, though, and I'd like to have a built-in way to track connections.

Additional context

This could include references to documentation or papers, prior art, screenshots, or benchmark results.

JosiahParry commented 1 week ago

It seems that exposing http_cleanup() might help?

eaufavor commented 1 week ago

Do you want to track connections (both in the connection pool and in use) or requests?

To track active connections, see https://github.com/cloudflare/pingora/issues/245#issuecomment-2129977269. To track active requests, as you said, just track the lifetime of a CTX.

To track a non-constant number of services, you can use something like the following, where Key is however you want to name/shard each service.

struct Counters(Arc<DashMap<Key, AtomicUsize>>);
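
As a rough illustration of the per-key counter idea, here is a minimal sketch that uses only the standard library. It swaps DashMap for an RwLock<HashMap<..>> so it compiles without external crates; that swap, the choice of Key = String, and the inc/get method names are assumptions made for this example and are not part of Pingora.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, RwLock};

// Hypothetical key type: here, a service name.
type Key = String;

// Std-only stand-in for Arc<DashMap<Key, AtomicUsize>>.
#[derive(Clone, Default)]
struct Counters(Arc<RwLock<HashMap<Key, AtomicUsize>>>);

impl Counters {
    /// Add one to the counter for `key`, creating it on first use.
    fn inc(&self, key: &str) {
        {
            // Fast path: the key exists, so a shared read lock is enough.
            let map = self.0.read().unwrap();
            if let Some(counter) = map.get(key) {
                counter.fetch_add(1, Ordering::Relaxed);
                return;
            }
        } // read lock released here

        // Slow path: take the write lock and insert the key if still absent.
        let mut map = self.0.write().unwrap();
        map.entry(key.to_string())
            .or_insert_with(|| AtomicUsize::new(0))
            .fetch_add(1, Ordering::Relaxed);
    }

    /// Current value for `key`, or 0 if the key was never seen.
    fn get(&self, key: &str) -> usize {
        let map = self.0.read().unwrap();
        map.get(key).map_or(0, |c| c.load(Ordering::Relaxed))
    }
}
```

Because the services are keyed at runtime, new ones can be added without knowing their number at compile time; cloning Counters only clones the Arc, so every clone sees the same map.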
JosiahParry commented 1 week ago

Thanks @eaufavor! I kind of follow where you're going with this—though not entirely.

Maybe I can be more specific. I have a struct which represents a single application. Note that I use a DashMap instead of a LoadBalancer because a LoadBalancer is not mutable (see https://github.com/cloudflare/pingora/issues/291). The Child is a process serving on a port 0.0.0.0:{port}.

pub struct MyService {
    pub instances: DashMap<Backend, Child>,
    // This is not mutable so we cannot use it
    // pub load_balancer: LoadBalancer<RoundRobin>,
}

I have a wrapper struct, struct TestProxy(pub Arc<MyService>);, that I implement ProxyHttp for. Then, in the upstream_peer() method, I select a random backend from the instances field to mimic the random assignment of a load balancer.

    async fn upstream_peer(&self, _session: &mut Session, _ctx: &mut ()) -> Result<Box<HttpPeer>> {
        let upstream = self.0.get_random_backend();  // custom method
        let peer = Box::new(HttpPeer::new(
            upstream,
            false,
            "one.one.one.one".to_string(),
        ));
        Ok(peer)
    }

For this, I need to know how many active connections there are to a specific backend, as well as the number of active requests. For example, if there are 10 connections, I should spin up another instance and add it to the DashMap. Or, if there is 1 active connection but the last request was 30 minutes ago, I should kill the session.
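
The scaling rule described above could be sketched roughly as follows. The thresholds (10 connections, 30 minutes) come from the paragraph above; BackendStats, ScaleAction, and decide are hypothetical names invented for this sketch, and how the stats are collected is left open.

```rust
use std::time::Duration;

/// Hypothetical snapshot of one backend's load.
struct BackendStats {
    active_connections: usize,
    since_last_request: Duration,
}

/// What the supervisor should do with this backend.
#[derive(Debug, PartialEq)]
enum ScaleAction {
    SpinUp,
    Kill,
    Keep,
}

const MAX_CONNECTIONS: usize = 10;
const IDLE_TIMEOUT: Duration = Duration::from_secs(30 * 60);

/// Apply the thresholds: scale up when busy, kill when idle, else keep.
fn decide(stats: &BackendStats) -> ScaleAction {
    if stats.active_connections >= MAX_CONNECTIONS {
        ScaleAction::SpinUp
    } else if stats.active_connections <= 1 && stats.since_last_request >= IDLE_TIMEOUT {
        ScaleAction::Kill
    } else {
        ScaleAction::Keep
    }
}
```

A background task could call decide() periodically for each backend, which is why both the connection count and the time of the last request need to be readable from outside the request path.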

TL;DR I need to track:

The CTX approach doesn't work because there is no event I can hook into when the CTX is dropped.

Tracing approach

I'm not sure I understand the Tracing part enough. Would I have a single Tracer for each Backend that I would fetch and insert into the HttpPeer? I'm not sure how that would be accessible for a background check / synchronized across all backends.

Sorry if I'm being a bit daft!

eaufavor commented 1 week ago

To track the connections, I would do the following:

use std::sync::Arc;
use dashmap::DashMap;

// Placeholder: Key is however you name/shard each service.
type Key = String;

struct Counters(Arc<DashMap<Key, usize>>);

struct Tracer(Arc<DashMap<Key, usize>>, Key);

impl Counters {
    fn inc(&self, key: Key) -> Tracer {
        // Increment the counter for the key, starting from 0 if it is absent.
        *self.0.entry(key.clone()).or_insert(0) += 1;
        Tracer(self.0.clone(), key)
    }
}

impl Drop for Tracer {
    fn drop(&mut self) {
        // Decrement the counter when the Tracer goes away.
        if let Some(mut count) = self.0.get_mut(&self.1) {
            *count = count.saturating_sub(1);
        }
    }
}

With the code above, if you call inc() on every request and put the returned Tracer in the CTX, the Tracer will automatically decrease the counter when the request is finished (when the CTX is dropped).

Similarly, if you attach the Tracer to the HttpPeer, it stays with the connection until the connection is closed, at which point the Tracer is dropped.

So in both cases we take advantage of the Drop event to track the number of connections.
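
To make the Drop-based pattern concrete, here is a self-contained sketch that runs with only the standard library. It uses a Mutex<HashMap<..>> in place of DashMap, and MyCtx is a hypothetical stand-in for a ProxyHttp CTX type; both are assumptions for illustration, not Pingora's API.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Hypothetical key type: here, a service name.
type Key = String;

// Std-only stand-in for Arc<DashMap<Key, usize>>.
#[derive(Clone, Default)]
struct Counters(Arc<Mutex<HashMap<Key, usize>>>);

/// Guard that holds one "slot" in the counter for `key` while it is alive.
struct Tracer {
    counters: Counters,
    key: Key,
}

impl Counters {
    /// Increment the counter for `key` and return a guard that undoes it on drop.
    fn inc(&self, key: &str) -> Tracer {
        *self.0.lock().unwrap().entry(key.to_string()).or_insert(0) += 1;
        Tracer { counters: self.clone(), key: key.to_string() }
    }

    /// Current count for `key`, or 0 if the key was never seen.
    fn get(&self, key: &str) -> usize {
        self.0.lock().unwrap().get(key).copied().unwrap_or(0)
    }
}

impl Drop for Tracer {
    fn drop(&mut self) {
        // Skip the update if the lock is poisoned rather than panic in drop.
        if let Ok(mut map) = self.counters.0.lock() {
            if let Some(n) = map.get_mut(&self.key) {
                *n = n.saturating_sub(1);
            }
        }
    }
}

/// Hypothetical per-request context; the field is never read, only dropped.
struct MyCtx {
    _tracer: Tracer,
}
```

Creating a MyCtx with counters.inc("app-a") raises the count for "app-a" by one, and the count falls back as soon as that MyCtx goes out of scope. No explicit "on disconnect" callback is needed, since the compiler inserts the drop.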

github-actions[bot] commented 2 days ago

This question has been stale for a week. It will be closed in an additional day if not updated.

JosiahParry commented 1 day ago

I have modified the title to be more accurate.