cloudflare / pingora

A library for building fast, reliable and evolvable network services.
Apache License 2.0
Handle asset fetching from upstream #319

Closed JosiahParry closed 1 month ago

JosiahParry commented 1 month ago

I've created a DashMap<> that is used to fetch the appropriate upstream in the upstream_peer() method of the proxy. The upstream is an interactive SPA served locally (using shiny if you're curious). Ideally these are sticky sessions.

I've noticed that when connecting to the session, the app makes a number of requests to fetch assets, e.g. JavaScript and CSS. Each one of these triggers a new CTX, which is a bit surprising to me.

In the request_filter() stage, I fetch the app based on the slug. There are a few challenges with this. First, the Referer header only tells me the referring path from the proxy's perspective, so I might not choose the same backend that the client is presently connected to. Second, the referer sometimes seems to be just a file, which is quite confusing.

Is there a way to make the CTX sticky? Or a way to determine where to send a request when the referer is the proxy? Please see the referer and URI from the log below.

_request_filter() method_

```rs
async fn request_filter(&self, session: &mut Session, ctx: &mut Self::CTX) -> Result<bool> {
    // Check if the referer header is set
    let has_ref = session.get_header("referer");

    // If there is a referer header, we will use that to determine the slug to find the appropriate app
    // If there isn't extract the slug from the current Uri
    match has_ref {
        Some(referer_header) => {
            let referer_uri = referer_header
                .to_str()
                .expect("referer header is not a string")
                .parse::<Uri>()
                .expect("referer header is not a valid uri");

            // Sometimes this is actually just bootstrap or some other javascript library
            // causes problems when the referer isn't the upstream
            ctx.slug = Some(get_uri_slug(&referer_uri));
        }
        None => {
            // Extract current Uri
            let cur_uri = &session.req_header().uri;
            let slug = cur_uri.path().split("/").nth(1).unwrap().to_string();
            let app = self.0.get(&slug);
            ctx.slug = Some(slug);

            session.req_header_mut().set_uri(
                Uri::builder()
                    .path_and_query(PathAndQuery::from_static("/"))
                    .build()
                    .unwrap(),
            );

            if app.is_none() {
                let _ = session.respond_error(404).await;
                return Ok(true);
            }
        }
    }
    Ok(false)
}
```

Pingora info

Pingora version: 0.2.0

Rust version: cargo 1.81.0-nightly

Operating system version: macOS Sonoma (M1)

Expected results

I expect the referer to contain info about the service the request is coming from, or that requests from the upstream peer do not create a new CTX, so that I can get the correct info from the initial CTX.

Additional context

URI and referer from log

The error entry near the end of the log occurs because I pass session.client_addr().unwrap() to the HttpPeer, which appears to be wrong.

```
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /jquery-3.6.0/jquery.min.js Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /bootstrap-5.3.1/bootstrap.min.css Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /shiny-sass-1.8.1.1/shiny-sass.css Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /leaflet-1.3.1/leaflet.css Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /shiny-javascript-1.8.1.1/shiny.min.js Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /htmltools-fill-0.5.8.1/fill.css Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /leafletfix-1.0.0/leafletfix.css Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /bslib-component-css-0.7.0/components.css Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /rstudio_leaflet-1.3.1/rstudio_leaflet.css Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /bs3compat-0.7.0/transition.js Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /bootstrap-5.3.1/bootstrap.bundle.min.js Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /htmlwidgets-1.6.4/htmlwidgets.js Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /bs3compat-0.7.0/tabs.js Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /bs3compat-0.7.0/bs3compat.js Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /leaflet-1.3.1/leaflet.js Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /Proj4Leaflet-1.0.1/proj4leaflet.js Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /proj4-2.6.2/proj4.min.js Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /leaflet-binding-2.2.2/leaflet.js Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /bslib-tag-require-0.7.0/tag-require.js Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /bslib-component-js-0.7.0/components.min.js Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /bslib-component-js-0.7.0/web-components.min.js Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /bootstrap-5.3.1/font.css Referer:Some("http://localhost:6188/bootstrap-5.3.1/bootstrap.min.css")
[2024-07-06T12:19:15Z ERROR pingora_proxy] Fail to proxy: Upstream ConnectRefused context: Fail to connect to addr: 127.0.0.1:60478, scheme: HTTP, cause: context: Fail to connect to 127.0.0.1:60478 cause: Connection refused (os error 61), status: 502, tries: 1, retry: false, GET /bootstrap-5.3.1/font.css, Host: localhost:6188
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : / Referer:None
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /leaflet-providers-2.0.0/leaflet-providers_2.0.0.js Referer:Some("http://localhost:6188/reverse")
[2024-07-06T12:19:15Z INFO ricochet::shiny::proxy] Uri : /leaflet-providers-plugin-2.2.2/leaflet-providers-plugin.js Referer:Some("http://localhost:6188/reverse")
```

eaufavor commented 1 month ago

In short, CTX is a way to share the state of a single request across its own phases. Each request has its own CTX, and the CTX of one request is independent of the CTX of any other request.
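
To make that distinction concrete, here is a minimal plain-Rust model, not Pingora's actual API. `RequestCtx`, `Proxy`, `handle`, and the `backend-for-...` strings are all made up for illustration; the only idea borrowed from Pingora is that a fresh context is built for every request (as `new_ctx()` does), so anything that must survive between requests has to live in the long-lived proxy value instead.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Per-request state: built fresh for every request and dropped afterwards.
#[derive(Default)]
struct RequestCtx {
    slug: Option<String>,
}

/// Long-lived state shared by all requests (hypothetical layout).
struct Proxy {
    // client id -> backend name; both are illustrative strings.
    sticky: Mutex<HashMap<String, String>>,
}

impl Proxy {
    /// Stand-in for a per-request constructor such as `new_ctx()`.
    fn new_ctx(&self) -> RequestCtx {
        RequestCtx::default()
    }

    /// Returns the backend chosen for this request.
    fn handle(&self, client_id: &str, path: &str) -> String {
        // The context starts empty on every call: nothing carries over.
        let mut ctx = self.new_ctx();
        ctx.slug = path
            .split('/')
            .nth(1)
            .filter(|s| !s.is_empty())
            .map(str::to_string);

        // Only the shared map remembers the choice made on the first request.
        let mut sticky = self.sticky.lock().unwrap();
        sticky
            .entry(client_id.to_string())
            .or_insert_with(|| {
                format!("backend-for-{}", ctx.slug.as_deref().unwrap_or("default"))
            })
            .clone()
    }
}

fn main() {
    let proxy = Proxy { sticky: Mutex::new(HashMap::new()) };
    // The page request picks a backend from the slug.
    println!("{}", proxy.handle("client-a", "/reverse"));
    // The asset request has a different path but reuses the remembered backend.
    println!("{}", proxy.handle("client-a", "/jquery-3.6.0/jquery.min.js"));
}
```

The open question in a real proxy is what plays the role of `client_id`, which is where a cookie comes in.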

On the other hand, the problem you described is a typical session stickiness problem in the world of proxies. The most robust way to solve it is to send a Set-Cookie header on the first response and then read the Cookie header on later requests to tell which requests come from the same client.
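
A rough sketch of the cookie handling itself, using only the standard library. The cookie name `app_backend`, the backend id `reverse-1`, and both helper functions are assumptions for illustration; wiring them into the request and response filters is left out, and a dedicated cookie crate would handle quoting and edge cases more carefully than this does.

```rust
/// Hypothetical cookie name; not defined by Pingora or by this thread.
const STICKY_COOKIE: &str = "app_backend";

/// Look up `name` in a raw `Cookie` request header value,
/// e.g. "theme=dark; app_backend=reverse-1".
fn cookie_value<'a>(cookie_header: &'a str, name: &str) -> Option<&'a str> {
    cookie_header
        .split(';')
        .filter_map(|pair| pair.trim().split_once('='))
        .find(|(key, _)| *key == name)
        .map(|(_, value)| value)
}

/// Build the `Set-Cookie` response header value that pins a client to `backend_id`.
fn sticky_set_cookie(backend_id: &str) -> String {
    format!(
        "{}={}; Path=/; HttpOnly; SameSite=Lax",
        STICKY_COOKIE, backend_id
    )
}

fn main() {
    // First request: no cookie yet, so the proxy picks a backend and pins it.
    let first_request_cookie: Option<&str> = None;
    let chosen = first_request_cookie
        .and_then(|header| cookie_value(header, STICKY_COOKIE))
        .unwrap_or("reverse-1");
    println!("Set-Cookie: {}", sticky_set_cookie(chosen));

    // Later asset request: the browser sends the cookie back whatever the Referer is.
    let later_request_cookie = "theme=dark; app_backend=reverse-1";
    println!("route to: {:?}", cookie_value(later_request_cookie, STICKY_COOKIE));
}
```

Because the browser attaches the cookie to every request for the matching path, asset requests such as `/bootstrap-5.3.1/font.css` can be routed without consulting the Referer at all.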

JosiahParry commented 1 month ago

Oh this is nifty! Thank you @eaufavor. Is set-cookie a crate?

eaufavor commented 1 month ago

I mean https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Set-Cookie. There are probably some crates to help you create/parse the set-cookie/cookie headers.

github-actions[bot] commented 1 month ago

This question has been stale for a week. It will be closed in an additional day if not updated.

github-actions[bot] commented 1 month ago

This issue has been closed because it has been stalled with no activity.