splitgraph / seafowl

Analytical database for data-driven Web applications 🪶
https://seafowl.io
Apache License 2.0
390 stars 9 forks source link

Return timing data in response header(s) #438

Closed onpaws closed 1 year ago

onpaws commented 1 year ago

Seeks to implement #357

[1] https://www.w3.org/TR/server-timing/ [2] https://arrow.apache.org/datafusion/user-guide/sql/explain.html

gruuya commented 1 year ago

Fwiw, I don't think this should be configurable (not for now at least); there are more pertinent knobs that still aren't customizable (e.g. see #50), and I'm wary of bloating up the config with less important items.

I think just extending the existing tests should be sufficient to merge this.

onpaws commented 1 year ago

Fwiw, I don't think this should be configurable (not for now at least); there are more pertinent knobs that still aren't customizable (e.g. see #50), and I'm wary of bloating up the config with less important items.

I think just extending the existing tests should be sufficient to merge this.

Thanks Marko, yes probably too soon to add new config options. In the future depending how we want to allow user customization e.g. once we return >1 timing measurement, it will presumably be time to consider.

I will switch to X-Seafowl-Query-Time, unless anyone objects, thanks! In terms of answering "when should Seafowl return this header", it seems to me we have two options:

  1. Always
  2. Return only in certain conditions, e.g. if the user-agent passes e.g. Seafowl-Trace: true in the request

IIUC I think @mildbyte prefers option 1, and @milesrichardson prefers option 2. (at least, I had that impression based on MM chats.)

What do you think @gruuya?

milesrichardson commented 1 year ago

I would prefer option 1 (always) too. But just note that it won't always be available to client code in a cross-origin browsing context unless the header is included in the CSV value of Access-Control-Expose-Headers.

But I'm hesitant to always return Access-Control-Expose-Headers. Perhaps that could be opt-in via a request flag. Or perhaps it's not a problem to always include it anyway, since any problem it would cause for someone running a proxy in front of Seafowl, would only be relevant when they wanted the header anyway - at which point they could add it to their own code where they specify exposed headers (that is, the relatively harmless failure case would be their own headers overriding a Seafowl header that's meant to be opt-in anyway).

I had originally said that the timing data shouldn't go into the Content-Type header alongside the schema. And it certainly feels like it would be overloading the semantic purpose of that header. But, after thinking more on it, I'm not convinced it's a problem to include timing data in addition to the schema. At first I thought it would be wrong because you don't want to cache it, but actually it makes sense to cache it, because it's timing how long it took for the origin Seafowl instance to execute the original query and return the result that was ultimately saved to the cache. So it's totally sensible for the timing data to be included in the cached response that's served by Cloudflare, or even from your browser, since neither of those are re-executing the query, and the time Seafowl originally spent executing the query is still a relevant metric.

So another question would be: what response headers does Cloudflare save in its cached objects? We know that it saves Content-Type, which is maybe a point in favor of including timing data there too. But I don't know what it does with other response headers (I assume it defaults to saving them?)

@gruuya @mildbyte What do you think about including the timing data in the Content-Type header along with the schema?

onpaws commented 1 year ago

@gruuya

I think just extending the existing tests should be sufficient to merge this.

Is this what you had in mind? Wasn't sure if you meant literally extending some already existing test, or adding to the existing test suite.

If there's some recommended way to hit both cached_.../uncached_ paths from one single test, that seems nice to avoid what would likely otherwise be code duplication. I noticed certain macros e.g. #[case::uncached_post("POST", "/q")] that suggest maybe we could use uncached_get and cached_get to test both paths from a single test? Not sure if I'm understanding that correctly though. image

gruuya commented 1 year ago

One thing I just learned is that apparently even default CORS safelisted headers have some prerequisites:

For Content-Type: needs to have a MIME type of its parsed value (ignoring parameters) of either application/x-www-form-urlencoded, multipart/form-data, or text/plain.

In our case we have application/json so it seems we won't gain any benefit from shoehorning the timing data into content-type anyway.

@milesrichardson given this, and the fact that @onpaws discovered some handy DataFusion-provided internal metrics (which we may want to supplement along with the request-to-response time as is done currently), I'd vote for going with a separate dedicated header (assuming that it is cached by CF).

gruuya commented 1 year ago

@onpaws you could also simply extend the last test in that module; it hits both GET and POSTS endpoints. In addition to presence, I'd test that the value of the header is as expected (i.e. something that can be parsed as a float, between 0 and 1).

mildbyte commented 1 year ago

Cloudflare seems to cache a lot of response headers, if not all of them. This screenshot is from one of our pages with the cache disabled in Firefox and with a Cloudflare cache hit, showing X-Cache and X-Varnish headers (that are added by our own Varnish) being returned by Cloudflare in its cached response:

image

milesrichardson commented 1 year ago

@gruuya

One thing I just learned is that apparently even default CORS safelisted headers have some prerequisites:

Read that a bit more carefully: those restrictions are for the request content-type header and do not apply to the response. (I made the same mistake the first two times I read it in the original issue 😸)

gruuya commented 1 year ago

Oh damn, good catch :D

onpaws commented 1 year ago

@onpaws you could also simply extend the last test in that module; it hits both GET and POSTS endpoints. In addition to presence, I'd test that the value of the header is as expected (i.e. something that can be parsed as a float, between 0 and 1).

Ended up doing this approach, cheers. Now we append it to the existing test hopefully in the way you desired. While I was at it I renamed it to test_http_type_conversion_and_timing_header which is perhaps a bit long but hopefully makes things clear for future us/contributors.

Agreed checking the header is "as expected" is worthwhile and added a second test for that. I wasn't initially sure the lines of responsibility/best ways to handle weird edge cases when parsing floats from strings. Along the way learned Rust core now uses Eisel-Lemire. Infinity and NaN scenarios were first a concern but IIUC it seems like .is_finite() handles both, which I believe should cover us. If I missed any cases please lmk.

Verified the tests work by temporarily commenting out the header injection code, and re-running, and got the expected result.

before/after the "before" test run, expected to fail b/c I commented out the header code ``` running 2 tests test frontend::http::tests::test_http_type_conversion_and_timing_header::case_2_uncached_post ... FAILED test frontend::http::tests::test_http_type_conversion_and_timing_header::case_1_cached_get ... FAILED failures: ---- frontend::http::tests::test_http_type_conversion_and_timing_header::case_2_uncached_post stdout ---- thread 'frontend::http::tests::test_http_type_conversion_and_timing_header::case_2_uncached_post' panicked at 'assertion failed: resp.headers().contains_key(QUERY_TIME_HEADER)', src/frontend/http.rs:1596:9 ---- frontend::http::tests::test_http_type_conversion_and_timing_header::case_1_cached_get stdout ---- thread 'frontend::http::tests::test_http_type_conversion_and_timing_header::case_1_cached_get' panicked at 'assertion failed: resp.headers().contains_key(QUERY_TIME_HEADER)', src/frontend/http.rs:1596:9 note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace failures: frontend::http::tests::test_http_type_conversion_and_timing_header::case_1_cached_get frontend::http::tests::test_http_type_conversion_and_timing_header::case_2_uncached_post test result: FAILED. 0 passed; 2 failed; 0 ignored; 0 measured; 188 filtered out; finished in 0.04s ``` the "after" run ``` running 2 tests test frontend::http::tests::test_http_type_conversion_and_timing_header::case_2_uncached_post ... ok test frontend::http::tests::test_http_type_conversion_and_timing_header::case_1_cached_get ... ok test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 188 filtered out; finished in 0.04s Running unittests src/main.rs (target/debug/deps/seafowl-49a42c42648f96f5) running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s Running tests/main.rs (target/debug/deps/main-084a7a60e84333ec) running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 56 filtered out; finished in 0.00s ```
onpaws commented 1 year ago

CI passed and it seems we've probably reviewed this PR sufficiently. Any objections to merging?

milesrichardson commented 1 year ago

I haven't reviewed the code (and don't intend to), but once you merge it, can you update the docs in the marketing repo to reference the header?

onpaws commented 1 year ago

I haven't reviewed the code (and don't intend to), but once you merge it, can you update the docs in the marketing repo to reference the header?

Sure! How do you feel about e.g. this image

with some basic X-Seafowl-Query-Time example similar to the other ones on this page?

onpaws commented 1 year ago

Thanks Marko! Merging now