Allow instrumentation of client-side `fetch` via a `handleFetch` hook

Lms24 commented 1 year ago

Describe the problem

TL;DR: We (Sentry) want to instrument client-side fetch requests and we currently cannot instrument some of them. We propose a handleFetch hook on the client side as a solution.

Hi, I'm an SDK engineer at Sentry and we're currently working on a dedicated SvelteKit SDK. Things are progressing nicely but we just discovered a potential problem:

Based on my findings, SvelteKit stores window.fetch in the native_fetch variable, monkey patches it and only uses this one for client-side fetch requests: https://github.com/sveltejs/kit/blob/5191e165b75f5616bd6b4aadf9d8a6c52e75c562/packages/kit/src/runtime/client/fetcher.js#L6

This is problematic for our use case because it means that our SDK, which we'll most probably instruct users to initialize in the client and server hooks files comes in too late to apply our fetch instrumentation to window.fetch.

We already wrap load functions to create spans for our performance monitoring product. So we have a way to instrument the fetch function that is passed to the client-side universal load function in the LoadEvent.

However, for a page that has a server-only load function, Kit makes a route/path/__data.json fetch request to invoke this server-only load function. Up to this point, we haven't found a way to instrument this fetch call. We would really like to instrument the call though, so that we can attach our Http headers for distributed tracing (see Additinal Information below). Without this, we cannot connect the client side with the server side and our users don't get the full picture what happened during a client-side navigation.

Please find our proposed solution(s) below. We're more than happy to open a PR or to work with you in any way to help finding a good solution here. Also, if it happens that we missed something, please let us know :)

Describe the proposed solution

To solve this, we propose to add support for a handleFetch hook in the hooks.client.js file. This hook would be invoked every time a fetch call is made on the client side, be it by a user calling fetch inside a load function as well as by the framework, when it makes a fetch call to invoke a server-only load function.

We believe this would be the cleanest approach to open up Kit's fetch to SDKs like ours. It would also make it easier for Kit users to e.g. attach custom Http headers or other data to outgoing fetch requests.

Since I've already spent some time reading Kit source code and I already played around with hooks, I'm more than happy to open a PR for this change :)

Alternatives considered

We have considered a few alternatives that we believe would also work but are potentially less clean or might require more SvelteKit-internal changes:

Make native_fetch (or better initial_fetch and subsequent_fetch) patchable for SDKs in another way than via a client-side handleFetch hook. We're definitely open to other suggestions but right now, everything we hook into in SvelteKit would reside in the hooks files which means great DX for our users as hooks seem to be the most SvelteKit-native way.
SvelteKit could monkey-patch window.fetch directly without storing it away in variables. This way, our instrumentation can wrap the correct fetch without any additional code on our end. Given how Kit currently handles fetch, I believe this is unlikely to happen but it would nevertheless be an option.
If there's no way how we can instrument all fetch calls during runtime, we'll need to inject some code at build-time, probably via a Vite plugin to be able to access native_fetch to instrument it. We've done this for other framework SDKs before but would like to avoid it if at all possible. It's a rather brittle solution and has caused a lot of bugs in said framework SDKs in the past. That being said, we'll go down this route if there's no other way.

Importance

would make my life easier; very important for SvelteKit Sentry users

Additional Information

A little more background on why we need to instrument fetch:

Sentry SDKs add distributed tracing headers (our proprietary sentry-trace as well as the W3C standardized baggage headers) to outgoing fetch (and XHR) requests. With the help of these headers we can connect transactions and spans in our performance monitoring product, giving our users the full picture of what happened on the client and on the server (+ additional backend services).
Sentry SDKs collect "breadcrumbs" for fetch requests to give our users clues what happened e.g. before an error was reported to Sentry.

Thank you for considering this issue!

dummdidumm commented 1 year ago

To clarify, would handleFetch run for all fetch requests, or only those from SvelteKit's fetch. In other words, would handleFetch only be invoked for these..

// +page.js
export function load({ fetch }) {
  fetch(..)
}

...or also for these

// +page.js
export function load() {
  fetch(..); // this is the global fetch
}

<script>
  function fetchSomething() {
    fetch(..); // this is the global fetch
  }
</script>

?

Lms24 commented 1 year ago

For our use case it's enough that it runs for SvelteKit's fetch because we already instrument the global fetch. I guess this is probably more coherent with the server's handleFetch hook, too. However, I'd like to stress that it should also run for framework-internal fetch request, like the __data.json requests.

If you think that it should also run for the global fetch, that would be totally fine for us as well.

theetrain commented 1 year ago

In addition to my +1 to this feature suggestion, I'd like to share my use case forwarding headers to fetch provided by LoadEvent.

My application uses header auth, and I want to forward auth headers into every fetch request made within a load function.

I currently retrieve auth headers from locals in my handle hook and pass them into fetch manually:

export async function load ({ fetch, locals }) {
  const headers = new Headers({
    accept: 'application/json'
  })
  if (locals.user.name) {
    headers.set('x-saml-displayname', locals.user.name)
  }
  if (locals.user.email) {
    headers.set('x-saml-email', locals.user.email)
  }
  if (locals.user.username) {
    headers.set('x-saml-username', locals.user.username)
  }

  const res = await fetch('/api/path', {
    headers
  })
}

It would be nice to easily forward headers onto the server-side LoadEvent's fetch. As a workaround I can forward headers manually or create a wrapper for fetch.

Thanks for your consideration, please let me know if this needs its own issue.

Lms24 commented 1 year ago

We were just notified that our current fetch patching on the client side causes cache misses for fetch requests made during SSR that are stored and sent to the client via data-sveltekit-fetched tags. Basically, this happens because:

during SSR, fetch requests are made on the server and the result is stored in a data-sveltekit-fetched script tag
On the client, during hydration, SvelteKit's fetch function tries to obtain the data from the "cache" that's populated from the script tag(s) by matching the request.
- For the request in question, there's no cache hit because we do in fact modify the fetch request by adding our sentry-trace and baggage headers to it. Headers influence the cache key and hence there's a cache miss :(
- As a result, the data is actually fetched

I believe this would yet be another good reason to expose a clean way to intercept and add headers to fetch requests on the client side. I imagine we can hardly be the only ones who'd like to automatically add headers to all outgoing requests and this could mess up caching in more use cases than ours.

The alternative solution for us would be to preemptively create these script selectors in our patched fetch instrumentation and check if there would be a cache hit (basically, recreate a good part of fetchers.js in the SDK client code). In this case we'd only attach headers if there's no hit anticipated. However, this solution has a couple of drawbacks:

it's super brittle, for example as soon as the hashing seed or the selector generation is changed in Kit, it would no longer work
it substantially increases the client bundle size of the SDK, as we need to include the selector creation code.

I'm definitely open for other ideas how to get around the cache misses here but IMHO this needs to be solved in the framework in the long-run. Also, I'm more than happy to remove our current fetch patching in favour of a handleFetch hook :)

This reproduction example shows the current problem and how/when cache-misses occur

myieye commented 1 year ago

For what it's worth, this is how we're currently instrumenting SvelteKit's fetch using OTEL:

In index.html (i.e. before SK has touched fetch) we just put a proxy in place:

const _fetch = window.fetch;
window.fetchProxy = (...args) => _fetch(...args);
window.fetch = (...args) => window.fetchProxy(...args);

Some utils for cleanly instrumenting the proxy:

// Wraps fetch with the provided handler
export const handleFetch = (fetchHandler: (input: { fetch: typeof fetch, args: Parameters<typeof fetch> }) => Promise<Response>): void => {
  instrumentGlobalFetch(() => { // wrapping simply abstracts away our proxy
    const currProxy = window.fetch;
    window.fetch = (...args) => fetchHandler({ fetch: currProxy, args });
  });
};

// Runs instrumentation that operates on the global fetch
export const instrumentGlobalFetch = (instrument: () => void): void => {
  const currFetch = window.fetch;
  window.fetch = window.fetchProxy; // Put our proxy in place to be instrumented
  try {
    instrument();
    window.fetchProxy = window.fetch; // Put our (now) instrumented proxy back into place
  } finally {
    window.fetch = currFetch;
  }
}

Which allows us to make our own handleFetch:

handleFetch(async ({ fetch, args }) => {
  const response = await traceFetch(() => fetch(...args));
  if (response.status === 401) {
    throw redirect(307, '/logout');
  }
  return response;
});

And OTEL can do it's auto-instrumentation too:

instrumentGlobalFetch(() => {
  registerInstrumentations({
    instrumentations: [getWebAutoInstrumentations()],
  });
});

Lms24 commented 1 year ago

@myieye this is pretty neat! I just played around with your idea and I think it could work for Sentry. Thanks a lot for sharing 🙏

pzuraq commented 1 year ago

FWIW that solution requires you to add unsafe-inline to your CSP I believe, unless you import it as a script directly. We had to put our script up on our CDN so we could just add it via a script tag:

<script type="text/javascript" src="https://cdn.bitskistatic.com/js/telemetry-fetch.js"></script>

I really agree this feature is very important for generic instrumentation. Currently we have a lot of these same issues, and the hacks around it are really quite annoying.

@Lms24 I see your potential solution to this problem, we may switch to something like that, but ideally we would just add the handleFetch hook so we can customize this in the framework itself.

Lms24 commented 1 year ago

@pzuraq yup, CSP is a problem (see https://github.com/getsentry/sentry-javascript/issues/8925). I suggested a workaround to the solution for Sentry users but still waiting on feedback to implement.

The more I think about it, the more I come to the conclusion that the cleanest way forward to permit fetch instrumentation would just be my alternative suggestion no. 2 (i.e. make Kit not store away the fetch version it uses in its loaders). I took a stab at this in https://github.com/sveltejs/kit/pull/10009 but seems like there hasn't been much movement lately.

To be clear, this doesn't provide the same functionality as a dedicated handleFetch hook but it would be the best solution to make instrumentations like Sentry or OpenTelemetry work without any kind of proxy/script hack. Cause even if we provided a handleFetch hook, any instrumentation library would need to expose a handler that can actually be used with this hook. While we might do this at Sentry, I heavily doubt that this is going into Otel for example, which afaik doesn't provide much framework-specific instrumentation.

pzuraq commented 1 year ago

Yeah I agree, that solution would be the most "be like the platform" one, and also would make sure that client-side load can work with any arbitrary tooling that happens to be out there that patches fetch. I dislike monkeypatching in general like this, but it is the common pattern, so I think enabling it while figuring out a better handleFetch story make sense.

jdgamble555 commented 11 months ago

The window fetch though is different than SvelteKit's fetch. I would think the simple answer would be to add the ability to just use hooks.ts like page.ts:

hooks.server.ts
hooks.ts

import type { HandleFetch } from '@sveltejs/kit';

export const handleFetch = (async ({ request, fetch }) => {
    ...
    return fetch(request);
}) satisfies HandleFetch;

If it is server, it calls hooks.server.ts, if it is client, it calls hooks.ts. I honestly thought it worked like that, when it didn't I found this thread...

J

sbscan commented 6 months ago

sveltekit needs client side handlefetch which is interceptor in axios. Svelte makes our life easier, but not this time. It's very simple with axios interceptors. We should have some control over client fetchs including page loads.

gemue-parndt commented 1 month ago

This would be a really nice feature - on frontend only we can use the proxy of nginx or caddy or whatever the website is hosted with. But in case we wanna use SvelteKit we can't proxy the client side fetches

sveltejs / kit