podium-lib / client

Client for fetching Podium component fragments over HTTP.
MIT License

feat: receive assets from podlet as 103 early hints #410

Closed digitalsadhu closed 1 week ago

digitalsadhu commented 3 weeks ago

Draft PR for early feedback. I'm interested in feedback on all areas but especially the API surface which currently looks like this:

const headerClient = layout.client.register(...);

// client now event emits assets as they are received. 
headerClient.on('assets', assets);
...
const response = headerClient.fetch(...);

I'm not sure if this is suitable since the user will need access to assets scoped specifically to the request they are making and within the request handler context they are in...

const headerClient = layout.client.register(...);

// client now event emits assets as they are received. 
headerClient.on('assets', assets);

app.get("/", async (req, res) => {

  // don't have access to assets here
  // could do headerClient.once('assets', assets); perhaps.
  // still, I'm not sure the scoping we want/need is there.

  const response = await headerClient.fetch(...);
});

Another, less elegant API would be to add a callback to the fetch method options:

const headerClient = layout.client.register(...);

app.get("/", async (req, res) => {
  const response = await headerClient.fetch(incoming, {
    onAssets(assets) {
      res.writeEarlyHints({
        // in practice you'd need to rewrite the assets to preloads in the layout rather than just forward like this.
        link: assets.map(asset => asset.toHeader()).join(',')
      })
    }
  });
});

One final idea from me is that the emitted event could be scoped to each request (which is probably not feasible since it would likely be breaking)

const headerClient = layout.client.register(...);

app.get("/", async (req, res) => {
  // named `response` so it does not shadow the handler's `res`
  const response = headerClient.fetch(incoming);
  response.on('assets', assets);
  const content = await response.content();
});

EDIT:

Reworked the PR so that:

  1. Early hints are received from the podlet, parsed and then resent to the browser as preload early hints
  2. the assets received from the podlet via early hints are used in the resource.fetch response

wkillerud commented 3 weeks ago

Saving the notes from our sketching to reduce boilerplate, going in the callback direction.

// happy case
app.get("/", async (req, res) => {
  const incoming = res.locals.podium;

  const { header, body, footer } = await layout.client.fetch(incoming, {
    header: {
      foo: 'bar',
    },
    body: true,
    footer: true,
  });
  // layout.client.fetch should do this for us :point-down:
  layout.podlets = [header, body, footer];

  return res.podiumSend(`${header}${body}${footer}`);
});

// scrappy case
app.get("/", async (req, res) => {
  const incoming = res.locals.podium;

  const { header, body, footer } = await layout.client.fetch(incoming, {
    header: condition ? {
      foo: 'bar',
    } : false,
    body: true,
    footer: true,
  }, {
    onEarlyHints(links) {
      res.whatever(links);
    },
  });
  // layout.client.fetch should do this for us :point-down:
  layout.podlets = [header, body, footer];

  return res.podiumSend(`${header}${body}${footer}`);
});
wkillerud commented 3 weeks ago

Some more notes, array variant.

// happy case
app.get("/", async (req, res) => {
  const incoming = res.locals.podium;

  const { header, body, footer } = await layout.client.fetch(incoming, {
    [headerClient.name]: true,
    broadcast: true,
    consents: {
      query: {
        userAgent: req.header("User-Agent"),
        notLoggedInUserId: req.cookies && req.cookies.USERID,
      },
    },
    footer: true,
  });

  return res.podiumSend(`${header}${body}${footer}`);
});

// array case
app.get("/", async (req, res) => {
  const incoming = res.locals.podium;

  const { header, body, footer } = await layout.client.fetch(incoming, [
    headerClient.name,
    "broadcast",
    [
      "consents",
      {
        query: {
          userAgent: req.header("User-Agent"),
          notLoggedInUserId: req.cookies && req.cookies.USERID,
        },
      },
    ],
    "footer",
  ]);

  return res.podiumSend(`${header}${body}${footer}`);
});
digitalsadhu commented 3 weeks ago

Reworked the PR so that:

  1. Early hints are received from the podlet, parsed and then resent to the browser as preload early hints
  2. the assets received from the podlet via early hints are used in the resource.fetch response
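
For illustration only, the "received and parsed" step might look roughly like the sketch below. It assumes the podlet sends its assets in a `Link` header on the 103 response; `parseLinkHeader` is a hypothetical helper rather than anything in the client, and the parsing is deliberately simplified (it does not handle commas or semicolons inside URLs or quoted values).

```javascript
// Hypothetical helper, not part of the Podium client API.
// Simplified on purpose: assumes no commas or semicolons inside URLs or quoted values.
function parseLinkHeader(header) {
  return header
    .split(',')
    .map((part) => part.trim())
    .filter(Boolean)
    .map((part) => {
      const [target, ...params] = part.split(';').map((piece) => piece.trim());
      const asset = { href: target.replace(/^<|>$/g, '') };
      for (const param of params.filter(Boolean)) {
        const eq = param.indexOf('=');
        if (eq === -1) {
          asset[param] = true; // valueless parameter, e.g. `crossorigin`
        } else {
          const key = param.slice(0, eq).trim();
          asset[key] = param.slice(eq + 1).trim().replace(/^"|"$/g, '');
        }
      }
      return asset;
    });
}

const parsedAssets = parseLinkHeader(
  '</js/app.js>; rel=preload; as=script, </css/app.css>; rel="preload"; as=style',
);
```

Each parsed entry keeps the target as `href` and the link parameters as plain properties, which would then be enough to build both the preload hints and the assets on the fetch response.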
trygve-lie commented 3 weeks ago

So Early Hints is kind of an idea which I have put in the box for the next major release of Podium. In other words: breaking. We can probably make it non-breaking, but it will require some code for handling backward compatibility. And we need to think about the order in which to roll this out.

These are thoughts I've had on this topic for a while:

The main reason to use Early Hints between the layout and podlets is to be able to remove the complex logic for matching the assets of a podlet with the podlet's content when a podlet is re-deployed. There is some pretty nasty retry logic going on to deal with this as it is now, and being able to hang a podlet's assets on the same request as the podlet's content, instead of having them in the manifest, makes it possible for us to remove all that complexity.

With that in mind:

First of all, podlets must send assets as Early Hints asap when they get a request. I think that sending the assets as Early Hints is something the middleware in the podlet does. It's not something the developer deals with. So I imagine the API surface like so (pretty much untouched):

const podlet = new Podlet({
  name: 'myPodlet',
  version: '1.0.0',
});

podlet.js({ value: 'https://...../file.js' });

app.register(fastifyPodlet, podlet);

app.get(podlet.content(), async (request, reply) => {
  reply.podiumSend('<h2>Hello world</h2>');
});

Currently, assets in Podium are handed to the layout through the manifest. From the perspective of performance and sending Early Hints to the browser this is actually ideal, since we can always send off the Early Hints headers and start pushing out the HTML document even before the layout has started requesting the podlets. This will always be maximum speed.

But by going with Early Hints between the podlets and layouts we kind of lose that a bit. We will have to make a request and get headers from the podlets before we can start sending Early Hints headers to the browser. There is no way around that.

That's the reason why it's important that podlets send Early Hints off asap. By doing so we minimize the time before we start sending something to the browser from the layout. We lose a little bit compared to keeping the assets in memory at all times, but I think that's fine compared to how things are done on average today.

In the layout I also imagine that sending off Early Hints to the browser is abstracted away from the developer. It's just done under the hood.

I kind of think that the API surface in a layout is not any different than it is today, to be honest. Though under the hood I think the following needs to happen:

Let's say a layout has 4 podlets registered. When a request comes in and the .fetch() method is called, we count how many podlets are fetched (in our case 4). For each Early Hint we get (one for each podlet) we push an Early Hint from the layout to the browser and keep track of how many Early Hints we have gotten (we can assume Early Hints are emitted only once per podlet).

For each Early Hint we get, we also push the assets to an array of assets (there is one internally in the layouts).

When the number of Early Hints we have gotten matches the number of .fetch() calls there were (in our case 4), we know that we have gotten all assets from all podlets. At this point we can safely start sending the HTML document (and we can choose to do HTML streaming too if we want).

The HTML document will then use the array of assets we built up to produce the needed links to the assets in the document.
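
The counting described above could be sketched roughly like this. `EarlyHintCollector` and its `onHint` / `onComplete` callbacks are hypothetical names made up for this sketch, not anything that exists in the layout today, and it assumes exactly one Early Hint per podlet.

```javascript
// Hypothetical per-request collector; the class and callback names are illustrative only.
class EarlyHintCollector {
  constructor(expected, { onHint, onComplete }) {
    this.expected = expected; // number of podlets fetched for this request
    this.received = 0;
    this.assets = [];
    this.onHint = onHint;
    this.onComplete = onComplete;
  }

  // Call once per podlet when its Early Hint arrives.
  receive(podletName, podletAssets) {
    this.received += 1;
    this.assets.push(...podletAssets);
    this.onHint(podletName, podletAssets); // e.g. forward a hint to the browser
    if (this.received === this.expected) {
      this.onComplete(this.assets); // safe to start sending the document
    }
  }
}

const log = [];
const collector = new EarlyHintCollector(2, {
  onHint: (name) => log.push(`hint:${name}`),
  onComplete: (all) => log.push(`complete:${all.length}`),
});
collector.receive('header', [{ value: '/header.js' }]);
collector.receive('footer', [{ value: '/footer.js' }, { value: '/footer.css' }]);
```

A real implementation would also need a timeout or error path for podlets that never send a hint, which this sketch leaves out.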

This does not cater for sending different asset references per brand from a podlet. But I am not sure we need that from a podlet.

In our case, assets are built once, uploaded to our CDN once, and available on different hosts. In other words, the assets and the path structure they live at are identical for all brands, but the host is different. We should be able to have one reference for assets in a podlet and cater for different brands in the layout.

Something like this could be done when the layout gets the Early Hints from the podlets:

const source = new URL('https://assets.finn.no/js/script.js');
const blocket = new URL(source.pathname, 'https://assets.blocket.se/');
const tori = new URL(source.pathname, 'https://assets.tori.fi/');
const dba = new URL(source.pathname, 'https://assets.dba.dk/');
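
Wrapped up as a small function, the same idea might look like the sketch below. `rebaseAsset` is a hypothetical name used only here; it relies on nothing beyond the standard `URL` class and assumes the path (and query string, if any) is identical across brand hosts.

```javascript
// Hypothetical helper: keep the asset path and query, swap the host for the brand's CDN.
function rebaseAsset(assetUrl, brandOrigin) {
  const source = new URL(assetUrl);
  return new URL(source.pathname + source.search, brandOrigin).href;
}

const blocketUrl = rebaseAsset('https://assets.finn.no/js/script.js', 'https://assets.blocket.se/');
// blocketUrl is 'https://assets.blocket.se/js/script.js'
```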

If we really need to set different assets per request in the podlets, I kind of imagine an API something like this:

app.get(podlet.content(), async (request, reply) => {
  reply.podiumAssets([{ value: 'https://...../file.js' }]);

  reply.podiumSend('<h2>Hello world</h2>');
});
digitalsadhu commented 3 weeks ago

> So Early Hints is kind of an idea which I have put in the box for the next major release of Podium. In other words: breaking. We can probably make it non-breaking, but it will require some code for handling backward compatibility. And we need to think about the order in which to roll this out.

> These are thoughts I've had on this topic for a while:

> The main reason to use Early Hints between the layout and podlets is to be able to remove the complex logic for matching the assets of a podlet with the podlet's content when a podlet is re-deployed. There is some pretty nasty retry logic going on to deal with this as it is now, and being able to hang a podlet's assets on the same request as the podlet's content, instead of having them in the manifest, makes it possible for us to remove all that complexity.

> With that in mind:

> First of all, podlets must send assets as Early Hints asap when they get a request. I think that sending the assets as Early Hints is something the middleware in the podlet does. It's not something the developer deals with. So I imagine the API surface like so (pretty much untouched):

This is now the case on the Podlet next branch/channel after this PR was merged https://github.com/podium-lib/podlet/pull/410

const podlet = new Podlet({
  name: 'myPodlet',
  version: '1.0.0',
});

podlet.js({ value: 'https://...../file.js' });

app.register(fastifyPodlet, podlet);

app.get(podlet.content(), async (request, reply) => {
  reply.podiumSend('<h2>Hello world</h2>');
});

> Currently, assets in Podium are handed to the layout through the manifest. From the perspective of performance and sending Early Hints to the browser this is actually ideal, since we can always send off the Early Hints headers and start pushing out the HTML document even before the layout has started requesting the podlets. This will always be maximum speed.

> But by going with Early Hints between the podlets and layouts we kind of lose that a bit. We will have to make a request and get headers from the podlets before we can start sending Early Hints headers to the browser. There is no way around that.

> That's the reason why it's important that podlets send Early Hints off asap. By doing so we minimize the time before we start sending something to the browser from the layout. We lose a little bit compared to keeping the assets in memory at all times, but I think that's fine compared to how things are done on average today.

> In the layout I also imagine that sending off Early Hints to the browser is abstracted away from the developer. It's just done under the hood.

> I kind of think that the API surface in a layout is not any different than it is today, to be honest. Though under the hood I think the following needs to happen:

> Let's say a layout has 4 podlets registered. When a request comes in and the .fetch() method is called, we count how many podlets are fetched (in our case 4). For each Early Hint we get (one for each podlet) we push an Early Hint from the layout to the browser and keep track of how many Early Hints we have gotten (we can assume Early Hints are emitted only once per podlet).

> For each Early Hint we get, we also push the assets to an array of assets (there is one internally in the layouts).

> When the number of Early Hints we have gotten matches the number of .fetch() calls there were (in our case 4), we know that we have gotten all assets from all podlets. At this point we can safely start sending the HTML document (and we can choose to do HTML streaming too if we want).

> The HTML document will then use the array of assets we built up to produce the needed links to the assets in the document.

> This does not cater for sending different asset references per brand from a podlet. But I am not sure we need that from a podlet.

> In our case, assets are built once, uploaded to our CDN once, and available on different hosts. In other words, the assets and the path structure they live at are identical for all brands, but the host is different. We should be able to have one reference for assets in a podlet and cater for different brands in the layout.

> Something like this could be done when the layout gets the Early Hints from the podlets:

const source = new URL('https://assets.finn.no/js/script.js');
const blocket = new URL(source.pathname, 'https://assets.blocket.se/');
const tori = new URL(source.pathname, 'https://assets.tori.fi/');
const dba = new URL(source.pathname, 'https://assets.dba.dk/');

The trouble with this as I see it is that it's already too late by this stage. I think we want to send off preload early hints from the layout to the browser the second they are received from each podlet rather than wait and gather them up. As I understand it, multiple early hints can be sent for one response, and there's nothing in the way of just sending an early hint for each podlet. E.g. if we have 4 podlets, the layout will receive an early hint for each, immediately transform the hint to asset preload hints and forward them on to the browser (this is what this PR currently does).

It's these preload tags, which need to go out before the user has any chance to adjust them, that make the replacement solution tricky.

I agree with what you've said above that we should then also gather up those early hinted podlet assets and, once all 4 have been gathered, start sending the document, placing CSS assets in the head while still waiting for the rest of the podlet content to arrive.
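
As a rough sketch of the "transform and forward immediately" step: the `{ value, type }` asset shape and the helper names `toPreloadLink` / `forwardHint` are assumptions made for this example, not what the PR implements. `res.writeEarlyHints` is the Node.js `http.ServerResponse` method (Node 18.11+), which accepts a `link` string or array of strings.

```javascript
// Hypothetical helpers. The `{ value, type }` asset shape is an assumption for this sketch.
function toPreloadLink(asset) {
  const as = asset.type === 'css' ? 'style' : 'script';
  return `<${asset.value}>; rel=preload; as=${as}`;
}

// `res` is expected to be a Node.js http.ServerResponse (writeEarlyHints needs Node 18.11+).
function forwardHint(res, podletAssets) {
  if (podletAssets.length === 0) return; // nothing to hint
  res.writeEarlyHints({ link: podletAssets.map(toPreloadLink) });
}
```

Calling `forwardHint` once per podlet, as each podlet's hint arrives, results in one 103 response per podlet rather than a single gathered one.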

> If we really need to set different assets per request in the podlets, I kind of imagine an API something like this:

app.get(podlet.content(), async (request, reply) => {
  reply.podiumAssets([{ value: 'https://...../file.js' }]);

  reply.podiumSend('<h2>Hello world</h2>');
});

I like the look and feel of this API, just wonder how much we'd be sacrificing by having it this late (current podlet solution does it as part of the podlet middleware). This could be an issue with things like user token middleware that sends off a request to an external service and would happen before the early hints.

digitalsadhu commented 3 weeks ago

Another issue I've spotted is that we currently provide a scope feature that allows a podlet to send over assets for content or assets for fallbacks, and it's entirely possible that the podlet may send these assets using early hints before choking and dying on the content request. Inside the layout, therefore, we can't know which assets to filter by scope until it's too late. In other words, we have to wait for the content route response to see if it succeeded or failed before we can decide whether assets should be content assets or fallback assets. We can't receive the assets, gather them up, generate preload hints, etc. until the fetch has completed. This totally defeats the purpose of early hints. Any way around this? One drastic option is to drop asset scopes altogether in favour of the perf win early hints and streaming can give us. This would mean podlets could no longer provide a different set of assets for content and fallback.
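
To make the ordering problem concrete, here is a minimal sketch of scope filtering. `selectAssets` is a hypothetical function, and the `scope` values (`'content'`, `'fallback'`, `'all'`) are assumed for the example. The point is that the second argument is only known after the content request has settled, which is later than when the hints arrive.

```javascript
// Hypothetical sketch. The `scope` values ('content', 'fallback', 'all') are assumed here.
function selectAssets(hintedAssets, contentSucceeded) {
  const wanted = contentSucceeded ? 'content' : 'fallback';
  return hintedAssets.filter(
    (asset) => asset.scope === undefined || asset.scope === 'all' || asset.scope === wanted,
  );
}

const hinted = [
  { value: '/content.js', scope: 'content' },
  { value: '/fallback.js', scope: 'fallback' },
  { value: '/shared.css', scope: 'all' },
];

// Only callable once the content request has succeeded or failed:
const forContent = selectAssets(hinted, true);
const forFallback = selectAssets(hinted, false);
```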

If we were to do away with content|fallback scopes, we could use the res.podiumAssets method in the content and the fallback routes to differentiate the 2 scopes instead.

app.get(podlet.content(), async (request, reply) => {
  reply.podiumAssets([{ value: 'https://...../content.js' }]);

  reply.podiumSend('<h2>Hello content</h2>');
});

app.get(podlet.fallback(), async (request, reply) => {
  reply.podiumAssets([{ value: 'https://...../fallback.js' }]);

  reply.podiumSend('<h2>Hello fallback</h2>');
});

Though even then, it doesn't really solve the issue. I can't see a way to start streaming the document early if one of the podlets might fail and suddenly need to swap which assets it wants placed in the browser. No scopes and same assets for content and fallback seems the only real solution to me.

trygve-lie commented 3 weeks ago

> The trouble with this as I see it is that it's already too late by this stage.

I don't think so. Or, I don't see how that is too late.

> I think we want to send off preload early hints from the layout to the browser the second they are received from each podlet rather than wait and gather them up. As I understand it, multiple early hints can be sent for one response, and there's nothing in the way of just sending an early hint for each podlet. E.g. if we have 4 podlets, the layout will receive an early hint for each, immediately transform the hint to asset preload hints and forward them on to the browser (this is what this PR currently does).

From the Layout to the browser we want to send off Early Hints as fast as possible. This is how I see it (given 4 podlets):

Podlets ALWAYS send one Early Hint. They send all their assets in one go. The main purpose of podlets sending an Early Hint is that it's our protocol for communicating assets to a Layout.

When a request comes in to a Layout, the Layout makes a request to all 4 Podlets. Podlet 1 is fastest and sends an Early Hint with all its assets. The Layout receives this. On receiving it, the Layout transforms the URLs based on the brand, roughly like so:

const source = new URL('https://assets.finn.no/js/script.js');
const dest = new URL(source.pathname, 'https://assets.blocket.se/');

Then the Layout writes an Early Hint with preload tags and sends that off to the browser. When that's done it stores the asset references in an Array.

Podlet 3 then responds with an Early Hint. The Layout repeats the above steps: transforming the asset references, writing a second Early Hint with preload tags to the browser, and then storing the asset references in the Array of asset references.

This is repeated until all Podlets have responded with Early Hints. When all Podlets have done so, the Layout can continue to send the HTML document. If we want to stream the document we can do so, since we have an Array with the asset references which we need to build the HTML references in the document. Content from each Podlet can then be pushed on the body stream from the Layout to the browser as each Podlet responds with its body.

> It's these preload tags, which need to go out before the user has any chance to adjust them, that make the replacement solution tricky.

I don't think this is something the end user should manually do. We are probably missing an API (or config) for defining that assets can live over at a CDN (like assetPrefix or something similar, which a lot of other systems have).

> I like the look and feel of this API, just wonder how much we'd be sacrificing by having it this late (current podlet solution does it as part of the podlet middleware).

The danger with this is that developers can do something like this:

app.get(podlet.content(), async (request, reply) => {

  const data = await fetch('http://super-slow-service.com');

  // A lot of other slow stuff

  reply.podiumAssets([{ value: 'https://...../file.js' }]);
  reply.podiumSend('<h2>Hello world</h2>');
});

If they don't, one is not sacrificing anything.

trygve-lie commented 3 weeks ago

> Another issue I've spotted is that we currently provide a scope feature that allows a podlet to send over assets for content or assets for fallbacks, and it's entirely possible that the podlet may send these assets using early hints before choking and dying on the content request. Inside the layout, therefore, we can't know which assets to filter by scope until it's too late.

I'm not sure this is a problem. I imagine that the podlet also sends its assets as Early Hints. Fallbacks are hard-cached content we hold in the Layouts until there is an indication that a new fallback should be fetched. Fallbacks are meant to be as static as possible and to act as a fallback when stuff goes wrong, so we hold them in memory in the Layout and do not re-fetch them on each request. They have other update logic bound to them. Given this, fallbacks should hold their related asset references.

If we're adding HTML streaming into the loop, then the fallback might not be a fallback as it is today, where it's alternative content we show when (in other words, after) something fails. If we throw HTML streaming into the mix, the fallback is more of a placeholder which is shown until the content is streamed into place. In this case the first thing one wants to show is the placeholder; then, when the content is ready, it replaces the placeholder. If the content fails, it will just never replace the placeholder. There might be an event which alters the placeholder to show a fetch error or something, but that logic should belong in the placeholder.

In the case of NO HTML streaming, I don't think there is any value in pre-sending the assets of the fallback to the browser. If the content of one podlet fails, I think we should pivot to the fallback and then include the assets for the fallback in the document. With the dance I've described further up, by the time a content route fails we are past the point where Early Hints are written to the browser, and we are at a point where the fallback assets can be added without trouble.

Yes, we have passed on a content asset to the browser as an Early Hint which the browser will not use (because content failed), and the fallback assets were not pre-sent as Early Hints, but I don't see that as super bad when stuff is erroring.

In the case of HTML streaming we already have the fallback (now placeholder) asset references in memory, so they can be forwarded to the browser asap. Then the podlet Early Hints arrive, we can add the placeholder (fallback) content into the HTML document, and when the podlet content arrives it's pushed on the stream, replacing the placeholders.

In the case of a podlet's content request failing, the logic stands, but the placeholder is not replaced and we might have pushed one content asset reference too many as an Early Hint to the browser. But yet again, it's an error scenario, so I really don't see how bad it is as we are actually erroring.

> I can't see a way to start streaming the document early if one of the podlets might fail and suddenly need to swap which assets it wants placed in the browser.

I think the path to doing this is a little bit different than what you think. Maybe we should try to draw a flowchart on this?

digitalsadhu commented 2 weeks ago

Phase 1.

Non-breaking. 103 early hints from podlets to layouts only. No early hints to the browser.

  1. Content route in podlet should send assets as early hints
  2. Fallback route in podlet should send assets as early hints
  3. Client should replace cached manifest assets for content with 103 early hint assets from the podlet
  4. Client should cache fallback assets in memory and set these with fallback responses
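
A minimal sketch of how the client-side rules might combine, under assumptions: `resolveAssets` and its parameter names are hypothetical, and falling back to the manifest assets when a podlet sent no early hints is my own guess at how older podlets would stay supported, not something decided here.

```javascript
// Hypothetical sketch of the phase 1 client rules; names and shapes are illustrative only.
function resolveAssets({ manifestAssets, hintedAssets, fallbackAssets, usedFallback }) {
  // Fallback responses get the fallback assets cached in memory.
  if (usedFallback) return fallbackAssets;
  // Content responses prefer assets received via 103 early hints.
  // Assumption: a podlet that sent no hints keeps using its manifest assets.
  return hintedAssets.length > 0 ? hintedAssets : manifestAssets;
}

const chosen = resolveAssets({
  manifestAssets: [{ value: '/old.js' }],
  hintedAssets: [{ value: '/new.js' }],
  fallbackAssets: [{ value: '/fallback.js' }],
  usedFallback: false,
});
// chosen holds the hinted asset, '/new.js'
```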

Screenshot 2024-09-02 at 11 22 54

digitalsadhu commented 2 weeks ago

Phase 2 design (breaking changes)

  1. Early hints to the browser
  2. HTML streaming support
  3. Circuit breaking

Early hints

This will be a case of connecting the last part of early hints and sending out 103 early hints with asset preloads to the browser.

With Early Hints - success case

Screenshot 2024-09-02 at 12 18 19

With Early Hints - a podlet fails case

Screenshot 2024-09-02 at 12 18 32

HTML streaming support

We want to make it possible to stream the parts of the document that are ready out as soon as we can for perf reasons. The document head first, then a page with skeleton screens, then replace the skeleton screens with either content or fallback based on podlet response.

Screenshot 2024-09-02 at 11 33 39

Circuit breaking

Circuit breaking should be implemented between the layout and podlets so that when a podlet starts failing, the podlet's fallback and assets can be returned at once. A configurable percentage of requests can be allowed through to check if the podlet has recovered.
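
A very small sketch of that idea, with everything in it assumed for illustration: the `CircuitBreaker` class, the consecutive-failure threshold and the `probeRatio` option are hypothetical names, and a real breaker would likely use time windows rather than a plain counter.

```javascript
// Hypothetical circuit breaker sketch; option names and the failure counter are assumptions.
class CircuitBreaker {
  constructor({ failureThreshold = 5, probeRatio = 0.1, random = Math.random } = {}) {
    this.failureThreshold = failureThreshold; // consecutive failures before opening
    this.probeRatio = probeRatio; // share of requests let through while open
    this.random = random; // injectable for testing
    this.failures = 0;
  }

  get open() {
    return this.failures >= this.failureThreshold;
  }

  // true: request the podlet. false: serve the cached fallback and its assets at once.
  allowRequest() {
    if (!this.open) return true;
    return this.random() < this.probeRatio;
  }

  recordSuccess() {
    this.failures = 0; // podlet has recovered, close the circuit
  }

  recordFailure() {
    this.failures += 1;
  }
}
```

The layout would call `allowRequest()` before each podlet fetch and report the outcome with `recordSuccess()` or `recordFailure()`, so a recovered podlet closes the circuit on its first successful probe.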

Screenshot 2024-09-02 at 12 34 58

digitalsadhu commented 1 week ago

Replaced by https://github.com/podium-lib/client/pull/417