Stories.json generation mechanism

tmeasday commented 3 years ago

Let's prototype a stories.json generator.

Here's an approximate design:

On startup (node)

Take a main.js:stories glob, e.g. './components/**/*.stories.*'
Transform each matching file with a CSF => stories.json transform[1]
Combine the data into a stories.json file (see examples in this repo)
Publish the data (in development) on /stories.json

After startup (node)

Setup a filewatcher (preferably the same one webpack uses) to watch the glob.
Whenever the glob has changes, recompute steps 2-3 (ideally with some cache)
Provide a websocket on /stories.json.socket that sends a ping every time the generated file changes.

In "storybook runtime"

Fetch /stories.json rather than taking it as an argument here: https://github.com/tmeasday/storybook-skeleton/blob/8f16448460c346af600ad5ab042f29f93bb6be6f/src/storybook/index.js#L6-L6
Open websocket and refetch the data whenever a ping is received.

Don't stress too much about step 7, I think @ndelangen has ideas on adding node to SB's channel API and I'd like to discuss it with him a bit for the actual implementation. Also we may fetch the data in the manager directly later but for now this way will prove the concept, as the skeleton doesn't include a manager implementation.

[1] @shilman - can you provide @bebraw with what he needs for this part please?

shilman commented 3 years ago

Current dev server code:

https://github.com/storybookjs/storybook/blob/db4901a052ce99211a7bf2c1a5eed83b58ee89e1/lib/core-server/src/dev-server.ts#L41

Current production build code:

https://github.com/storybookjs/storybook/blob/dea23e5e9a3e7f5bb25cb6520d3011cc710796c8/lib/core-server/src/build-static.ts#L70-L77

And the real API usage:

https://github.com/storybookjs/storybook/blob/dea23e5e9a3e7f5bb25cb6520d3011cc710796c8/lib/core-server/src/utils/stories-json.ts#L44-L51

tmeasday commented 3 years ago

@bebraw as discussed, I am not too concerned about the integration of the stories list (stories.json) into the manager or preview of SB at this stage, as I am mid way through refactoring the relevant code in both places.

So the most relevant steps are 1-7 above. Do whatever's easiest for you to test it out, whether that's integrating into the skeleton, or just building a simple HTML page that fetches /stories.json and watches the websocket.

What I'd like to focus on for 6.4, is nailing down the details of the generation as discussed above. Questions that I would like to see answered:

How long does it take to generate a stories list as a function of the number of stories files using the static technique that @shilman linked above?
Based on the info that watchpack gives us when a CSF file is added/deleted/modified, how quickly can we regenerate the stories list -- can we do it in a O(1) way using a cache on the unmodified files? If not how long does a regeneration take as a function of the number of CSF files.
Do the times for 1&2 above change when there are many non-CSF files within the watchpack root path / project root (say for instance 300k extra files as in my webpack repro)?

Some other smaller things that have come up so far that we should work in also:

What is the websocket protocol we are going to use? I'm guessing something super simple?
Can we ensure in the stories.json generation that the same story id is not used more than once? If so what should we do if it is? We could output an error in the CLI, although that might not get seen. We could add an errors field in the stories.json format, and show it in the UI. WDYT?
Similarly a CSF file with no exports should be flagged up.
We also need to sort the stories; the correct order needs to be known in both the manager and the preview. So I think the metadata server will need to ensure the stories in the list are in the right order. I'm not quite sure of the best way to execute custom story sort functions w/o dynamic CSF parsing though (i.e. executing them in node/JSDOM).

We probably also want to consider the "dynamic" technique for 6.4 as an opt-in as it is more flexible, but likely slower. There is a bit of thinking to do there though.

bebraw commented 3 years ago

I added a basic WebSocket server and client to #28 in order to find some answers to your questions. Initial comments below.

How long does it take to generate a stories list as a function of the number of stories files using the static technique that @shilman linked above?

On my system against design-system there's roughly 120 ms (140 ms when through ws) cost to calculating stories.json. That said, I am running on a fairly modern eight core system (2019, 16" mbp) so I imagine it can be some multiple of that on weaker systems. Likely this depends on Node version as well (I'm running on 12 so a bit behind).

Likely we'll want to test this against the biggest repository we can find. Do you have a good public example in mind?

Based on the info that watchpack gives us when a CSF file is added/deleted/modified, how quickly can we regenerate the stories list -- can we do it in a O(1) way using a cache on the unmodified files? If not how long does a regeneration take as a function of the number of CSF files.

That's a great point as watchpack gives us the changed path (or multiple if we run it in a batched mode). Updating only the changed entries sounds like a good experiment to make. I imagine that will bring the cost of stories.json generation low on update. It's something I'll do in the next development pass now that the foundation is there.

Do the times for 1&2 above change when there are many non-CSF files within the watchpack root path / project root (say for instance 300k extra files as in my webpack repro)?

That's another good question.

I guess ideally we would find some way to measure the delay from touching a file in the file system to watchpack triggering a call at the websocket server. We can get some initial results manually here by logging.

What is the websocket protocol we are going to use? I'm guessing something super simple?

I went with the standard. The implementation is built on top of https://www.npmjs.com/package/ws.

Can we ensure in the stories.json generation that the same story id is not used more than once? If so what should we do if it is? We could output an error in the CLI, although that might not get seen. We could add an errors field in the stories.json format, and show it in the UI. WDYT?

Likely this is something that should live in the stories.json generation algorithm. How should this case be communicated to the client? On a first thought, likely we'll have to specify a frame ({ error: undefined, warning: undefined, payload: { ... } | undefined }) that can contain this kind of data and that the client can parse when receiving a message.

Similarly a CSF file with no exports should be flagged up.

It sounds like a similar problem. I think the algorithm can be improved to detect this case.

We also need to sort the stories; the correct order needs to be known in both the manager and the preview. So I think the metadata server will need to ensure the stories in the list are in the right order. I'm not quite sure of the best way to execute custom story sort functions w/o dynamic CSF parsing though (i.e. executing them in node/JSDOM).

Same. Does the requirement to sort in server-side come from static rendering? For development only, we can probably skip the problem by sorting only in the frontend. If sorting requires something like JSDOM, then I would leave the heavy operation for static generation only.

The problems 5-7 feel separate from the first four as they are specific to stories.json.

tmeasday commented 3 years ago

Likely we'll want to test this against the biggest repository we can find. Do you have a good public example in mind?

Hmm, let @shilman and I try and come up with something.

Same. Does the requirement to sort in server-side come from static rendering? For development only, we can probably skip the problem by sorting only in the frontend.

We need the stories in sorted order both on the manager (it lists them) and the preview (it needs to know how to pick the first story). Trying to include the sort function in both bundles is going to be a pain, so it seems logical to do the sorting in the metadata server/node context if possible.

I suspect there is going to be a back-compat problem with the storySort function in any case @shilman

bebraw commented 3 years ago

We need the stories in sorted order both on the manager (it lists them) and the preview (it needs to know how to pick the first story). Trying to include the sort function in both bundles is going to be a pain, so it seems logical to do the sorting in the metadata server/node context if possible.

Ok, we can pass a sort data structure to the client after figuring out the order at the server-side. I don't understand the sorting problem in detail so maybe Michael can comment on this. It's an independent problem anyhow.

tmeasday commented 3 years ago

The sorting problem can be understood if you read Storybook's documentation on the feature.

If we were to try and do the sorting statically[1], we do the following:

Parse preview.js to an AST, and pull out an export called parameters, check for the field options.storySort.
If that field is an object, we can convert the AST into the equivalent object in node, and apply SB's current implementation to our story list.
If that field is a function, things are a bit trickier. In this case, we should just eval the function's source and hope it doesn't use anything from the outer scope[2], then pass in the story list we've created.

[1] i.e. without trying to evaluate preview.js in node, which is a can of worms.

[2] This is a breaking change, but so is the fact that the story list doesn't contain any parameters, which it is quite possible a storySort function expects. I think @shilman and I are comfortable adding these restrictions long term (a storySort function must be serializable and can only access id, name, kind), but this will definitely factor into our back compat story for 6.4.

shilman commented 3 years ago

I'm sure people will howl about that (id, name, kind) restriction, but let's see whether we can pull it off first and then look at how people react and perhaps whether we can improve upon this

bebraw commented 3 years ago

I added a fast demo of AST based analysis for storySort at #28. It's also doing some kind of eval for the function case (only arrow for now) to show how it might go. Most likely there are corner cases missing.

tmeasday / storybook-skeleton