nksaraf / vinxi

The Full Stack JavaScript SDK
https://vinxi.vercel.app
MIT License
2.14k stars 81 forks source link

"use server" chunks are duplicated in build #51

Open katywings opened 10 months ago

katywings commented 10 months ago

Is: If you create a file with "use server" and use it in a component/route file, the file code will be duplicated to two chunks. It will be in server/chunks/build/someRoute.mjs and also server/chunks/nitro/node-server.mjs.

Should: Ideally it should only be in the node-server chunk and from there exported. The someRoute chunk should only import the functions from node-server.

nksaraf commented 10 months ago

Yeah there are some of these inefficiencies in the bundling right now but that's because I have kept correctness the priority. Its there in someRoute because it could be called from the Component itself, so its part of that server's bundle. It's part of node-server.mjs because it can be called from the internet as well as a server function request.

Right now the way the builds work, the individual parts of the app ((ssr + server functions)) are built first (thus can have duplicated parts). Then the main server build takes the output of the individual parts and bundles them together. By that time its usually too late to dedupe.

We will work on better bundling strategies, but for now this is a decent compromise for the simplicity in the build process it enables. And since the output is for the server, the effect's usually not that large because the library code is not duplicated.

katywings commented 10 months ago

I wouldnt call this just an inefficiency, it completely breaks the assumptions one makes when writing js code and will likely lead to a buggy feeling experience for developers 🙈. Let me know if I can help with coming up with better bundling strategies 😊.

nksaraf commented 10 months ago

actually the assumptions one should make while writing a Vinxi app (or apps designed for the serverless world) is that the JS global namespace is not as meaningful. Most like everything is running on different instances. This could later be scaled to different routes on different serverless functions, or ssr/server-functions on different lambda functions.

To make room for these optimizations later, I want people to break the impression that they should assume that there is some identity of these functions as a normal JS functions. For both the server and the client, you should treat as special remote code.

But regardless, yesss always looking for help in coming up with better bundling strategies. I will probably to have to write about the current strategy then.

katywings commented 10 months ago

actually the assumptions one should make while writing a Vinxi app (or apps designed for the serverless world)

Maybe I am getting old under the young 😅.. Just my personal viewpoint of course, but i mostly wanna write selfhosted nodejs web servers. I came from php, which basically was exactly what you are describing "serverless" / "no global state", to nodejs because i like the flexibility of global state and long running processes, so serverless for me feels like a step backwards 😅.

But I also get your (and probably Ryans) point of view to focus on serverless 😉.

But regardless, yesss always looking for help in coming up with better bundling strategies

Awesome 🥳! If there was a way to dedupe the code on build, would you like to do the deduping regardless of serverless/nodejs target or would you only want it for nodejs? 🤔

nksaraf commented 10 months ago

I would do it regardless, if I can find a safe, not too much slower way of doing it, I agree that its worth pursuing, just not top priority right now.

nksaraf commented 10 months ago

Feel free to write up any ideas you have here, or questions about the current process.

katywings commented 10 months ago

I am wondering the following: what would happen if the files marked with "use server" (excluding "use server" functions), would be kept external during the build of the ssr/server-function chunks and if the transpilation of these files instead would happen during the nitro build? If that works, it would atleast dedupe the "use server" files, "use server" functions in components still would be duplicated of course 🤔.

I came to the idea at the top, when I observed the following: The ssr/server-function chunks currently seem to handle library code differently. When I import a "use server" file from a node_modules library, the ssr chunk just keeps the library external, the server-fns chunk on the other hand transpiles it. The final nitro build still includes the code twice: In node-server the server function is transpiled properly, in the "build/index" chunk it is the original code without createServerReference. These differing is probably another issue by itself, but maybe it would also be fixed via the changes of this issue here. Question is, if "use server" even is officially supported by Vinxi in node_modules library code..

nksaraf commented 10 months ago

it is meant to be, so if it doesn't probably a bug, probably need to optimizeDeps.exclude those dependencies that might have use server in them

nksaraf commented 10 months ago

But I think a solution along those lines might be valuable. If we can safely externalize stuff during the vite build.. it will only be included once in the rollup build. But I wonder what if the same file is loaded by two different systems and has to be traspiled differently, for example with server functions and functions called during SSR. we happen to have the same bundling for both , but its not necessarily the case. For eg. react RSC's would treat those calls differently.

katywings commented 10 months ago

I am not 100% familiar with RSC's, but isn't "use server" in RSCs always in Component/function scope? That would be an additional argument to only dedupe files where the "use server" is in module scope 🤔.

If thats the case, the solution would have to differentiate depending on scope:

katywings commented 10 months ago

Okay I came up with a thing, lets call it workaround. This does not solve the code duplication and only works in full "use server" files, but it allows me to use stable factories for handlers (maybe you remember my question about that in the Discord ;)).

Utility:

const handlers: Record<string, Function> = (globalThis as any).handlers = (globalThis as any).handlers || {};

let idx = 0;
export const cachedHandler = function<T extends Function>(creator: () => T): T {
  if (import.meta.env.DEV) {
    return creator();
  }

  idx += 1;
  if (!handlers[idx]) {
    handlers[idx] = creator();
  }

  return handlers[idx] as T;
}

Usage:

"use server"

import { cachedHandler } from "somewhere"

export const handler1 = cachedHandler(function() {
  const number = Math.random();
  console.log('Handler 1', number)
  return async () => number;
});

Result (valid): Bildschirmfoto vom 2023-12-29 12-03-26 Bildschirmfoto vom 2023-12-29 12-04-15


The reason why this works is, because the order of the handlers is stable, which is the case in "use server" file scoped. This does not work in "use server" function scoped, because those sadly are not registered on server startup but lazily on route request.

Result with "use server" function scope (invalid): Bildschirmfoto vom 2023-12-29 12-17-28

nksaraf commented 10 months ago

I think this is a nice hack. globalThis is actually your best bet for truly deduped instances of things. Because both HMR and different builds can be fixed with that.

katywings commented 10 months ago

Its an interesting POC indeed, but it is very dangerous... if the cachedHandler helper is called in a non "use server" file, all deduped instances become invalid, without any warning. I think the POC could be a "winner" if we can answer the following questions:

  1. Is there a runtime way in a "use server" file to know, that we currently are executing such a file? The cachedHandler function needs a check to make sure that it is used in "use server" file context. 1.a If there is no such way, can we implement it? (see 'Concept of the "use server" context check' below)
  2. Could a clean, official implementation of cachedHandler (maybe with a better name) become part of vinxi? (I am willing to contribute) 2.a Such a helper being part of vinxi would indicate, that deduping helpers are officially supported by vinxi and compatibility will not be broken 3 months from now 🙈.

I am asking these questions, because I have a huge migration from server$ to use server in my solid-start based CMS (https://nitropage.com) ahead of me. The project weights 23766 Lines of code. For the migration I count on the support of handler factories / deduped handlers. If cachedHandler breaks in 7 months because Vinxi decides to cut the stable order of "use server" files for some reason, that would be the killing blow for my CMS. Factories and deduped handlers are essential to the design of the public developer API of the CMS and I cannot completely redesign the public developer API after releasing v1.0 😉.


Concept of the "use server" context check:

Utility:

// At the beginning of "use server" files vinxi sets isUseServerFile to true
// At the end of "use server" files vinxi sets isUseServerFile to false
import { isUseServerFile } from "vinxi"
export const cachedHandler = function<T extends Function>(creator: () => T): T {
  if (!isUseServerFile()) {
    throw new Error('This helper can only be used in "use server" files!')
  }

  // deduping logic
}

Proper usage by some enduser:

"use server"
// Vinxi during transpilation magically adds code which sets isUseServerFile = true

export const getUser = cachedHandler(...)
// cachedHandler helper executed without error

// End of file:
// Vinxi magically sets isUseServerFile = false

Wrong usage - the enduser is warned about their wrong usage of cachedHandler:

// This is not a "use server" file, isUseServerFile = false

const customCode = function() {
  "use server"
  const handler = cachedHandler(...)
  // Error: This helper can only be used in "use server" files!
}
nksaraf commented 10 months ago

I got an idea of the solution, and its like a React hooks kind of solution. No problem.

But I think lets take a step back and go over the problem again with these handler factories. What is the issue with the code duplication when it comes to this factories?

Do the factories need to be singletons or deduped? i am not sure if they do

katywings commented 10 months ago

React hooks kind of solution

Exactly ^^

Do the factories need to be singletons or deduped? i am not sure if they do

Again, its the fact that the code currently gets duplicated. In the end its about the mental model and about state. If you don't care about variables in serverless environment and move everything to databases and globalState fine, but personally I dont need serverless. If I write code for the nodejs environment expect this code to exist once, if I write const randomNumber = Math.random() I expect that the server has one random number in memory, not two. I expect that if a factory creates a request handler, that it creates one request handler, not two 😅. Previously with server$ there was no duplication.

Personally I also prefer if we can find away to properly dedupe the code, instead of relying on "hooks".

nksaraf commented 10 months ago

While I agree that this is suboptimal, its for a very specific reason. Server functions and SSR are treated as two different environments. The code that is duplicated is in some ways accidental because of the way it gets used in both the server functions environment and ssr environment. And both of these might be different from the API environment and the client environment. That three of these environments are one shared node environment is a special case in a broad sense. It could be three node workers or one node worker, two cloudflare workers, etc. If you suddenly want to run your node server in multiple processes and scale that way, you don't want to have to re-architecture your app (data flow). Even if its just a node server and just the filesystem, it might be a good forcing function to get some persistence in your data.

For that mental model to be enabled, these environments are bundled separately. What is reasonable to try to achieve is that they are able to become one instance in the runtime, which can be done using globalThis (which you should anyway do to make HMR possible and not painful). I think assuming that a declared variable is going to be same reference in all environments could be a trap that causes bugs later in production.

nksaraf commented 10 months ago

But despite all this, I agree that if there is a clean solution to the deduping we should do it.

katywings commented 10 months ago

I think assuming that a declared variable is going to be same reference in all environments could be a trap that causes bugs later in production.

The reason why I am doing factory stuff is mostly performance actually 😅. As an example in my image optimization API I create hashes based on user options, the hashes are always the same on all cores, workers, etc. but I wouldn't want the server$ request handler to always recreate the same hash again and again, hashing is not exactly free... Here is an example of the current, fully implemented, working public image optimization API:

const [optimizeMedia] = useResize(
    server$(
      createResize({
        aspectRatios: ["16/9", "4/3", "1/1"],
        sizes: {
          xl: { width: 1520 },
          lg: { width: 800 },
          sm: { width: 400 },
        },
      }),
    ),
  );

server$(createResize(...)) creates an rpc endpoint that returns a secure hash for the provided optimization options. The central image optimization endpoint (middleware) only optimizes images if the options & hash are valid. This prevents invalid optimization requests (bruteforce attacks) and still gives the user of the API a ton of flexibility on a per-component level ;).


But to come back to this issue. I wanna present you an additional alternative: What is my primary goal? To store/cache information per RPC endpoint.

Going one step back: Each "use server" function in the background already has a stable id/name which is the same in both chunks - stability of the id/name combination is the backbone of vinxi's rpc mechanism. What if createServerReference would assign this identification to the event context like so:

export function createServerReference(fn, id, name) {
  return new Proxy(fn, {
    get(target, prop, receiver) {
      if (prop === "url") {
        return `/_server?id=${encodeURIComponent(id)}&name=${encodeURIComponent(name)}`;
      }
    },
    apply(target, thisArg, args) {
      const ogEvt = getRequestEvent();
      if (!ogEvt) throw new Error("Cannot call server function outside of a request");
      const evt = cloneEvent(ogEvt);
      // IN THIS LINE LIES THE MAGIC ⬇️
      evt.context.id = id + "#" + name;
      evt.serverOnly = true;
      return provideRequestEvent(evt, () => {
        return fn.apply(thisArg, args);
      });
    }
  });
}

I already tried this out (patched vinxi) and with this approach I can gain access to a stable "id" per "use server" endpoint. It can be used to store stuff in globalThis without having to worry about order or naming of global variables. The id stays consistent between the chunks (the whole rpc mechanism depends on this). This works in both use server components, files, etc...

With the rpc endpoint id in the event context, one can use something like this to store endpoint related stuff in globalThis:

import { getRequestEvent } from "solid-js/web";

type Ref<T> = { current: T|undefined }

const refs: Record<string, Ref<any>> = (globalThis as any).serverRefs = (globalThis as any).serverRefs || {};

export const useServerRef = function<T>(): Ref<T> {
  const event = getRequestEvent();
  if (!event?.context.id) {
    throw new Error('useServerRef is only supported in server handlers')
  }

  if (!refs[event.context.id]) {
    refs[event.context.id] = {} as any;
  }

  return refs[event.context.id];
}

This still wouldn't fix code duplication and a lot of code is still executed twice ;), but I think this little change in createServerReference would be an important step and opens up many possibilities for vinxi/solid-start users. What do you think? Btw: being able to access the endpoint id in events could also be useful for debugging :).

nksaraf commented 10 months ago

This doesn't even need to be vinxi level. This createServerReference is created by solid-start, vinxi is runtime independent. createServerReference can expose anything it wants. and you can also create the url system you want. Vinxi does come with a default createServerReference, but its meant to be overriden with your own specific one.

I agree the duplication and execution is not ideal.

nksaraf commented 10 months ago

This is the kind of solution I want hopinh to settle one. Something that the runtime can control

katywings commented 10 months ago

Okay :). I tried to write a proof-of-concept PR for solid-start, adding id and name to the request context and exporting a getRpcInfo helper function from @solid/start/server.

I don't know why, but using getRequestEvent in @solid/start/server breaks the vite dev server, it crashes with no useful error message. vite build on the other hand works fine, with a warning that makes zero sense to me 😂: "getRpcInfo" is imported from external module "h3" but never used in "node_modules/.pnpm/vinxi@0.0.61/node_modules/vinxi/runtime/server.js"

I created a WIP PR in solid-start anyways and would love your feedback -> https://github.com/solidjs/solid-start/pull/1203