withastro / roadmap

Ideas, suggestions, and formal RFC proposals for the Astro project.
290 stars 29 forks source link

Improved env handling #837

Closed florian-lefebvre closed 4 months ago

florian-lefebvre commented 6 months ago

Summary

This RFC aims to improve DX (and eventually security) around env variables.

Background & Motivation

Env variables are an important part of any web application. You need to store sensitive data (think API secrets, tokens etc) without being leaked inside your git repo. But that's only the 1st part of the story. It's easy to leak this data by importing in the wrong place, eg. the frontend like Resend a few weeks ago.

Other JS frameworks (eg. SvelteKit) are handling env pretty well. From my understanding, the env story is currently a bit tricky in Astro. According to the docs, here is how env variables are currently handled:

Goals

Non-Goals

natemoo-re commented 6 months ago

Great work @florian-lefebvre! I’m very excited to finally be tackling this problem.

florian-lefebvre commented 6 months ago

We discussed a little bit already about about those features could look. Here are some ideas:

  1. Static variables can be made available though a virtual module and defined in the Astro config. Type safety is not an issue using codegen (depends on https://github.com/withastro/astro/pull/9952).
    export default defineConfig({
    experimental: {
        env: {
            variables: ["FOO", "BAR"]
        }
    }
    })
    import { FOO, BAR } from "astro:env"
    // or import * as env from "astro:env"
  2. Static variables validation can be easily achieved from the variables list. We could allow users to choose how strict they want the check to be:
    export default defineConfig({
    experimental: {
        env: {
            variables: ["FOO", "BAR"],
            checkLevel: "warn" // "error"
        }
    }
    })
  3. Public static variables would still be using the PUBLIC_ prefix
  4. Dynamic variables would be accessible through a virtual module as well. I know @lilnasy suggested using something like Astro.env.KEY but I think the issue is that it makes it hard to use env variables outside of .astro files and endpoints
  5. Dynamic variables "security" could be improved. Currently, using process.env does not prevent from using a secret dynamic variable on the client-side. Since we'll have an abstraction for this, maybe we could use a proxy and depending on the key being accessed (ie. starting with the PUBLIC_ prefix), we could check if it's being access on the client-side (!import.meta.env.SSR) and return undefined
  6. I'm not used to the adapter API so I don't have much ideas about how it should look
bholmesdev commented 6 months ago

I appreciate these changes! Looking at the config, it seems odd to me that I can't have an "optional" check for variables. I agree Zod is overkill here, but primitive options per-variable, instead of a global "warn vs. error" property, would be nice.

florian-lefebvre commented 6 months ago

what do you think about something like this?

env: {
    variables: ({ string }) => ({
        FOO: string(),
        BAR: string({ optional: true })
    })
}

This way, we can set optional variables + it prepares the ground for more advanced schema

NuroDev commented 6 months ago

Or if it was to be even simpler something along these lines could work?

export default defineConfig({
    experimental: {
        env: {
            variables: {
                FOO: true, // Or possibly `null` instead? Some other kind of truthy value
                BAR: { optional: true, }
            },
            checkLevel: "warn" // "error"
        }
    }
});
florian-lefebvre commented 6 months ago

could be! the advantage of having string({ optional: true }) is that in the future we could have stuff like this

env: {
    variables: ({ string, number, boolean, enum }) => ({
        FOO: string({ minLength: 5 }),
        BAR: string({ optional: true }),
        TRUE: boolean(),
        ENV: enum(["dev", "prod"])
    })
}
NuroDev commented 6 months ago

IMO if the plan is to offer that level of validation control then it may make more sense to just expose Zod like is already done in content collection configs?

florian-lefebvre commented 6 months ago

I think it's better to avoid zod here for 2 reasons

  1. It's hard to serialize and deserialize (we'll need to pass stuff to a virtual import, I already tried and it requires at least 2 3rd party libs + some hacks; or a distinct entrypoint but it feels like it's a bit too much). It's better if we own format so that we can generate a schema ourselves
  2. zod contains a lot of stuff not relevant to env (eg. functions, objects, void etc) and not everything will work as expected. For example:
    • z.number won't work, you need to use z.coerce.number
    • z.boolean is even trickier (see https://env.t3.gg/docs/recipes#booleans). By having our own format we can abstract those for the users and improve the DX
florian-lefebvre commented 6 months ago

Maybe related to this RFC https://github.com/withastro/docs/discussions/5328

alexanderniebuhr commented 6 months ago

I've been giving it some thought, and I'm leaning towards not reinventing the wheel when it comes to validation logic. Considering the existing solutions available, and the familiarity that Astro users have with zod, I'd warmly recommend that we either skip validation altogether or opt to integrate zod into our implementation.

florian-lefebvre commented 6 months ago

I'm not suggesting to not use zod at all, just not in the public interface. I've been working on it tonight for astro-env and as soon as I have something to show (probably this week), I'll post here

florian-lefebvre commented 5 months ago

Well I got distracted by other contributions since then 😅 but here is the PR! Useful links are on the PR description to see how the API looks: https://github.com/florian-lefebvre/astro-env/pull/4

matthewp commented 5 months ago

@florian-lefebvre have you thought about runtime environment variables much? this is something that's come up recently with @astrojs/db which needs to read runtime variables but can't easily do so while working in various non-Node environments.

Here's the start of an idea (maybe we can improve)

// Default export
import getEnv from "astro:env"

getEnv('FOO'); // always retrieved at runtime

adapter.js

'astro:config:done': ({ setAdapter }) => {
  setAdapter({
    name: '@matthewp/my-adapter',
    serverEntrypoint: '@matthewp/my-adapter/server.js',
    envEntrypoint: '@matthewp/my-adapter/env.js'
  });
},

env.js

export default function(key) {{
  return Deno.getEnv(key);
}

cc @alexanderniebuhr not sure if this works for Cloudflare or not.

florian-lefebvre commented 5 months ago

@matthewp I suggested this and this is kinda what I have in mind! But apparently it has to be tied to the request because of cloudflare (so it would most likely be available as Astro.env.FOO)

alexanderniebuhr commented 5 months ago

So for Cloudflare there is no runtime without a request, so once you have the runtime you already have the request chain and then the env is accessible inside the request handler in Cloudflare. We do write the env to Astro.locals.runtime.env, but we can also write to to something else.

So as long as env.js has access to something, which is writable inside the request handler exported by serverEntrypoint, e.g. the lines of app.render. I don't see any blocks with Cloudflare.. But once we have a POC, I'm more than happy to test it with Cloudflare

florian-lefebvre commented 5 months ago

I'm not familiar enough with the adapters api to know what's possible tbh!

matthewp commented 5 months ago

Ok, I figured this was probably a constraint. The only issue with the Astro.env idea is that's only accessible by Astro components. I think we'll want to figure out a solution that works for any module, Astro, JS or otherwise.

lilnasy commented 5 months ago

The solution could be AsyncContext or AsyncLocalStorage - users would call getEnv() wherever and it will retrieve from the environment associated with currently "active" request. It would still have to error at the top-level on cloudflare because there is active request and no global environment.

AsyncLocalStorage is a WinterCG standard, but requires enabling node compat on cloudflare, which isn't straightforward from what I understand, and that may be a blocker. AsyncContext is a stage 2 tc39 proposal.

alexanderniebuhr commented 5 months ago

It would still have to error at the top-level on cloudflare because there is active request and no global environment.

Correct I think that is important to understand that there is a "global" context, but it is not recommended to use it for env, and we need to think about env per request.. However this is not a Cloudflare only issue, other runtimes with a isolate approach have the same logic.

AsyncLocalStorage is a WinterCG standard, but requires enabling node compat on cloudflare, which isn't straightforward from what I understand, and that may be a blocker.

There is some nuances to this, so AsyncLocalStorage wouldn't work out-of-the-box with Cloudflare Pages, however it does work with the nodejs_compat flag _(which is different to the node_compat flag)_. Additionally the import would have to look like import { AsyncLocalStorage } from 'node:async_hooks'; ref: https://developers.cloudflare.com/workers/runtime-apis/nodejs/asynclocalstorage/

Enabling nodejs_compat in theory doesn't have any downsides for the user _(node_compat has downsides)_, users need to configure it manually to make sure their Astro project works though. I don't know if that is something we want, looking at UX? There are plans to use wrangler.toml configuration file for Cloudflare Pages, once Cloudflare ships that, we could add a base wrangler.toml with the correct flag settings when using astro add cloudflare and document it when users choose to install manually.

In addition to that, we would have to decide if we want to add a check and tell users they forgot the flag or not? ref: https://github.com/cloudflare/next-on-pages/blob/9900517223d34612d503b0489d3383c3b4453cd2/packages/next-on-pages/templates/_worker.js/index.ts#L34-L37

ematipico commented 5 months ago

@alexanderniebuhr out of curiosity, how do you set and read environment variables with Cloudflare? Code wise.

alexanderniebuhr commented 5 months ago

So I'll try to answer this as detailed as possible, but still as consise as possible.

The whole message only talks about "runtime" environment variables, build/compile time environment variables, are a completly different story.

You do set environment variables using the Cloudflare Dashboard, for Workers you can also use wrangler.toml and .dev.vars files, Pages will get config file support in the future, but it doesn't have it yet.

Pages/Workers (I'll try to make sure to highlight any differences if needed) do have a concept of environment variables on a per-Page/Worker and per-request basis. These are not accessible automatically via the process.env API. It is possible to manually copy these values into process.env if you need to, and those will be globally persistent for all Workers running in the same isolate and context. Be aware, however, that means env variables could leak inside the same isolate and context. In addition any value on process.env will coerce that value into a string.

Cloudflare recommends strongly that you/framworks do not replace the entire process.env object with the request env object. It will cause unexpected bahviour for other Workers running in the same isolate.

Cloudflare has some docs around this.


In general you can access the environment variables in the request handler. This looks a little different for Workers and Pages, but the concept is general the same. Pages (Astro) uses the Module Worker Syntax.

export default {
  async fetch(request, env, ctx) {
    // env is the object that contains your environment variables and bindings.
  },
};

We (Astro) currently overwrite the whole process.env completly (reference), which I elaborated on above, is not the correct way. The next major version of the Adapter v10, will remove this. (However that means right now users should be able to access the environment variables via process.env as long as they only need a string value. This wouldn't work for any Cloudflare Bindings)

The other options for Users to access the request based environment variables is via Astro.locals.runtime.env (reference). However this has the downside that it doesn't work in .js or .ts files.


So that means there is no "global" way for a Astro User to access env variables, because we handle the request handler as serverEntrypoint and need to provide a way for users to access the env object which is a parameter of the request handler, we can decide how that should look like. (I don't know how other frameworks handle it, but we can explore that)

matthewp commented 5 months ago

@florian-lefebvre is it a goal that you can use this package within your Astro config? Many users ask for the ability to use import.meta.env in their Astro config but they cannot. Not saying this proposal needs to solve this problem, just asking.

florian-lefebvre commented 5 months ago

No this is out of scope, actually I didn't even know people wanted that!

ematipico commented 5 months ago

No this is out of scope, actually I didn't even know people wanted that!

Yeah it's something that a lot of people started to do since JS config files started to be a thing: ability to tweak it based on environment variables.

I would personally discourage this kind of pattern

alexanderniebuhr commented 5 months ago

I very good piece of context, why it also makes sense for Cloudflare to have request based env variables:

https://blog.cloudflare.com/workers-environment-live-object-bindings#why-is-env-a-parameter-to-fetch-not-global

florian-lefebvre commented 4 months ago

Alright so time to recap!

Recap and API proposal based on feedback

Astro config

import { defineConfig, envField } from "astro/config"

export default defineConfig({
    experimental: {
        env: {
            staticSchema: {
                FOO: envField.string(),
                BAR: envField.string({
                    optional: false
                })
            },
            dynamicSchema: {
                // Same
            }
        }
    }
})

Adapter API

I'm still not familiar with this API but it will probably expose a dedicated endpoint to resolve dynamic env. We'll have to figure out how to make it work with cloudflare since env is tied to the request.

Usage of static env

import { FOO, BAR } from "astro:env/static"

It uses import.meta.env under the hood so client access should be safe for sensitive data

Usage of dynamic env

---
Astro.env.FOO
Astro.env.BAR
---

This allows any adapter to work. However, it would still be nice to support a virtual import for usage outside of astro files. This requires AsyncLocalStorage so for example, it will require explicit opt-in for cloudflare through the compat flag:

import { getEnv } from "astro:env/dynamic"

Your feedback is needed

I need your help on this precise topics:

  1. How do we want to call the 2 types of env variables? I went for static/dynamic but I'm open to anything
  2. How do we feel about having 2 distinct schemas for static/dynamic vars?
  3. What do you think about envField? Just to clarify, envField.string(options) returns { type: "string", ...options } under the hood
  4. What do you think about having 2 APIs for dynamic variables? I don't see any other possibilities to remain cloudflare-friendly

The plan

I plan to implement stuff little by little as it will be easier to work on and to review:

  1. Static env (staticSchema in the astro config, astro:env/static and validation in a vite plugin)
  2. envField
  3. Dynamic env (dynamicSchema in the astro config, adapter API and Astro.env)
  4. Dynamic env virtual module astro:env/dynamic using ALS
matthewp commented 4 months ago

On types

Since environment variables are always strings what happens if you do envField.number()? With Zod that would throw. What we really want in this case is to cast it, right? Assuming so, what happens if we make it cast, but it's NaN? In other words, is there special handling for all of these built-in types?

static vs dynamic

I don't really see the reason for the static vs. dynamic split in the config (or in the modules). What do we gain from that?

florian-lefebvre commented 4 months ago

Since environment variables are always strings what happens if you do envField.number()? With Zod that would throw. What we really want in this case is to cast it, right? Assuming so, what happens if we make it cast, but it's NaN? In other words, is there special handling for all of these built-in types?

Under the hood, that would be z.coerce.number(). I think a case where such an abstraction shines is for booleans:

// under the hood
z
      .string()
      // only allow "true" or "false"
      .refine((s) => s === "true" || s === "false")
      // transform to boolean
      .transform((s) => s === "true"),

I don't really see the reason for the static vs. dynamic split in the config (or in the modules). What do we gain from that?

Maybe the benefit is to make sure people won't use the dynamic way for static variables? I'm fine with using one schema tho

alexanderniebuhr commented 4 months ago

Since environment variables are always strings

@matthewp That's not 100% true for Cloudflare 👀

And it's really important that we don't have that limitation, because for a Cloudflare Binding the env has to have different types, even function or object. We should not coerce them!

florian-lefebvre commented 4 months ago

I forgot about this little detail... So adapters should be able to override how zod works under the hood and possibly extend them? But that's not true for static variables right?

alexanderniebuhr commented 4 months ago

That is only relevant for runtime variables which you set via Cloudflare Dashboard or wrangler.toml.

matthewp commented 4 months ago

I'm not sure about supporting functions and objects, etc. The goal of the RFC says:

Provide a fully type-safe experience for environment variables, without manual type definitions

Environment variables can't be functions or objects. They are always strings that can be cast to so something else. So casting a JSON string to an object seems fine. But functions doesn't make sense to me.

Also, these things won't exist in dev mode, so I would expect them to throw unless marked as optional.

matthewp commented 4 months ago

I would like to see the type that's returned by envField.string(). If there's a desire to extend this type then it needs to be well-defined.

florian-lefebvre commented 4 months ago

This is just syntactic sugar for a discriminated union:

{
    type: "string",
    optional: false
    // ...
}

I just like this API à la Keystatic

ematipico commented 4 months ago

I also don't see the value of the distinction between static and dynamic; in fact, I think it would make things confusing. I think we should start really simple.

Regarding the consumption of those environment variables, I think we should provide only getEnv. I am not really sure about import { FOO } from "astro:env", I prefer to have the environment variables protected via a function.

I know that in pure ESM land doing this is illegal:

import { CONST } from "./module.js";

CONST = "some other value"

But astro:env is a virtual module that gets in-lined to pure source code, so I am afraid a user might be able to override an environment variable.

bluwy commented 4 months ago

Awesome work! I quite like the direction for the validation.

I don't mind the separation between static and dynamic fields. That makes it clear what environment variables can be dynamically set in the deployment platform. If we combine it, maybe we could also support something like envField.static() and envField.dynamic()?

I also think that we should limit the env values to primitives like strings and numbers for now.

Also, do we want to tackle prevention of leaking sensitive environment variables? Unlike SvelteKit with its separate dimension of public/private, I think in Astro we can assume static = public and dynamic = private. And we make sure that dynamic env vars are not used in the client side.

Lastly, I'm a bit concerned with how dynamic env var will be implemented. Would we have to do the validation in runtime instead? And does that brings in a lot of dependency?

florian-lefebvre commented 4 months ago

About separating static/dynamic, here is what I shared at the end of the cloudflare env thread:

Actually, reading this convo reminded me why I think we need 1 schema per env variables type (static/dynamic). In the case of a cloudflare binding, that's a runtime thing but people would be able to try to access it through astro:env/static, and that would fail. Maybe there's a more elegant way than having 2 schemas, like adding a property on each field

I think Bjorn's idea is pretty elegant: we could have something like this:

// or envField.dynamic
envField.static({
    type: "string",
    optional: false
})
// OR the opposite
envField.string({
    context: "static", // "dynamic"
    optional: false
})

I don't mind having getEnv for static usage as well. We just need to see if there's a single getEnv for both static and dynamic values.

Regarding validation of dynamic env, I think it will be done at runtime (when calling getEnv, maybe we have a way to add caching within the same request) using zod since it's already included in astro. It should not require any other library afaik

Also, do we want to tackle prevention of leaking sensitive environment variables? Unlike SvelteKit with its separate dimension of public/private, I think in Astro we can assume static = public and dynamic = private. And we make sure that dynamic env vars are not used in the client side.

I think static variables are already protected because they'll use import.meta.env under the hood. I don't know how we should do it for dynamic variables tho. I guess we could ship a different virtual module on the client that doesn't even import sensitive data, idk

bluwy commented 4 months ago

Regarding validation of dynamic env, I think it will be done at runtime (when calling getEnv, maybe we have a way to add caching within the same request) using zod since it's already included in astro. It should not require any other library afaik

I'm not sure if zod is part of the bundle, IIRC it's only used during build-time, so if someone use dynamic env vars, that includes around ~50kb of code to the bundle. There's also the handling to serialize the dynamic schema from astro.config.js to the bundle runtime.

I don't know how we should do it for dynamic variables tho. I guess we could ship a different virtual module on the client that doesn't even import sensitive data, idk

If it's a virtual module, it should be simple to check so, example. And we could decide to stub/warn/error something.

florian-lefebvre commented 4 months ago

If zod is problematic, I think it's fine to do some validation ourselves. I mean, given that we are going to only support a tiny subset of zod's api, it will be easy (and lightweight) to implement

bholmesdev commented 4 months ago

Of the APIs I've seen presented, I'd prefer an extension function like envField.static(...) and envField.dynamic(...). These match the type conventions for Astro DB columns.

That said, I'm not very comfortable with the "static" vs. "dynamic" naming. I haven't seen this convention in other frameworks, and it isn't immediately clear as a user what these names mean. I also noticed Bjorn mention this above:

Unlike SvelteKit with its separate dimension of public/private, I think in Astro we can assume static = public and dynamic = private.

If this is the case, would it make sense to use a more standard convention like envField.variable(...) and envField.secret(...), or envField.public(...) and envField.private(...)? Having used platforms like Vercel, GitHub, and Cloudflare, I understand the difference between variables and secrets as a user. The encoding mechanism (static vs. dynamic) becomes an implementation detail users don't have to think about.

florian-lefebvre commented 4 months ago

I don't think saying "static = public and dynamic = private" is true. If I understand correctly, we currently have:

Build time Runtime
Public import.meta.env.PUBLIC_FOO process.env.*
Private import.meta.env.FOO N/A

I think public/private could still be kept defined by the PUBLIC_ prefix, but having a way to distinct build time / runtime variables is really important imo. I think that's the nice part about sveltekit, they have clear imports for each of these 4 cases (see docs).

bluwy commented 4 months ago

(I think you have runtime-public and runtime-private flipped)

I don't think supporting runtime-public makes a lot of sense. For buildtime-private, yeah I was thinking that maybe we don't support it and force users to runtime-private instead. If it was sensitive, maybe it shouldn't be inlined in the first place. And that users will always have to set the private env var on their deployment platform.

If we still want to support it, I also don't mind, just that I would prefer a chain API like:

env.static().optional() // implicitly private by default?
env.static().public().default("")
env.dynamic().private().optional()
env.dynamic().public().optional()

If the API is like envField.static({ type: "string", optional: false }), we could directly accept an object instead? { type: "string", scope: "static", optional: false }

florian-lefebvre commented 4 months ago

I quite like this API tbh! I don't especially mind supporting or not public dynamic variables either

I see a few issues with chaining tho. What happens if we have this case?

schema: {
    FOO: env.static().public()
}

Since it doesn't have the PUBLIC_ prefix, should we error and explain it's required? What about this case if we don't support public dynamic vars?

schema: {
    PUBLIC_FOO: env.dynamic()
}

Should we error saying it can't have this prefix?

bluwy commented 4 months ago

Yeah I think we should error in those two cases and treat it as a config validation error. We currently have zod validate the Astro config, and we could do a nested validation within it too.

matthewp commented 4 months ago

What is the definition of static and dynamic? What do these terms mean (in the context of environment variables)?

florian-lefebvre commented 4 months ago

Static means available at build time through import.meta.env. Dynamic means available at runtime and depends on the adapter, eg process.env or Deno.env.get()

alexanderniebuhr commented 4 months ago

I also struggled with those terms at the beginning, maybe we should refer to them as build/compile-time vs runtime variables?

florian-lefebvre commented 4 months ago

Yeah it's a bit longer but way clearer

ematipico commented 4 months ago

Good suggestion @alexanderniebuhr, I prefer this naming convention!

florian-lefebvre commented 4 months ago

So to recap since last time I asked for feedback

How do we want to call the 2 types of env variables? I went for static/dynamic but I'm open to anything

Let's go for buildTime / runtime

How do we feel about having 2 distinct schemas for static/dynamic vars? What do you think about envField? Just to clarify, envField.string(options) returns { type: "string", ...options } under the hood

Instead of having 2 schemas, it will be part of envField as suggested by Bjorn

What do you think about having 2 APIs for dynamic variables? I don't see any other possibilities to remain cloudflare-friendly

We are going to use ALS + virtual import only.


More questions now!

  1. Ema expressed concerns about overriding constants exported from astro:env/static. I still think it improves the DX (it's also done by sveltekit) but I'd like your opinions. If we don't do this, what should we do? I think we need 2 distinct imports anyway at least?
  2. Do we want to support public runtime variables? Or are we fine with only keeping them private (ie. always undefined on the client even with the PUBLIC_ prefix)?