Closed florian-lefebvre closed 7 months ago
Great work @florian-lefebvre! I’m very excited to finally be tackling this problem.
We already discussed a little bit how those features could look. Here are some ideas:
export default defineConfig({
experimental: {
env: {
variables: ["FOO", "BAR"]
}
}
})
import { FOO, BAR } from "astro:env"
// or import * as env from "astro:env"
variables list. We could allow users to choose how strict they want the check to be:
export default defineConfig({
experimental: {
env: {
variables: ["FOO", "BAR"],
checkLevel: "warn" // "error"
}
}
})
PUBLIC_ prefix
Astro.env.KEY but I think the issue is that it makes it hard to use env variables outside of .astro files and endpoints
process.env does not prevent using a secret dynamic variable on the client-side. Since we'll have an abstraction for this, maybe we could use a proxy and, depending on the key being accessed (i.e. whether it starts with the PUBLIC_ prefix), we could check if it's being accessed on the client-side (!import.meta.env.SSR) and return undefined
I appreciate these changes! Looking at the config, it seems odd to me that I can't have an "optional" check for variables. I agree Zod is overkill here, but primitive options per-variable, instead of a global "warn vs. error" property, would be nice.
what do you think about something like this?
env: {
variables: ({ string }) => ({
FOO: string(),
BAR: string({ optional: true })
})
}
This way, we can set optional variables + it prepares the ground for more advanced schema
Or if it was to be even simpler something along these lines could work?
export default defineConfig({
experimental: {
env: {
variables: {
FOO: true, // Or possibly `null` instead? Some other kind of truthy value
BAR: { optional: true, }
},
checkLevel: "warn" // "error"
}
}
});
could be! the advantage of having string({ optional: true }) is that in the future we could have stuff like this
env: {
variables: ({ string, number, boolean, enum }) => ({
FOO: string({ minLength: 5 }),
BAR: string({ optional: true }),
TRUE: boolean(),
ENV: enum(["dev", "prod"])
})
}
IMO if the plan is to offer that level of validation control then it may make more sense to just expose Zod like is already done in content collection configs?
I think it's better to avoid zod here for 2 reasons
z.number won't work, you need to use z.coerce.number
z.boolean is even trickier (see https://env.t3.gg/docs/recipes#booleans). By having our own format we can abstract those for the users and improve the DX
Maybe related to this RFC https://github.com/withastro/docs/discussions/5328
I've been giving it some thought, and I'm leaning towards not reinventing the wheel when it comes to validation logic. Considering the existing solutions available, and the familiarity that Astro users have with zod, I'd warmly recommend that we either skip validation altogether or opt to integrate zod into our implementation.
I'm not suggesting to not use zod at all, just not in the public interface. I've been working on it tonight for astro-env and as soon as I have something to show (probably this week), I'll post here
Well I got distracted by other contributions since then 😅 but here is the PR! Useful links are on the PR description to see how the API looks: https://github.com/florian-lefebvre/astro-env/pull/4
@florian-lefebvre have you thought about runtime environment variables much? This is something that's come up recently with @astrojs/db which needs to read runtime variables but can't easily do so while working in various non-Node environments.
Here's the start of an idea (maybe we can improve)
// Default export
import getEnv from "astro:env"
getEnv('FOO'); // always retrieved at runtime
adapter.js
'astro:config:done': ({ setAdapter }) => {
setAdapter({
name: '@matthewp/my-adapter',
serverEntrypoint: '@matthewp/my-adapter/server.js',
envEntrypoint: '@matthewp/my-adapter/env.js'
});
},
env.js
// Resolve a single variable at runtime (Deno exposes env through Deno.env.get)
export default function (key) {
  return Deno.env.get(key);
}
cc @alexanderniebuhr not sure if this works for Cloudflare or not.
@matthewp I suggested this and this is kinda what I have in mind! But apparently it has to be tied to the request because of cloudflare (so it would most likely be available as Astro.env.FOO)
So for Cloudflare there is no runtime without a request, so once you have the runtime you already have the request chain and then the env is accessible inside the request handler in Cloudflare. We do write the env to Astro.locals.runtime.env, but we can also write it to something else.
So as long as env.js has access to something which is writable inside the request handler exported by serverEntrypoint, e.g. along the lines of app.render, I don't see any blockers with Cloudflare. But once we have a POC, I'm more than happy to test it with Cloudflare
I'm not familiar enough with the adapters api to know what's possible tbh!
Ok, I figured this was probably a constraint. The only issue with the Astro.env idea is that it's only accessible by Astro components. I think we'll want to figure out a solution that works for any module, Astro, JS or otherwise.
The solution could be AsyncContext or AsyncLocalStorage - users would call getEnv() wherever and it will retrieve from the environment associated with the currently "active" request. It would still have to error at the top-level on cloudflare because there is no active request and no global environment.
AsyncLocalStorage is a WinterCG standard, but requires enabling node compat on cloudflare, which isn't straightforward from what I understand, and that may be a blocker. AsyncContext is a stage 2 tc39 proposal.
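The AsyncLocalStorage idea above can be sketched roughly as follows. This is only an illustration under assumptions: `runWithEnv` and this shape of `getEnv` are hypothetical names, not an existing Astro or adapter API. The adapter would wrap each request handler call, and any module could then read the env of the request that is currently active.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

type Env = Record<string, unknown>;

// One store per in-flight request; nothing is kept in a global variable.
const envStorage = new AsyncLocalStorage<Env>();

// Hypothetical adapter-side helper: run the request handler with its env attached.
function runWithEnv<T>(env: Env, handler: () => T): T {
  return envStorage.run(env, handler);
}

// Hypothetical user-facing helper, callable from any module.
function getEnv(key: string): unknown {
  const env = envStorage.getStore();
  if (env === undefined) {
    // Top-level call: there is no active request, so there is no env to read.
    throw new Error(`getEnv("${key}") was called outside of a request`);
  }
  return env[key];
}

// A handler that reads the env after an async hop.
async function handle(): Promise<unknown> {
  await new Promise((resolve) => setTimeout(resolve, 1));
  return getEnv("FOO");
}
```

Two overlapping calls such as `runWithEnv({ FOO: "a" }, handle)` and `runWithEnv({ FOO: "b" }, handle)` each resolve to their own value, while a bare top-level `getEnv("FOO")` throws, which matches the "error at the top-level" behaviour described above.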
It would still have to error at the top-level on cloudflare because there is no active request and no global environment.
Correct. I think it is important to understand that there is a "global" context, but it is not recommended to use it for env, and we need to think about env per request. However this is not a Cloudflare-only issue, other runtimes with an isolate approach have the same logic.
AsyncLocalStorage is a WinterCG standard, but requires enabling node compat on cloudflare, which isn't straightforward from what I understand, and that may be a blocker.
There are some nuances to this, so AsyncLocalStorage wouldn't work out-of-the-box with Cloudflare Pages, however it does work with the nodejs_compat flag _(which is different to the node_compat flag)_.
Additionally the import would have to look like import { AsyncLocalStorage } from 'node:async_hooks';
ref: https://developers.cloudflare.com/workers/runtime-apis/nodejs/asynclocalstorage/
Enabling nodejs_compat in theory doesn't have any downsides for the user _(node_compat has downsides)_, but users need to configure it manually to make sure their Astro project works. I don't know if that is something we want, looking at UX?
There are plans to use a wrangler.toml configuration file for Cloudflare Pages. Once Cloudflare ships that, we could add a base wrangler.toml with the correct flag settings when using astro add cloudflare and document it when users choose to install manually.
In addition to that, we would have to decide if we want to add a check and tell users they forgot the flag or not? ref: https://github.com/cloudflare/next-on-pages/blob/9900517223d34612d503b0489d3383c3b4453cd2/packages/next-on-pages/templates/_worker.js/index.ts#L34-L37
@alexanderniebuhr out of curiosity, how do you set and read environment variables with Cloudflare? Code wise.
So I'll try to answer this in as much detail as possible, but still as concisely as possible.
The whole message only talks about "runtime" environment variables; build/compile time environment variables are a completely different story.
You set environment variables using the Cloudflare Dashboard. For Workers you can also use wrangler.toml and .dev.vars files. Pages will get config file support in the future, but it doesn't have it yet.
Pages/Workers (I'll try to make sure to highlight any differences if needed) do have a concept of environment variables on a per-Page/Worker and per-request basis. These are not accessible automatically via the process.env API. It is possible to manually copy these values into process.env if you need to, and those will be globally persistent for all Workers running in the same isolate and context. Be aware, however, that this means env variables could leak inside the same isolate and context. In addition, any value set on process.env will be coerced into a string.
Cloudflare strongly recommends that you/frameworks do not replace the entire process.env object with the request env object. It will cause unexpected behaviour for other Workers running in the same isolate.
Cloudflare has some docs around this.
In general you can access the environment variables in the request handler. This looks a little different for Workers and Pages, but the concept is generally the same. Pages (Astro) uses the Module Worker Syntax.
export default {
async fetch(request, env, ctx) {
// env is the object that contains your environment variables and bindings.
},
};
We (Astro) currently overwrite the whole process.env completely (reference), which, as I elaborated above, is not the correct way. The next major version of the adapter, v10, will remove this. (However that means right now users should be able to access the environment variables via process.env as long as they only need a string value. This wouldn't work for any Cloudflare Bindings)
The other option for users to access the request-based environment variables is via Astro.locals.runtime.env (reference). However this has the downside that it doesn't work in .js or .ts files.
So that means there is no "global" way for an Astro user to access env variables. Because we handle the request handler as serverEntrypoint, we need to provide a way for users to access the env object, which is a parameter of the request handler, and we can decide what that should look like. (I don't know how other frameworks handle it, but we can explore that)
@florian-lefebvre is it a goal that you can use this package within your Astro config? Many users ask for the ability to use import.meta.env in their Astro config but they cannot. Not saying this proposal needs to solve this problem, just asking.
No this is out of scope, actually I didn't even know people wanted that!
No this is out of scope, actually I didn't even know people wanted that!
Yeah it's something that a lot of people started to do since JS config files started to be a thing: ability to tweak it based on environment variables.
I would personally discourage this kind of pattern
A very good piece of context on why it also makes sense for Cloudflare to have request-based env variables:
Alright so time to recap!
import { defineConfig, envField } from "astro/config"
export default defineConfig({
experimental: {
env: {
staticSchema: {
FOO: envField.string(),
BAR: envField.string({
optional: false
})
},
dynamicSchema: {
// Same
}
}
}
})
envField.string)
envField.boolean). Just keeping things simple for now
updateConfig. I know it was not planned to be tackled initially but I think it's an elegant way to tackle it with no effort
envField does not use zod because we want to only support a subset BUT we'll use zod under the hood
I'm still not familiar with this API but it will probably expose a dedicated endpoint to resolve dynamic env. We'll have to figure out how to make it work with cloudflare since env is tied to the request.
import { FOO, BAR } from "astro:env/static"
It uses import.meta.env under the hood so client access should be safe for sensitive data
---
Astro.env.FOO
Astro.env.BAR
---
This allows any adapter to work. However, it would still be nice to support a virtual import for usage outside of astro files. This requires AsyncLocalStorage so for example, it will require explicit opt-in for cloudflare through the compat flag:
import { getEnv } from "astro:env/dynamic"
I need your help on these precise topics:
How do we want to call the 2 types of env variables? I went for static/dynamic but I'm open to anything
How do we feel about having 2 distinct schemas for static/dynamic vars? What do you think about envField? Just to clarify, envField.string(options) returns { type: "string", ...options } under the hood
What do you think about having 2 APIs for dynamic variables? I don't see any other possibilities to remain cloudflare-friendly
I plan to implement stuff little by little as it will be easier to work on and to review:
staticSchema in the astro config, astro:env/static and validation in a vite plugin)
envField
dynamicSchema in the astro config, adapter API and Astro.env)
astro:env/dynamic using ALS
Since environment variables are always strings what happens if you do envField.number()? With Zod that would throw. What we really want in this case is to cast it, right? Assuming so, what happens if we make it cast, but it's NaN? In other words, is there special handling for all of these built-in types?
I don't really see the reason for the static vs. dynamic split in the config (or in the modules). What do we gain from that?
Since environment variables are always strings what happens if you do envField.number()? With Zod that would throw. What we really want in this case is to cast it, right? Assuming so, what happens if we make it cast, but it's NaN? In other words, is there special handling for all of these built-in types?
Under the hood, that would be z.coerce.number(). I think a case where such an abstraction shines is for booleans:
// under the hood
z
.string()
// only allow "true" or "false"
.refine((s) => s === "true" || s === "false")
// transform to boolean
.transform((s) => s === "true"),
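To make the casting question concrete, here is a dependency-free sketch of the kind of special handling discussed above. It is not the zod pipeline itself, and `coerceEnvValue` is a hypothetical name: numbers that cast to NaN are rejected, and booleans only accept the literal strings "true" and "false".

```typescript
type FieldType = "string" | "number" | "boolean";

type CoerceResult =
  | { ok: true; value: string | number | boolean }
  | { ok: false; error: string };

// Hypothetical helper: turn a raw env string into the declared type.
function coerceEnvValue(key: string, raw: string, type: FieldType): CoerceResult {
  switch (type) {
    case "string":
      return { ok: true, value: raw };
    case "number": {
      // Number("") is 0, so blank input is rejected explicitly before casting.
      const value = raw.trim() === "" ? NaN : Number(raw);
      return Number.isNaN(value)
        ? { ok: false, error: `${key}: "${raw}" is not a number` }
        : { ok: true, value };
    }
    case "boolean":
      // Only the literal strings "true" and "false" are accepted.
      if (raw === "true") return { ok: true, value: true };
      if (raw === "false") return { ok: true, value: false };
      return { ok: false, error: `${key}: expected "true" or "false", got "${raw}"` };
  }
}
```

Returning a result object instead of throwing is a design choice of this sketch: it would let a caller collect every invalid variable and report them together.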
I don't really see the reason for the static vs. dynamic split in the config (or in the modules). What do we gain from that?
Maybe the benefit is to make sure people won't use the dynamic way for static variables? I'm fine with using one schema tho
Since environment variables are always strings
@matthewp That's not 100% true for Cloudflare 👀
And it's really important that we don't have that limitation, because for a Cloudflare Binding the env has to have different types, even function or object. We should not coerce them!
I forgot about this little detail... So adapters should be able to override how zod works under the hood and possibly extend them? But that's not true for static variables right?
That is only relevant for runtime variables which you set via Cloudflare Dashboard or wrangler.toml.
I'm not sure about supporting functions and objects, etc. The goal of the RFC says:
Provide a fully type-safe experience for environment variables, without manual type definitions
Environment variables can't be functions or objects. They are always strings that can be cast to something else. So casting a JSON string to an object seems fine. But functions don't make sense to me.
Also, these things won't exist in dev mode, so I would expect them to throw unless marked as optional.
I would like to see the type that's returned by envField.string(). If there's a desire to extend this type then it needs to be well-defined.
This is just syntactic sugar for a discriminated union:
{
type: "string",
optional: false
// ...
}
I just like this API à la Keystatic
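For readers who want the "syntactic sugar for a discriminated union" point spelled out, here is a minimal TypeScript sketch. The `EnvField` type and these `envField` helpers are assumptions made for illustration and are not the final API.

```typescript
// Hypothetical field descriptors: a union discriminated by `type`.
type StringField = { type: "string"; optional?: boolean };
type NumberField = { type: "number"; optional?: boolean };
type BooleanField = { type: "boolean"; optional?: boolean };
type EnvField = StringField | NumberField | BooleanField;

type Options = { optional?: boolean };

// Each helper only tags the options object with its `type`.
const envField = {
  string: (options: Options = {}): StringField => ({ type: "string", ...options }),
  number: (options: Options = {}): NumberField => ({ type: "number", ...options }),
  boolean: (options: Options = {}): BooleanField => ({ type: "boolean", ...options }),
};

// Switching on `type` narrows the union to one variant per branch.
function describe(field: EnvField): string {
  switch (field.type) {
    case "string":
      return field.optional ? "optional string" : "required string";
    case "number":
      return field.optional ? "optional number" : "required number";
    case "boolean":
      return field.optional ? "optional boolean" : "required boolean";
  }
}

const schema = {
  FOO: envField.string(),
  BAR: envField.string({ optional: true }),
};
```

Because the helpers return plain objects, a schema written this way stays serializable, which matters if it has to be passed from the config to a runtime bundle.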
I also don't see the value of the distinction between static and dynamic; in fact, I think it would make things confusing. I think we should start really simple.
Regarding the consumption of those environment variables, I think we should provide only getEnv. I am not really sure about import { FOO } from "astro:env", I prefer to have the environment variables protected via a function.
I know that in pure ESM land doing this is illegal:
import { CONST } from "./module.js";
CONST = "some other value"
But astro:env is a virtual module that gets in-lined to pure source code, so I am afraid a user might be able to override an environment variable.
Awesome work! I quite like the direction for the validation.
I don't mind the separation between static and dynamic fields. That makes it clear what environment variables can be dynamically set in the deployment platform. If we combine it, maybe we could also support something like envField.static() and envField.dynamic()?
I also think that we should limit the env values to primitives like strings and numbers for now.
Also, do we want to tackle prevention of leaking sensitive environment variables? Unlike SvelteKit with its separate dimension of public/private, I think in Astro we can assume static = public and dynamic = private. And we make sure that dynamic env vars are not used in the client side.
Lastly, I'm a bit concerned with how dynamic env vars will be implemented. Would we have to do the validation at runtime instead? And does that bring in a lot of dependencies?
About separating static/dynamic, here is what I shared at the end of the cloudflare env thread:
Actually, reading this convo reminded me why I think we need 1 schema per env variables type (static/dynamic). In the case of a cloudflare binding, that's a runtime thing but people would be able to try to access it through astro:env/static, and that would fail. Maybe there's a more elegant way than having 2 schemas, like adding a property on each field
I think Bjorn's idea is pretty elegant: we could have something like this:
// or envField.dynamic
envField.static({
type: "string",
optional: false
})
// OR the opposite
envField.string({
context: "static", // "dynamic"
optional: false
})
I don't mind having getEnv for static usage as well. We just need to see if there's a single getEnv for both static and dynamic values.
Regarding validation of dynamic env, I think it will be done at runtime (when calling getEnv, maybe we have a way to add caching within the same request) using zod since it's already included in astro. It should not require any other library afaik
Also, do we want to tackle prevention of leaking sensitive environment variables? Unlike SvelteKit with its separate dimension of public/private, I think in Astro we can assume static = public and dynamic = private. And we make sure that dynamic env vars are not used in the client side.
I think static variables are already protected because they'll use import.meta.env under the hood. I don't know how we should do it for dynamic variables tho. I guess we could ship a different virtual module on the client that doesn't even import sensitive data, idk
Regarding validation of dynamic env, I think it will be done at runtime (when calling getEnv, maybe we have a way to add caching within the same request) using zod since it's already included in astro. It should not require any other library afaik
I'm not sure if zod is part of the bundle, IIRC it's only used during build-time, so if someone uses dynamic env vars, that adds around 50kb of code to the bundle. There's also the handling to serialize the dynamic schema from astro.config.js to the bundle runtime.
I don't know how we should do it for dynamic variables tho. I guess we could ship a different virtual module on the client that doesn't even import sensitive data, idk
If it's a virtual module, it should be simple to check so, example. And we could decide to stub/warn/error something.
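One way to picture the stub option, borrowing the proxy idea from earlier in the thread, is sketched below. `createEnvProxy` is a hypothetical name, and `isServer` stands in for import.meta.env.SSR so that the sketch runs outside of Vite. Note that a proxy only hides the value at access time: a real implementation would also have to keep private values out of the client bundle, which is what shipping a different module would achieve.

```typescript
// Hypothetical factory for the object a virtual env module could export.
function createEnvProxy(
  values: Record<string, string>,
  isServer: boolean,
): Record<string, string | undefined> {
  return new Proxy(values, {
    get(target, key) {
      // Symbol keys (used by some runtime internals) never map to a variable.
      if (typeof key !== "string") return undefined;
      // Keys without the PUBLIC_ prefix resolve to undefined on the client.
      if (!isServer && !key.startsWith("PUBLIC_")) return undefined;
      return target[key];
    },
  });
}

const raw = { PUBLIC_SITE: "https://example.com", API_SECRET: "s3cret" };
const serverEnv = createEnvProxy(raw, true);
const clientEnv = createEnvProxy(raw, false);
```

Here `serverEnv.API_SECRET` resolves to the stored value while `clientEnv.API_SECRET` is undefined, and `PUBLIC_SITE` is readable from both. Swapping the `return undefined` for a warning or a thrown error would give the warn and error variants.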
If zod is problematic, I think it's fine to do some validation ourselves. I mean, given that we are going to only support a tiny subset of zod's api, it will be easy (and lightweight) to implement
Of the APIs I've seen presented, I'd prefer an extension function like envField.static(...) and envField.dynamic(...). These match the type conventions for Astro DB columns.
That said, I'm not very comfortable with the "static" vs. "dynamic" naming. I haven't seen this convention in other frameworks, and it isn't immediately clear as a user what these names mean. I also noticed Bjorn mention this above:
Unlike SvelteKit with its separate dimension of public/private, I think in Astro we can assume static = public and dynamic = private.
If this is the case, would it make sense to use a more standard convention like envField.variable(...) and envField.secret(...), or envField.public(...) and envField.private(...)? Having used platforms like Vercel, GitHub, and Cloudflare, I understand the difference between variables and secrets as a user. The encoding mechanism (static vs. dynamic) becomes an implementation detail users don't have to think about.
I don't think saying "static = public and dynamic = private" is true. If I understand correctly, we currently have:
| | Build time | Runtime |
|---|---|---|
| Public | import.meta.env.PUBLIC_FOO | process.env.* |
| Private | import.meta.env.FOO | N/A |
I think public/private could still be defined by the PUBLIC_ prefix, but having a way to distinguish build time / runtime variables is really important imo. I think that's the nice part about sveltekit, they have clear imports for each of these 4 cases (see docs).
(I think you have runtime-public and runtime-private flipped)
I don't think supporting runtime-public makes a lot of sense. For buildtime-private, yeah I was thinking that maybe we don't support it and force users to runtime-private instead. If it was sensitive, maybe it shouldn't be inlined in the first place. And that users will always have to set the private env var on their deployment platform.
If we still want to support it, I also don't mind, just that I would prefer a chain API like:
env.static().optional() // implicitly private by default?
env.static().public().default("")
env.dynamic().private().optional()
env.dynamic().public().optional()
If the API is like envField.static({ type: "string", optional: false }), we could directly accept an object instead? { type: "string", scope: "static", optional: false }
I quite like this API tbh! I don't especially mind supporting or not public dynamic variables either
I see a few issues with chaining tho. What happens if we have this case?
schema: {
FOO: env.static().public()
}
Since it doesn't have the PUBLIC_ prefix, should we error and explain it's required? What about this case if we don't support public dynamic vars?
schema: {
PUBLIC_FOO: env.dynamic()
}
Should we error saying it can't have this prefix?
Yeah I think we should error in those two cases and treat it as a config validation error. We currently have zod validate the Astro config, and we could do a nested validation within it too.
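The two error cases above could be caught by a check along these lines. This is a sketch under assumptions: the `FieldConfig` shape and `validatePrefixes` are hypothetical, and a dynamic variable is modelled as private since public dynamic variables may not be supported.

```typescript
type FieldConfig = {
  scope: "static" | "dynamic";
  access: "public" | "private";
};

const PUBLIC_PREFIX = "PUBLIC_";

// Hypothetical check: returns one message per key whose name disagrees with its config.
function validatePrefixes(schema: Record<string, FieldConfig>): string[] {
  const errors: string[] = [];
  for (const [key, field] of Object.entries(schema)) {
    const hasPrefix = key.startsWith(PUBLIC_PREFIX);
    if (field.access === "public" && !hasPrefix) {
      errors.push(`${key}: public variables must start with ${PUBLIC_PREFIX}`);
    }
    if (field.access === "private" && hasPrefix) {
      errors.push(`${key}: private variables cannot start with ${PUBLIC_PREFIX}`);
    }
  }
  return errors;
}

// The two cases from the thread, plus one valid entry.
const errors = validatePrefixes({
  FOO: { scope: "static", access: "public" },
  PUBLIC_FOO: { scope: "dynamic", access: "private" },
  PUBLIC_BAR: { scope: "static", access: "public" },
});
```

Collecting messages rather than throwing on the first mismatch would let the config validation report every offending key in a single pass.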
What is the definition of static and dynamic? What do these terms mean (in the context of environment variables)?
Static means available at build time through import.meta.env. Dynamic means available at runtime and depends on the adapter, eg process.env or Deno.env.get()
I also struggled with those terms at the beginning, maybe we should refer to them as build/compile-time vs runtime variables?
Yeah it's a bit longer but way clearer
Good suggestion @alexanderniebuhr, I prefer this naming convention!
So to recap since last time I asked for feedback
How do we want to call the 2 types of env variables? I went for static/dynamic but I'm open to anything
Let's go for buildTime / runtime
How do we feel about having 2 distinct schemas for static/dynamic vars? What do you think about envField? Just to clarify, envField.string(options) returns { type: "string", ...options } under the hood
Instead of having 2 schemas, it will be part of envField as suggested by Bjorn
What do you think about having 2 APIs for dynamic variables? I don't see any other possibilities to remain cloudflare-friendly
We are going to use ALS + virtual import only.
More questions now!
astro:env/static. I still think it improves the DX (it's also done by sveltekit) but I'd like your opinions. If we don't do this, what should we do? I think we need 2 distinct imports anyway at least?
PUBLIC_ prefix)?
Summary
This RFC aims to improve DX (and eventually security) around env variables.
Background & Motivation
Env variables are an important part of any web application. You need to store sensitive data (think API secrets, tokens etc) without leaking it inside your git repo. But that's only the 1st part of the story. It's easy to leak this data by importing it in the wrong place, eg. the frontend, like Resend a few weeks ago.
Other JS frameworks (eg. SvelteKit) are handling env pretty well. From my understanding, the env story is currently a bit tricky in Astro. According to the docs, here is how env variables are currently handled:
import.meta.env
import.meta.env includes some default variables like SSR, BASE_URL...
PUBLIC_
import.meta.env on the client side will be undefined (value will be accessible server side)
.env (or .env.production, .env.development) and CLI
process.env, or following the used runtime (eg. Deno.env.get() for the deno adapter)
process.env has any protection against client-side usage (like import.meta.env), but I guess it doesn't
Goals
Non-Goals