Open colinhacks opened 2 months ago
the mutable global config seems potentially dangerous
maybe im missing somehting but what is stopping some third party library who also depends on zod from calling z.configure
that and setting inputInclude to always in production
the mutable global config seems potentially dangerous maybe im missing somehting but what is stopping some third party library who also depends on zod from calling
z.configure
that and setting inputInclude to always in production
It's a reasonable concern, but this is just a fact of life with npm. The same malicious library could override the parse
method on the ZodType
prototype and send all the data to a remote server π€·ββοΈ The answer is always "vet your dependencies". Setting some global config is the least of the potential pitfalls here.
The only other configuration mechanism that comes to mind is some kind of "Zod factory" pattern:
import { Zod } from "zod";
const z = Zod.create({ /* ... */ })
Which I don't really like. For starters it's a breaking change for 100% of users. But more specifically, the z
variable has also always been a module, and it includes type utilities like z.infer
. There's no way to add z.infer
(a type alias) on a runtime object that's created with Zod.create()
(or equivalent). This would confuse a lot of people, I'd wager, and trying to explain this subtlety to casual TypeScript users doesn't sound fun.
I like this.
I agree with the mutable config concern - I think I'd prefer that z.configure()
being a factory that returns a new "root instance". Otherwise, depending if the file containing it is imported or not, you get different behavior
Hey π
Love this! π
How are arrays handled? should all items be included, or just the bad items (+index)? Maybe for objects, we could define how this behaves per key (and subkey)? use case is, include inputs for everything but PII/secrets in production..
I see how its not really possible to have a better solution given zod's architecture
I don't think saying just "vet your dependencies" is unreasonable
this is a good change π
How are arrays handled? should all items be included, or just the bad items (+index)?
If the issue originates from a ZodArray
schema, it'll contain the whole array. If there's an issue with a particular element, then the index will be specified as the last element in path
and input
will point to the offending element. This logic already exists in the ZodArray parser.
Maybe for objects, we could define how this behaves per key (and subkey)? use case is, include inputs for everything but PII/secrets in production..
I hadn't considered this, but for now I'll just say its out of scope for this RFC. Something like this will probably be achievable with Zod 4's updated error map API (haven't tweeted this one out yet but it's public π) by directly mutating the issue inside the error map.
const inputDeleter: z.ZodErrorMap = (iss)=>{ delete iss.input };
z.object({
public: z.string(),
secret: z.string({ error: inputDeleter })
})
If the issue originates from a ZodArray schema, it'll contain the whole array. If there's an issue with a particular element, then the index will be specified as the last element in path and input will point to the offending element. This logic already exists in the ZodArray parser.
Could we control it somehow (error map API?) not to contain the whole array? imagining big arrays being annoying in error logs/traces.
I hadn't considered this, but for now I'll just say its out of scope for this RFC. Something like this will probably be achievable with Zod 4's updated https://github.com/colinhacks/zod/pull/3618 API (haven't tweeted this one out yet but it's public π) by directly mutating the issue inside the error map.
Nice! wondering if a shortcut like .secret()
will make sense if this pattern is used a lot.
Just my 2 cents:
Zod is widely adopted by many projects and companies, often chosen for legal reasons and privacy concerns. If the default behavior changes to include input data in errors, simply because someone forgot to explicitly set NODE_ENV to production, it could have a significant impact on many users.
Errors get logged, and logs are often stored for extended periods. In the worst-case scenario, sensitive information could inadvertently be exposed. The z.configure method for active opt-in is the right approach, and the feature to enable more verbose errors is indeed helpful.
However, the environment variable approach does not align with the general behavior of Zod and the reasons why companies and critical projects choose Zod in the first place.
imagining big arrays being annoying in error logs/traces.
Keep in mind there aren't many issues that would originate from ZodArray
itselfβit basically only throws an error if you pass in a non-Array
, or if the length of the array is too big/small per your .min()/.max()
checks. All other issues will be associated with a particular index and won't get the whole array in input
.
Could we control it somehow (error map API?) not to contain the whole array?
You can delete input
entirely using the same error map as before. I don't think Zod will provide a mechanism for customizing what value gets assigned to issue.input
, that seems like overengineering.
If it turns out a lot of people are annoyed by this, we could do a few things:
input
non-enumerable.stringify()
method on ZodError
that would give you additional logging options (err.stringify({ includeInput: false })
)But keep in mind console.log
starts truncating after a certain depth anyway.
Zod is widely adopted by many projects and companies, often chosen for legal reasons and privacy concerns. If the default behavior changes to include input data in errors, simply because someone forgot to explicitly set NODE_ENV to production, it could have a significant impact on many users.
Its good to have a more conservative take in the discussion.
I hear what you're saying, but from a security perspective, the current proposal is not unreasonable in my view.
If you have serious security/auditing constraints for your application, the onus is on you to make your systems compliant, and I don't think setting NODE_ENV
properly is an unreasonable expectation. I'm trying to avoid a situation where Zod's DX suffers for everyone for the sake of a hypothetical medical app that hasn't set NODE_ENV
properly.
That said, I'm sympathetic to this concern. As you say, Zod is already widely adopted. Even if the NODE_ENV
thing is reasonable in a vacuum, it may not be workable for Zod. No one reads changelogs to the end, and this change may be unexpected for someone who expected Zod to continue eliding input data from ZodError
forever.
I think a decent middle ground is to set input
if and only if:
development
This means input
will exist for the 95+% of users who are using a framework that properly sets NODE_ENV=development
(which is most of them). Zod will also set input
in the browser, where the memory is already fully introspectable, and hiding input data is a bit of a moot point.
But it avoids the failure mode where an existing production service that hasn't set NODE_ENV
at all suddenly starts seeing input data in their logs. cc @sebastianwessel
If there's a security-sensitive application that has explicitly set NODE_ENV=development
in the production service, I can't say I'm wildly sympathetic to their plight.
@colinhacks Can this be a new parameter to a parse method instead?
@kibertoad agree with you, being able to set this on parse method would be nice, but I'd make it opt-out, not opt-in
@capaj Default value should be what is preferable for production.
@colinhacks Under this proposal, I think there would still be no way to get access to the "root" parsed object from a ZodError, correct?
I created this library called zod-error-viewer
that can be used to render nicely formatted errors: https://github.com/BenLorantfy/zod-error-viewer
However, in order to format errors like this, we need to ask the consumer to provide the data that was parsed:
import { ZodErrorViewer } from "zod-error-viewer";
<ZodErrorViewer
data={/* data that was parsed */}
error={/* ZodError that occurred */}
/>;
This is pretty inconvenient for the consumer, because they need to keep around a reference to the data that was parsed. This can add extra complexity to the consumer's code.
It would be convenient for the consumer if they could just pass the error:
import { ZodErrorViewer } from "zod-error-viewer";
<ZodErrorViewer
error={/* ZodError that occurred */}
/>;
However, this would require being able to access the originally parsed "root" data from the error
object.
Have you considered exposing the "root" parsed object on the error? (In addition, or instead of, adding input
to each ZodIssue
as mentioned in the RFC). Something like this:
class ZodError {
issues: ZodIssue[];
input?: unknown;
}
RFC: Include input data in
ZodIssue
Read the RFC π
General comments can go below. Leave specific comments in the Files changed interface.
Acknowledgement
My full-time work on Zod 4, including the design & implementation of this proposal, is supported by Clerk.