colinhacks / zod

TypeScript-first schema validation with static type inference
https://zod.dev
MIT License

Add `sensitive` to `ZodSchema` #1783

Open kriswuollett opened 1 year ago

kriswuollett commented 1 year ago

I noticed the discussion of sensitive values in #595. It would be great to attach the ZodError to structured logs (for example, through pino) so that fields like path can be searched directly to find and deduplicate issues. However, sensitive values can end up in the logs through the received field and through some of the localized messages:

https://github.com/colinhacks/zod/blob/9828837fb94f1500ef362b20ca5fe35eed1b6d0e/src/locales/en.ts#L7-L13

Perhaps one could write code like the following. Adding a sensitive: boolean to the appropriate Zod types would ensure that received is undefined and that the localized messages do not interpolate the sensitive value:

const databasePassword = z.sensitive.string().min(1).safeParse(process.env.MY_DATABASE_PASSWORD);
if (!databasePassword.success) {
  log.error({
    err: databasePassword.error // all `received` fields are not set
  }, 'Invalid database password: %s', databasePassword.error.message);
  return null;
}

This would be useful for credentials like passwords and for PII like email addresses. Redaction in pino cannot serve as a workaround in all cases because of https://github.com/pinojs/pino/issues/1612.

santosmarco-caribou commented 1 year ago

@kriswuollett Thanks for submitting this suggestion.

I think you can achieve this by defining a custom error map on the schema. That lets you drop whatever message would otherwise be logged and replace it with any string of your choice (e.g., ***).

Do you want help with setting this up?
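
As a rough illustration of the error-map idea, here is a minimal sketch. The name redactingErrorMap and the two local interfaces are hypothetical; they only mirror the (issue, ctx) arguments that Zod v3 passes to an error map, so the snippet runs without importing zod. The usage line in the trailing comment assumes the errorMap option on z.string() in Zod v3.

```typescript
// Structural stand-ins for the two arguments Zod v3 passes to an error map.
// Declared locally (an assumption) so the sketch has no dependency on zod.
interface IssueLike {
  code: string;
  path: (string | number)[];
}
interface ErrorMapContextLike {
  defaultError: string; // the message Zod would have produced
  data: unknown; // the raw input value; never copy this into a message
}

// Hypothetical helper: returns a fixed message that names the failed check
// but interpolates neither the default message nor the input value.
function redactingErrorMap(
  issue: IssueLike,
  _ctx: ErrorMapContextLike,
): { message: string } {
  return { message: `Invalid value (${issue.code}): ***` };
}

// Assumed usage with Zod v3:
//   const password = z.string({ errorMap: redactingErrorMap }).min(1);
```

Note that an error map only replaces the message. Other fields on the issue object, such as received, are left as they are, so they would still need to be stripped before logging.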

kriswuollett commented 1 year ago

Thanks for pointing out the custom error map -- I did not notice that. I think I can figure out from the docs how to suppress info in the error message.

I guess I'm just looking for a generic way to add tags/labels to values for various reasons.

I was just looking into transform now to see if I could override the type, but I'm guessing that if a validation failed before the transform, then the received value would still contain sensitive info.

I think my best bet would have been to transform the Zod errors into a shared generic type before logging anyway.
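
A minimal sketch of that last idea, assuming the log only needs the issue code and path. The names RawIssue, SafeIssue, and toSafeIssues are hypothetical; RawIssue is a local structural stand-in for the entries of error.issues, so the snippet does not import zod.

```typescript
// Minimal structural view of a Zod issue; declared locally (an assumption).
interface RawIssue {
  code: string;
  path: (string | number)[];
  message: string;
  [extra: string]: unknown; // e.g. `received`, `expected`, `options`
}

// Hypothetical log-safe shape: keeps only the failed check and its location.
interface SafeIssue {
  code: string;
  path: string;
}

// Drops `message`, `received`, and every other field that may carry input.
function toSafeIssues(issues: RawIssue[]): SafeIssue[] {
  return issues.map((issue) => ({
    code: issue.code,
    path: issue.path.join("."),
  }));
}

// Example with a hand-written issue instead of a real ZodError.
const safeIssues = toSafeIssues([
  {
    code: "invalid_enum_value",
    path: ["user", "role"],
    message: "Invalid enum value, received 'secret'",
    received: "secret",
  },
]);
console.log(JSON.stringify(safeIssues));
```

With a real ZodError, the input would presumably be error.issues. Because the output is an allowlist rather than a blocklist, any new field on an issue stays out of the logs by default.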

Feel free to close the issue if you don't think it is needed for zod, but if you'd like to see another implementation for inspiration, check out secret types in Pydantic.

canassa commented 3 months ago

In my scenario, I need certain attributes to be redacted not just when generating error messages, but also during a successful parse.

This is similar to how Pydantic handles the Secret type:

from pydantic import BaseModel, SecretBytes, SecretStr, field_serializer

class Model(BaseModel):
    password: SecretStr
    password_bytes: SecretBytes

    @field_serializer('password', 'password_bytes', when_used='json')
    def dump_secret(self, v):
        return v.get_secret_value()

model = Model(password='IAmSensitive', password_bytes=b'IAmSensitiveBytes')
print(model)
#> password=SecretStr('**********') password_bytes=SecretBytes(b'**********')

I understand that the creator of Zod is strongly opposed to passing any context to the parse function, but that creates challenges for cases like this.

I came up with this hack. It ain't pretty but it seems to work:

import { z } from 'zod'

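// Module-level flag read by the transform below. Because it is global state,
// this is only safe with synchronous parsing.
let loggingFlag = false

/**
 * A zod string that will be redacted when logging
 */
export const sensitiveString = z.string().transform((a) => (loggingFlag ? '[REDACTED]' : a))

/**
 * Parses `data` with the flag set, so every `sensitiveString` in the result
 * comes back as '[REDACTED]'. Use the result for logging only.
 */
export function parseSensitive<TSchema extends z.ZodTypeAny>(
    schema: TSchema,
    data: unknown,
): z.infer<TSchema> {
    try {
        loggingFlag = true
        return schema.parse(data)
    } finally {
        // Always reset the flag, even if parse throws
        loggingFlag = false
    }
}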
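
For comparison, here is a sketch of a different approach that avoids the global flag: wrap the parsed value in an object that redacts itself whenever it is stringified, loosely modelled on Pydantic's SecretStr. This is not from the thread or from zod; the Secret interface and the secret() helper are hypothetical names, and the zod usage in the trailing comment is an assumption.

```typescript
// Hypothetical wrapper type: the raw value is reachable only via reveal().
interface Secret<T> {
  reveal(): T;
  toString(): string;
  toJSON(): string;
}

// The value lives in the closure rather than on the object, so
// JSON.stringify and string conversion only ever see the placeholder.
function secret<T>(value: T): Secret<T> {
  return {
    reveal: () => value,
    toString: () => "[REDACTED]",
    toJSON: () => "[REDACTED]",
  };
}

const wrapped = secret("IAmSensitive");
console.log(JSON.stringify({ password: wrapped }));
console.log(String(wrapped));

// Assumed usage with zod:
//   const sensitiveString = z.string().transform((v) => secret(v));
```

The trade-off is that callers must call reveal() wherever the real value is needed, much like get_secret_value() in the Pydantic example above. This does nothing for failed parses, so error objects would still need separate handling.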