colinhacks / zod

TypeScript-first schema validation with static type inference
https://zod.dev
MIT License

Add support for jwt #2946

Open aarontravass opened 8 months ago

aarontravass commented 8 months ago

Add support for validating JWT strings. For example:

const jwtSchema = zod.string().jwt() // proposed API
jwtSchema.parse(token) // token: the string to validate

sarathkumarsasi commented 7 months ago

Can I work on this feature?

abdirahmn1 commented 7 months ago

+1

m10rten commented 4 months ago

So, a JWT is a JSON object with a signature, header, and payload. To verify that it is a JWT, you can check the headers/signature, but without a secret you won't be able to verify the payload.

Possibly this would be a nice feature:

z.string().jwt({secret: process.env.MY_JWT_SECRET, description: "this is a users JWT"})
  .payload(z.object({...}));

This would result in a z.object with your defined properties and the usual methods such as .safeParse and .parse.

Something like z.string().transform(v => !!v), which yields a boolean, but for a JWT parsed from a string.

The following example usage is what I think would be best:

/**
*   "sub": "1234567890",
*   "name": "John Doe",
*   "iat": 1516239022
*/
const jwt = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c" // from JWT.io, no private info here.

const schema = z.string().jwt() // no secret, public JWT.
  .payload(
    z.object({
      sub: z.string(),
      name: z.string(),
      iat: z.number()
    })
  );

Without .payload, the result would resolve to unknown.
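
For illustration, here is a rough TypeScript sketch of what decoding the payload segment involves when no secret is given. It uses no zod API: `decodeJwtPayload`, `ExampleClaims` and `isExampleClaims` are hypothetical names, and the hand-written type guard only stands in for the z.object schema above. Nothing is verified cryptographically, so the decoded value is untrusted.

```typescript
// Hypothetical helper: decode the payload (second) segment of a compact JWT.
// Nothing is verified here, so the result is untrusted and typed as unknown.
function decodeJwtPayload(token: string): unknown {
  const segments = token.split(".");
  if (segments.length !== 3) {
    throw new Error("expected three dot-separated segments");
  }
  // Convert base64url to base64 and restore the padding that JWTs omit.
  let base64 = (segments[1] ?? "").replace(/-/g, "+").replace(/_/g, "/");
  while (base64.length % 4 !== 0) base64 += "=";
  const bytes = Uint8Array.from(atob(base64), (ch) => ch.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}

interface ExampleClaims {
  sub: string;
  name: string;
  iat: number;
}

// Hand-written stand-in for the z.object schema in the example above.
function isExampleClaims(value: unknown): value is ExampleClaims {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record["sub"] === "string" &&
    typeof record["name"] === "string" &&
    typeof record["iat"] === "number"
  );
}

// The sample token from JWT.io quoted above.
const exampleToken =
  "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c";

const claims = decodeJwtPayload(exampleToken); // static type: unknown
```

Until `isExampleClaims(claims)` has narrowed it, `claims` stays `unknown`, which matches the "no .payload" case.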

Hope this helps in development!

nikelborm commented 4 months ago

From https://datatracker.ietf.org/doc/html/rfc7519#section-2:

JWT Claims Set: A JSON object that contains the claims conveyed by the JWT.

Based on that, there is no real need to specify that the argument of .payload is a z.object. Also, I think .withPayload is easier to read. Here is an example:

const schema = z.string()
  .jwt({
    secret: process.env.MY_JWT_SECRET, 
    description: "this is a users JWT"
  })
  .withPayload({
    sub: z.string(),
    name: z.string(),
    iat: z.number()
  });

aarontravass commented 4 months ago

Maybe the options for .jwt should include ignoreExpiration, to skip timestamp verification. jsonwebtoken does this:

`ignoreExpiration`: if `true` do not validate the expiration of the token.

m10rten commented 4 months ago

Agree with the option to disable the expiration check, +1.

Disagree with .withPayload. Reason: you are already defining a .jwt, so if you want to say "this jwt has a payload of this shape", you should just say .payload. With .withPayload it would read like "this jwt with a payload with payload in shape", whereas with .payload it would read like "this is a jwt and its payload is shape."

Maybe it is a personal preference, but I can safely say most devs prefer simplicity and convention in naming. Since zod is all about .string, .email, .min, .max, .number and .regex, it would not make sense to include "with"; just use .payload.

Let me know what you think!

aarontravass commented 4 months ago

Agree with @m10rten

Zod is a validation library and has nothing to do with Authorization/Authentication.

m10rten commented 4 months ago

Yes, the focus should be on validating that a string is a valid JWT and getting data from the payload.

So, plain and simple: .jwt returning unknown, or .payload.

nikelborm commented 4 months ago

@m10rten A JWT is still a string that consists of three parts separated by the . symbol:

  1. JOSE Header, which is the base64url-encoded JSON representation of an object with a standardized structure: https://datatracker.ietf.org/doc/html/rfc7519#section-5
  2. Payload, which is the base64url-encoded JSON representation of an object with a partly standardized structure that may be extended: https://datatracker.ietf.org/doc/html/rfc7519#section-4.1
  3. Signature, which is either a JWS signature (https://www.rfc-editor.org/info/rfc7515) or a JWE signature (https://www.rfc-editor.org/info/rfc7516)

So if you read .string().jwt() as "this jwt with a payload", you should really read it as "this is a JWT string with a JOSE Header, which is the base64url-encoded JSON representation of an object with a standardized structure, also with a Payload, which is the base64url-encoded JSON representation of an object with a partly standardized structure that may be extended, and also with a signature, which is either JWS or JWE".

Writing .withPayload (meaning "with payload of shape") instead of .payload doesn't turn zod into an Authorization/Authentication library to even the slightest degree. But ignoreExpiration actually does, because it touches the business logic of those tokens, not just their DTO logic. Also, there should be no secret parameter in the signature .jwt({secret: string}) if you are against turning zod into an Authorization/Authentication library, because signature algorithms that require a secret calculate, well, a signature. And this task can be done in dozens and dozens of ways:

  1. Asynchronously or synchronously (in JS terms, promise-based non-blocking versus thread-blocking calculations).
  2. Using different algorithm implementations:
    1. slow pure-JS implementations, which keep the code platform-independent and Node-version-independent
    2. fast inlined C++ or Rust libraries, etc.
    3. the standard crypto API, to take the best of both worlds
  3. Using different kinds of algorithms:
    1. symmetric encryption, where validation only needs a SECRET as an additional salt
    2. asymmetric encryption, where validation needs a PUBLIC KEY, which is not secret at all and is intended to be shared
  4. etc.

One part of the token, such as the JOSE header, very much affects another part, such as the signature at the end. That goes well beyond data encoding and data structure validation and should not be considered.

And I'm not even mentioning that JWTs can be validated by many JS libraries, not only one. If you decide to support signature validation, you can't just force people to use a specific library for their business validation, such as:

  1. checking if the signature is valid
  2. checking if the token was issued for a specific audience
  3. checking if the token should be used before a specific date and time
  4. etc.
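
To show how far these checks sit from format validation, here is a minimal TypeScript sketch of items 2 and 3 applied to an already decoded claims object. It is not taken from zod or any JWT library: `checkRegisteredClaims` and `ClaimCheckOptions` are hypothetical names. It assumes the RFC 7519 registered claims aud, nbf and exp, with timestamps in seconds, and it skips signature checking (item 1) entirely.

```typescript
interface ClaimCheckOptions {
  audience: string;   // the audience this service expects
  nowSeconds: number; // current time in seconds since the epoch
}

// Hypothetical helper: returns a list of problems, empty when all checks pass.
function checkRegisteredClaims(
  claims: Record<string, unknown>,
  options: ClaimCheckOptions
): string[] {
  const problems: string[] = [];

  // exp: the token must not be accepted on or after this time.
  const exp = claims["exp"];
  if (typeof exp === "number" && options.nowSeconds >= exp) {
    problems.push("token expired");
  }

  // nbf: the token must not be accepted before this time.
  const nbf = claims["nbf"];
  if (typeof nbf === "number" && options.nowSeconds < nbf) {
    problems.push("token not yet valid");
  }

  // aud: either a single string or an array of strings.
  const aud = claims["aud"];
  const audiences: unknown[] =
    typeof aud === "string" ? [aud] : Array.isArray(aud) ? aud : [];
  if (audiences.indexOf(options.audience) === -1) {
    problems.push("audience mismatch");
  }

  return problems;
}
```

Every branch here depends on the current time or on who is asking, which is why it reads as application logic and not as string validation.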

You can't, because it would force developers who use zod onto a specific set of supported encryption algorithms, which is the main difference between those JWT libraries. Likewise, you can't decide to support every JWT library by forcing the developers of zod to do so.

.string().jwt() should be callable with composable functions like .withPayload or .withJoseheader. .withJoseheader could be used, for example, if a developer wants their JWT to allow only specific algorithms, or anything except the 'none' algorithm.
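
As a rough idea of what a header restriction could do under the hood, the TypeScript sketch below decodes the first segment and checks alg against an allow-list. It is an illustration only: `decodeBase64Url` and `hasAllowedAlgorithm` are hypothetical helpers, not zod methods, and no signature is checked.

```typescript
// Decode one base64url segment into a UTF-8 string (padding is restored first).
function decodeBase64Url(segment: string): string {
  let base64 = segment.replace(/-/g, "+").replace(/_/g, "/");
  while (base64.length % 4 !== 0) base64 += "=";
  const bytes = Uint8Array.from(atob(base64), (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

// Hypothetical check: the JOSE header is a JSON object whose alg is allowed.
function hasAllowedAlgorithm(token: string, allowed: readonly string[]): boolean {
  const headerSegment = token.split(".")[0] ?? "";
  try {
    const header: unknown = JSON.parse(decodeBase64Url(headerSegment));
    if (typeof header !== "object" || header === null) return false;
    const alg = (header as Record<string, unknown>)["alg"];
    return typeof alg === "string" && allowed.indexOf(alg) !== -1;
  } catch {
    return false; // not valid base64url, or not valid JSON
  }
}
```

Leaving 'none' out of the allow-list is enough to reject unsecured tokens, while the token itself stays an untouched string.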

Also, .string().jwt(), same as .string().jwt().payload(), should not perform any transformations and should return neither {joseHeader: Record<'alg'|'typ'|'cty', something>, payload: Record<string, any>, signature: string} nor Record<string, any> as the underlying payload. There is .transform for those who want it. I'm saying this because the name .payload() implies that it will return only the payload, and that is bad design: we would have no way to get the JOSE header because it would simply be stripped.

A JWT by itself is not plain and simple, and if you want a plain and simple .jwt() that does only shallow validation, you should:

  1. reject most of the logic of JWTs as tokens
  2. keep only their logic as data structures
  3. leave developers the ability to choose what business logic they want to put on top. There is an awesome tool for that in zod called .refine()
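
A shallow, structure-only check in this spirit could be as small as the TypeScript sketch below. `COMPACT_JWT_PATTERN` and `looksLikeCompactJwt` are hypothetical names, and the pattern only covers the three-segment form discussed in this thread. Since the predicate takes a string and returns a boolean, it could presumably be handed to zod's existing .refine(), for example z.string().refine(looksLikeCompactJwt).

```typescript
// Three base64url segments separated by dots. The signature segment may be
// empty, which is how unsecured ("alg": "none") tokens are serialized.
const COMPACT_JWT_PATTERN = /^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*$/;

// Hypothetical predicate: checks shape only, with no decoding and no
// signature, expiration or audience logic.
function looksLikeCompactJwt(value: string): boolean {
  return COMPACT_JWT_PATTERN.test(value);
}
```

Anything beyond this, such as restricting algorithms or checking claims, can then be layered on by the developer.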

m10rten commented 4 months ago

Yes, so maybe its return type should be something like this:

interface ReturnValue<ZodAny extends z.ZodTypeAny> {
  headers: Record<HeaderTyped, string>;
  signature: {
    alg: string;
    typ: "JWT"
  };
  payload: ZodAny;
};

Here, .payload would set the schema for the payload property; otherwise it would be unknown.

nikelborm commented 4 months ago

No methods in zod change the type of what was passed to them. You change the type of the value from string to object. For those who want to change the value, there is transform. It is not the responsibility of zod to transform the data; it is the responsibility of the developer who uses zod. For example, I may want to return JSON like this from my 'refreshAuthTokens' endpoint:

{
  "accessToken": "base64_of_json_of_jose_header.base64_of_json_of_payload.base64_of_binary_signature",
  "refreshToken": "base64_of_json_of_jose_header.base64_of_json_of_payload.base64_of_binary_signature"
}

And on the client, after validation, I may want to store it somewhere as is, as a string. And I as a developer may choose any JS library there is to validate the token (not only the payload), using any algorithm implementation I want. Returning ReturnValue<ZodAny extends z.ZodTypeAny> leaves me no chance to pass my JWT to a validation library specialized in handling JWTs.

By the way, your signature should have type string and your headers should have type { alg: string; typ: "JWT"}. You mistook one for the other. The TypeScript type for headers (you meant it as { alg: string; typ: "JWT"}) is also incomplete, because JOSE headers can have more keys than just 'alg' and 'typ'.

m10rten commented 4 months ago

Is it changing or validating? You get a schema that takes in a JWT string and validates the payload.

It is not transforming if you have a JWT; otherwise, why would you even implement a .jwt method?

nikelborm commented 4 months ago

It DOES transform from string to object. Your interface is not a JWT. It is your personal representation of that JWT. Developers will not be able to pass this object to any other JWT-handling library for further business-logic validation, because all of those libraries parse a string and return their own interfaces representing the contents of that JWT string. One library may call a field JOSE_headers while another may call it just headers. By the RFC definition, the object returned from those libraries is no longer a JWT. The task of parsing a JWT is not simple either: if you look into the JWT specifications, you will find many other forms of token representation, such as nested JWTs, that you may not have heard of before. There are two things wrong with transforming a JWT string into an object:

  1. Either it is done by Zod, which means all kinds of logic to support the entire JWT standard is written inside Zod, and many developer hours are wasted when there are more important things to do.
  2. Or you pull in some library as a dependency. Doing so forces users onto that library and prevents them from using .jwt() in zod with the specific library and encryption algorithm they chose themselves (because the library used as a zod dependency may not support the desired algorithm).

m10rten commented 4 months ago

I believe I am following you, but is this really an issue?

Let's add an example:

import { z } from "zod";

// both of these are on a string, check if its in a format, and then return that string.
const email = z.string().email();
// so .jwt follows the .email
const jwt = z.string().jwt();

// all you would do with a .payload is indeed a .transform with a lot of steps taken out.
const jwtWithPayloadSchema = z.string()
  .jwt()
  .payload(
    z.object({...})
  );

// so where .jwt should just check if it has these 3 elements: signature, headers and payload.
// the .payload adds a transform layer with a validation on the jwt data.
const userJwt = z.string()
  .jwt()
  .payload(
    z.object({
      userId: z.string(),
    })
  );

const parsed = userJwt.parse(unknownData);
// ^: {userId: string}, or if you want to also strip out the other parts: {signature: ..., headers: ..., payload: {userId: string}}

I think this approach cannot hurt users in how they work with a JWT. I imagine the same for a .headers and a .signature on the .jwt function.

josh-i386g commented 3 months ago

2024 and still no jwt regex :(.

m10rten commented 3 months ago

@colinhacks Can we get your opinion on this?

It seems JWT support would be very useful (validating the format, not the payload).

danilomourelle commented 2 months ago

Guys, I think validation in .jwt() should cover the JWT format only: a string, with three parts, in base64 format.

I don't think Zod should be responsible for receiving and checking the signature, expiration, or anything else. As for the payload, to me it only makes sense to check it after I already know the string is a valid JWT (checked by zod) and has a valid signature (checked by jose, jsonwebtoken...); then I would check the payload with another regular zod object schema. I believe that checking and returning the payload without checking the signature (and zod shouldn't check the signature) may lead people to use the payload just because it matches the schema, which would be a security flaw.
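
The ordering described here (format first, then signature, then payload) can be sketched in TypeScript as follows. Everything in it is illustrative: `SignatureVerifier` is a hypothetical callback standing in for whichever JWT library the application uses, it is synchronous only to keep the sketch short, and the `UserClaims` check is a hand-written stand-in for a regular zod object schema.

```typescript
// Hypothetical callback: the caller plugs in their JWT library of choice here.
type SignatureVerifier = (token: string) => boolean;

interface UserClaims {
  userId: string;
}

const THREE_SEGMENTS = /^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+$/;

function readVerifiedClaims(token: string, verifySignature: SignatureVerifier): UserClaims {
  // Step 1: format only (what .jwt() would cover in this proposal).
  if (!THREE_SEGMENTS.test(token)) {
    throw new Error("not JWT-shaped");
  }
  // Step 2: signature, delegated to the external library.
  if (!verifySignature(token)) {
    throw new Error("invalid signature");
  }
  // Step 3: only now decode the payload and check its shape.
  let base64 = (token.split(".")[1] ?? "").replace(/-/g, "+").replace(/_/g, "/");
  while (base64.length % 4 !== 0) base64 += "=";
  const bytes = Uint8Array.from(atob(base64), (ch) => ch.charCodeAt(0));
  const payload: unknown = JSON.parse(new TextDecoder().decode(bytes));
  if (typeof payload !== "object" || payload === null) {
    throw new Error("payload is not an object");
  }
  const userId = (payload as Record<string, unknown>)["userId"];
  if (typeof userId !== "string") {
    throw new Error("payload does not match the expected shape");
  }
  return { userId };
}
```

Because step 3 is unreachable unless step 2 passes, a payload that merely matches the schema never gets used on its own.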

I already opened a PR along these lines in August. Should I do anything else to get it merged?

aarontravass commented 2 months ago

@danilomourelle Looks like you need to ping/contact @colinhacks or anyone else with merge privilege to merge your PR.

mark-P0 commented 1 week ago

Seems like a related PR for this was merged to v4. Looking forward to it