lukeautry / tsoa

Build OpenAPI-compliant REST APIs using TypeScript and Node
MIT License
3.57k stars 503 forks source link

custom transformers / validators for input data #496

Closed simllll closed 4 years ago

simllll commented 5 years ago

Sorting

Issues

Right now it's not possible to add a custom validator / transform for the input that comes to a controller method. E.g. if you have a custom type (like we were discussing in #483) it's not possible to validate this input data. Or even if you have just a string that you want to validate against something. E.g. if it is a valid ID or something else (more complex) that can not be covered by typescript definitions.

Possible Solution

Allow adding custom validators/transforms for input data. E.g. like strings are converted to numbers right now, I would like the ability to add a custom tranformation for a specifiy key, but this transformation can also fail and will be handled like the current validation erros

I'm thinking about a second parameter to the @ Body or @ Query annotation where you can pass a custom function. This function will override any other default validator/transformator for this parameter, but the function get's a reference to the "default" validator so it can also be used in the custom transform/validator function.

This is just a draft of the idea. All comments welcomed.

Breaking change?

no breaking change

WoH commented 5 years ago

I feel like this is something that should be done in userland.

I don't (yet) see the value of not casting in the controller, but adding a new mechanism and related complexity to the framework?

simllll commented 5 years ago

For my understanding I though the validators are acutally doing exactly this, but right now only for "pre defined" types.

E.g. If you look into the validatorservice https://github.com/lukeautry/tsoa/blob/a88de0f34eee7cd52d1a87b62fcf2144f6e6837d/src/routeGeneration/templateHelpers.ts#L161 here the input (which is by nature a string in the case of a query param) is transformed to an integer [transformer] (if it can be transformed to an integer [validation]). After the validationservice has passed all attributes are validated and returned / transformed to their corresponding type.

The custom validator would just extend this functionallity to do more flexible validations / transformations.

Another example: We have one endpoint which gets multipart/form-data as input, to parse this kind of data we use the multer.none() method. Unfortnautnely this method returns a different object than a plain js object. Therefore all validation fails afterwards, a simple transformation like this: req.body = { ...req.body }; solves the problem already. But right now we have insterted a custom route entry that is run before registerRoute is called which 1. calls multer (this will be solved by the middleware annotiation) 2. does the custom tranformation.

This is pretty ugly in many ways, as it takes place on a completely different file, and is hard to understand.

e.g.

app.post('/some-multipart-endpoint', multer().none(), (req, res, next) => {
    req.body = { ...req.body };
    next();
});

RegisterRoutes(app);
dgreene1 commented 5 years ago

@simllll how would a custom transformer avoid multer? Like if someone uses bodyparser as a middleware in koa, then every incoming request will be parsed as JSON.

Is my question clear? (I'm struggling to phrase this)

I guess to put it another way, aren't most middleware functions applied across all routes?

So even if we provided some hook that allowed consumers to serialize the raw http string into a new object, wouldn't we still get hit by the middleware that the person includes first?

simllll commented 5 years ago

Yeah sorry that I mixed that in my example, I thought it was necessary to see the whole problem and what can be solved with this idea.

Multer can not be avoided with this approach at all, there is probably always some kind of middleware. Multer has the only difference to any average middleware function that it is only applied to certain routes with custom parameter (e.g. what's the name of the field for the file upload). Multer also throws some errors if the input is not as expected and therefore can not be applied as regular middleware function for all routes. Anyway, this will be not solved by this, absolutely right, but it would be solved with #62 .

The problem that is still left, is that some transformations need to be done to get the output of the middleware to be compatible with tsoa. You are right in this case you could also just add a custom middleware function that processes multer and does the transformation after it immediately. But this is just one case, there are more cases where something is in the body that needs to be transformed and validated before tsoa or the controller afterwards can work with it. This is already happening e.g. for integer parsing, custom objects etc. But not possible for e.g. custom types or "complex" type aliases because the validation is only happening on build time within typescript (and therefore not even possible for tsoa to do an "automatic" input validation on runtime).

Do you see what I try to explain here? ;)

WoH commented 5 years ago

For my understanding I though the validators are actually doing exactly this, right now only for "pre-defined" types.

These are specified by JSON Schema and can be expressed in Swagger/OpenAPI documents. Right now I see the job of the tsoa validation layer to check if the input matches the spec. If it does, pass it to the controller. Basically, a user of your API should know why it did error based on the swagger spec.

The tuple pr is a great example here: We can't express the order of the types in the API spec, therefore, we could throw a validation error to satisfy the Controller's (Typescript) requirements, but you would not know if should error from only looking at the API Spec.

I'm not saying this is a bad idea, but I feel like supporting that could be a lot of work.

github-actions[bot] commented 5 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

github-actions[bot] commented 4 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

Eoksni commented 4 years ago

I don't relate to the multer example, but my thoughts on the topic is that doing such transformations and validations in userland has a drawback: it can't affect generated spec. I'd like to specify some custom format, for example parenthesized strings:

interface RequestBodyDto {
  /**
   * @format parenthesized
   */
  someKey: string;
}

So I would accept { someKey: "(hello)" } and reject { someKey: "hello" }. Then lets say I hate my customers and I dont actually need those parentheses and I want to remove them as transformation step while still forcing consumers to send those parentheses. For whatever reason. So my controller will expect { someKey: "(hello)" } to be transformed to { someKey: "hello" }.

A very nice way to achieve all that would be somehow telling tsoa about implementation of such custom format by providing something like this:

// parenthesized-formatter.ts

@Formatter('parenthesized') // not-yet-existing decorator
export class ParenthesizedFormatter {
  validate(input: unknown): unknown {
    // check if valid
    if (isString(input) && input.startsWith('(') && input.endsWith(')')) {
      // and apply some transformation
      return input.substring(1, input.length - 2);
    } else {
      throw ValidateError(...);
    }
  }
}

And pointing to such files in tsoa.json:

{
  // ...
  "customFormatterGlobs": ["*-formatter.ts"]
}

Then tsoa would apply such validation/transformation and our controllers would get clean data, AND also output a format property in the spec (it allows for custom formats, at least openapi 3).

This would conform to the tsoa philosophy of "specify in one place, reuse it everywhere" (at least thats how I understand tsoa philosophy).

Doing all that in userland means we still have to specify jsdoc @format ... in order for it to get passed to spec file, but we also need to remember that we specified that format and apply our validation. And everything that requires "remembering" can always get out of sync, which is not great.

I believe it shouldn't be very hard to implement, I can do it myself if you are willing to accept such PR.

P. S. Even better if such custom formats would accept parameters, like * @format parenthesized 2 and it would enforce 2 parentheses like "((hello))". Such parameters could be passed to the constructor of the custom formatter class.

dgreene1 commented 4 years ago

@Eoksni sounds interesting but it doesn’t seem like a real scenario. If anyone else can provide a realistic scenario that would be helpful.

For instance, I would like my routes to expect a JsJoda LocalDate. In order to do that I would need either need:

For now I find Option B to be easy enough and also more transparent. And thanks to tsoa’s existing isDay formatter I can still specify the type of the string in Swagger.

In other words, I haven’t needed Option A yet.

Eoksni commented 4 years ago

@dgreene1 I partly agree, conversion isn't that big of a deal. On other hand, custom validation is probably a big deal - tsoa provides a validation layer (only for built-in types/formats), but have you ever seen a validation library without support for custom validation?

Most ORMs have support for custom validation and it would be great if we can "lift" those validations from model-level to tsoa/DTO-level so it can fail as early as possible if the request is not in a correct format, while also providing additional and actual documentation information about that format to the consumer.

As for more concrete realistic scenarios

However, format is an open value, so you can use any formats, even not those defined by the OpenAPI Specification, such as:
email
uuid
uri
hostname
ipv4
ipv6
and others

Dunno, maybe a better long-term solution is to delegate validation capabilities to some external popular validation library? Just as a suggestion.

dgreene1 commented 4 years ago

I hear you but the challenge is how would the formatter/validation describe the inputs versus outputs. And I’m particular concerned with this for query parameters.

I have to think on this. Let’s let it sit to see if others would also like this. It bloats the library. I’m not against maintaining this addition but I’d like to see what the community desires since there is a very clear workaround that meets your needs of being able to communicate the format of the string via OpenAPI.

Again, I’m not saying no, I’m just saying let’s wait a bit. For anyone reading this please feel free to comment below with your interest or concerns.

WoH commented 4 years ago

validations in userland has a drawback: it can't affect generated spec.

That's precisely the point why it would be bad to add it. I'll give a brief example. Given a controller method

  @Get('/blog')
  public async getBlog(
    @Header() h1: string,
    @Header() h2: string,

we need to be able to look at this and expect to get h1 and h2 from the request (and therefore inject and document h1 and h2). If we had an @Middleware(myMiddleware), that was applied, it may be that middleware that sets those headers and therefore should not be part of the request spec. I don't see any way we could predict that.

So what you can do, basically the escape mechanism, is to grab the request object, which usually is a bad option (when you can do the validation inside the controller method) after applying the middleware and accessing the properties the middleware sets on the request via request.propertyFromMiddleware.

I feel like the primary thing tsoa should do is enforcing OpenAPI on top of TypeScript. See below:

So I would accept { someKey: "(hello)" } and reject { someKey: "hello" }. Then lets say I hate my customers and I dont actually need those parentheses and I want to remove them as transformation step while still forcing consumers to send those parentheses. For whatever reason. So my controller will expect { someKey: "(hello)" } to be transformed to { someKey: "hello" }.

Perfect case of a simple reusable function that takes one parameter and returns a boolean. You could even wrap that in your own decorator for function decoration, I guess my issue really is that I don't see the benefit of moving this into the framework? In case you can't use @format to document a custom format, that should be a PR.

but have you ever seen a validation library without support for custom validation?

Dunno, maybe a better long-term solution is to delegate validation capabilities to some external popular validation library? Just as a suggestion.

Tsoa uses a validation lib (https://github.com/validatorjs/validator.js) for most of the basic building blocks. It provides simple functions for validating (string) inputs. There is no "extension" mechanism since it's not required and probably would not make sense.

However, I certainly see the point of moving validation into a json-schema validation lib, I have entertained the idea of rewriting the validation layer as a wrapper around using AJV at various points in time (esp. since our swagger probably 95% compatible with json-schema and therefore AJV). I think it makes sense at that point to discuss providing hooks into the custom keyword functionality ajv offers.

dgreene1 commented 4 years ago

We already decided against using ajv or joi internally when we discussed issue https://github.com/lukeautry/tsoa/issues/416.

However, nothing precludes a tsoa User from using a validation library inside of the route method.

Also, I feel like tsoa is flexible enough to allow a developer to document any and all of the string formats. I guess we just haven’t documented that well.

WoH commented 4 years ago

We already decided against using ajv or joi internally when we discussed issue #416.

I initially closed this because luke was talking about an unreleased version of this which I wanted to take a look at first.

However, nothing precludes a tsoa User from using a validation library inside of the route method.

Or replacing the generated output/validation through different templates and sharing these via npm.

Eoksni commented 4 years ago

Yeah, I believe using a custom template is the simplest solution for me at this point.

@WoH Still, I think there is a bit of misunderstanding.

Perfect case of a simple reusable function that takes one parameter and returns a boolean. You could even wrap that in your own decorator for function decoration, I guess my issue really is > that I don't see the benefit of moving this into the framework?

So I tried to explain it earlier, but I guess I wasn't successful at it. So I take another try :) Once again, my suggestion is that specifying custom @format should enable custom validation at runtime. I don't talk about whole middleware that can inject additional properties into headers or whatever. No, my point is about this statement about tsoa philosophy from the readme:

Runtime validation of tsoa should behave as closely as possible to the specifications that the generated Swagger/OpenAPI schema describes.

Currently, if I specify that some property in the request body is string, tsoa ensures that it is string. If I specify that some property is number, tsoa ensures that it is number. And so on. That is AWESOME! But why not bring some customization into it? Openapi spec allows us to specify custom format, so why can't we specify custom validation according to such custom format?

Adding "userland" validation is pretty much the same as not using tsoa at all. I might as well write spec myself from scratch, specify there my custom format and then add my custom validation decorator to my controller. But it is errorprone, I can forget to update my docs very easily when I change my runtime validation. Awesomeness of tsoa is that it ensures the generated spec to be up-to-date because of the single source of truth - I specify that my model expects someKey property of type string and BOOM! Runtime validation is in place. That is great, and doing same thing for custom formats/types would be even greater thing.

WoH commented 4 years ago

So I tried to explain it earlier, but I guess I wasn't successful at it. So I take another try :) Once again, my suggestion is that specifying custom @format should enable custom validation at runtime. I don't talk about whole middleware that can inject additional properties into headers or whatever.

I somewhat agree here and agree that it wasn't smart to refer to middleware and input validation in the same paragraph in my comment, which happened due to the context of the conversation with how broad the initial issue discussed in here was. I might have generalized a bit too much. However, I referred to your custom format explicitly:

In case you can't use @format to document a custom format, that should be a PR.

So let's first of all agree @format myformat should work. The only thing beyond that I could see is something you mentioned:

Openapi spec allows us to specify custom format, so why can't we specify custom validation according to such custom format?

I would agree that a hook for format validation (as a specific case of custom validation that's supported by the spec) might be useful.

Eoksni commented 4 years ago

I believe I can use @format myformat to have it show up in the generated spec. But specifying it in the spec doesn't mean it introduces any actual validation. So if I could specify a custom validation for my custom format, I'd be the happiest person on Earth :)

I'll start by implementing my custom template to fiddle with the validation process, and if I see something good coming out of it, I can share and maybe we can come up with something great eventually. Thanks for the discussion and the responses, its been a pleasure.

dgreene1 commented 4 years ago

@Eoksni instead of a custom template can you create a PR that enhances format to accept a regex?

I’ve made a separate issue (https://github.com/lukeautry/tsoa/issues/543) where we can discuss this new feature as opposed to this current thread which should continue to discuss the potential need for a middleware decorator.

WoH commented 4 years ago

Pattern and format are different things. Format applies to every data type, pattern is restricted to strings.

github-actions[bot] commented 4 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

github-actions[bot] commented 4 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

Jofemago commented 1 year ago

i have a question here, if i want to validate a paramater of the body, i can define that this parameter is string, but i want validate the format of the sting to be an uuid v4 or to be a string whitout spaces, now i can't do this whit the @format, Tsoa can't do these types of validations ?

WoH commented 1 year ago

Yes, using a @pattern + the regex