cdimascio / express-openapi-validator

🦋 Auto-validates api requests, responses, and securities using ExpressJS and an OpenAPI 3.x specification
MIT License
911 stars 209 forks source link

Any work-around for ajv's undesired number coercing rules? #268

Open bluenote10 opened 4 years ago

bluenote10 commented 4 years ago

Is your feature request related to a problem? Please describe.

Since this library is using ajv under the hood, the issue is probably a fundamental problem in ajv, which I have reported here. I'm not sure if this will be actioned on ajv side, so I'm wondering if a work-around is feasible in terms of the hooks in the middleware.

In short: Coercing the value null for a number currently maps to 0. I'm working on an API where request data may contain NaN/Infs basically everywhere, and the roundtrip JSON.stringify => JSON.parse()=> validate maps them to 0, leading to bugs everywhere.

Describe the solution you'd like

Would it be possible to pass in special validation hooks, i.e., a callback system that would allow to handle the case "received value null for type number, how to proceed?". I have no idea if an approach like that would be feasible.

Describe alternatives you've considered

You are the experts, are there better ways to work-around ajv's behavior?

Additional context

See linked issue above.

cdimascio commented 4 years ago

@bluenote10 One possible option is to add nullable: true for the number property

    TestObject:
      type: object
      properties:
        count:
          type: integer
          nullable: true
bluenote10 commented 4 years ago

As I have tried to explain in the other ticket already, this is not really a solution.

Maybe it helps to explain the context. The API is in the context of numerical data science, where data gets pushed from Python clients. In the data science community it is a standard to represent missing numbers as NaN (Numpy/Pandas follow that rule). The API consists of maybe 15 types all containing mainly number fields. The total number of number fields is approx 100. Now basically all of these fields can contain a NaN on sender side (Python), and all of them should be transferable 1-to-1 into the backend.

The nullable: true solution means that every occurrence of number in the public API must be replaced by a nullable number. The implications of doing this are out of proportion. The project is written in TypeScript, and we generate the API types from the OpenAPI specs. For nullable: true numbers, the generator correctly produces the type number | null, and the TypeScript compiler guarantees that you don't use any of the fields without a null check (i.e., you can no longer do a + b when they have type number | null). This means that our types are no longer directly usable. We now have to do a manual validation for every type: Go over the number fields, and replace every null by the number NaN. This manual validation run is very tedious, and simply a significant code bloat. Since internally we want to operate on the types where the fields have changed from number | null to number we even have to duplicate all type definitions and have an internal one for each external one with the fixed number type. Overall the boilerplate required by this solution is crazy -- several extra screen pages of code from manual validation, type duplication, longer OpenAPI specs.

All this would be so much easier if there was a small coerce hook that would allow to define the mapping null => NaN in the context of number -- which is what the Python world does already.

The null => 0 mapping is a quirk of the Javascript language. It is not what other languages do, and certainly not a good default for validation. In the context of the JSON limitation JSON.stringify(NaN) => null isn't it clear that remapping null => 0 is a really bad idea (how do people tell a wrong 0 from a true 0 in practice? do must users simply have a bug here, and may not even be aware of it?), whereas null => NaN would allow to correctly roundtripping NaN.

cdimascio commented 4 years ago

Thanks @bluenote10, I'll explore possible options.

bluenote10 commented 4 years ago

Thanks for looking into it!

In my opinion the best solution would be to have a custom coercion mechanism in ajv already, and express-openapi-validator only forwards setting to it. That would allow users to define their desired behavior regarding number coercing, and also pass in custom "coercers" for any user type T.

cdimascio commented 3 years ago

@bluenote10 this behavior will change based on #387. No longer will a null number be coerced to 0, instead it will remain null and throw a 400 validation error, stating that the value should be number

bluenote10 commented 3 years ago

Still doesn't allow to round trip NaNs -- which is really an important use case in data science applications.