Option to validate responses

gsakkis commented 1 year ago

Summary

As posted on Discord, I was surprised to realize that the return type annotation is not validated; e.g. this doesn't raise an error:

@get("/")
async def hello_world() -> float:
    return "Hello, world!"

The reason is that this should be enforced by a static type checker; runtime validation can be costly. However in general there are several ways for static type checkers to miss finding such errors (Any, cast, # type: ignore, untyped 3rd party libraries, etc) so it would be nice if there was an optional (off by default) parameter (say, validate_response=True) for the user to trade off performance for type safety.

Potential API:

@get("/", validate_response=True)
async def hello_world() -> float:
    return "Hello, world!"

As for other such parameters, this should be applicable to all layers of an application (handlers, controllers, routers, application object).

Basic Example

No response

Drawbacks and Impact

No response

Unresolved questions

No response

Funding

If you would like to see an issue prioritized, make a pledge towards it!
We receive the pledge once the issue is completed & verified

Goldziher commented 1 year ago

Runtime validation doesn't really fit with our design at present. Adding this in a performant and consistent way is no small feat.

We can consider perhaps an optional integration with bear type, what do you think?

gsakkis commented 1 year ago

Runtime validation already takes place as part of converting inputs (path parameters, request body), I guess you mean output validation? I don't have strong opinion on how it's implemented as long there is a consistent way for all handler annotations to be enforced.

Goldziher commented 1 year ago

Runtime validation already takes place as part of converting inputs (path parameters, request body), I guess you mean output validation? I don't have strong opinion on how it's implemented as long there is a consistent way for all handler annotations to be enforced.

Yes, but we have two types of validation - validation of configuration during startup, and validation using msgspec.

To validate outputs we will need to run validations on each return.

This is going to be seriously detrimental to performance.

provinzkraut commented 1 year ago

This is going to be seriously detrimental to performance.

The request was to do this optionally, not by default, so I don't see an issue here.

gsakkis commented 1 year ago

To validate outputs we will need to run validations on each return.

How is this different/worse than validating request bodies on each request?

In any case it will be opt-in, a user will select to sacrifice a bit (or a lot, needs to be measured) of performance for type safety. Btw that's what FastAPI does by default (or always, not sure).

Goldziher commented 1 year ago

To validate outputs we will need to run validations on each return.

How is this different/worse than validating request bodies on each request?

In any case it will be opt-in, a user will select to sacrifice a bit (or a lot, needs to be measured) of performance for type safety. Btw that's what FastAPI does by default (or always, not sure).

Opt in is fine, I was explaining the differences.

The main one is that we do the request validation using msgspec, which is way faster than Python

gsakkis commented 1 year ago

There is a (good) chance that msgspec will cover this case too.

guacs commented 1 year ago

What is the use case for this? I would assume that the validity of the responses would be tested through a combination static type checkers and tests.

As for the point regarding the difference between validating request bodies and the responses, the request bodies are not under the control of the application but is user input which cannot be blindly trusted and so has to be validated.

gsakkis commented 1 year ago

I hate to bring up the FastAPI docs twice in the same issue but it makes some good arguments, especially wrt to data filtering:

class BaseUser(BaseModel):
    username: str
    email: EmailStr
    full_name: str | None = None

class UserIn(BaseUser):
    password: str

@app.post("/user/")
async def create_user(user: UserIn) -> BaseUser:
    return user

Type checker is happy but the response includes the password. Oops.

provinzkraut commented 1 year ago

@gsakkis I would argue that this problem isn't a great example as it is specific to how FastAPI promotes data modelling to be handled. In Litestar you'd use a DTO for this:

class User(BaseModel):
    username: str
    email: EmailStr
    full_name: str | None = None
    password: str

@app.post(
  "/user/", 
  dto=PydanticDTO[User], 
  return_dto=PydanticDTO[Annotated[User, DTOConfig(exclude={"password"})]]
)
async def create_user(user: User) -> User:
    return user

The issue in your example comes from the fact that you're forced to mix up concerns. It's putting the responsibility of proper serialisation and data validation to inside the handler, where it should arguably not happen. You want the handler to receive a User and return a User. What happens to that object should be described and handled separately.

gsakkis commented 1 year ago

The FastAPI example is perfectly valid in Litestar too; nothing prevents or warns you against it (or conversely, nothing forces you to use a DTO).

You want the handler to receive a User and return a User.

This might be true only at an abstract sense, in practice you receive a bunch of structured data and return a different set of structured data. Choosing to use the same name for them is debatable and potentially confusing (if for example the generated OpenAPI spec names User both the body request and the response, I don't know).

I haven't used DTOs so far but based on the examples I've seen they look like syntax sugar for automating the definition of (simple) variations of a given model, e.g. a UserOut model with 3 fields in this example. What if you want to return an obfuscated version of the email like ******@example.com? AFAIK you can't do this with a DTO so you have to define a UserOut model and convert the input User to it anyway.

provinzkraut commented 1 year ago

The FastAPI example is perfectly valid in Litestar too; nothing prevents or warns you against it (or conversely, nothing forces you to use a DTO).

I think you're missing the point I was trying to make a bit. What I was trying to say is that in FastAPI, the example you have provided is the canonical way of doing data filtering. This way exposes you to the problem you brought up. In Litestar, the canonical way of doing data filtering does not expose you to this problem.

Both approaches are doing the same thing; They use one model to transform the returned data into a different shape. It's just that in your example, that transfer model is defined explicitly, and set implicitly (via the return type), whereas for Litestar, it is defined implicitly (via the PydanticDTO) and set explicitly (via return_dto).

What does not change however is that you still have to use it correctly to get the benefits you want. One could very well argue that in the same scenario the return type could mistakenly be annotated with UserIn. Then you'd also expose the password.

In my opinion, relying on this mechanisms for security reasons is a really bad idea in general, since it always requires you to configure things correctly. This sort of stuff should be checked with testing.

As for your specific question regarding the obfuscation of passwords: You'd use a pydantic.SecretStr, as is commonly advised best practice when dealing with these things in Pydantic. There actually is an issue for implementing something like this that works across everything serialisable in Litestar.

Anyway, I just wanted to say that I'm not strictly opposed to return type validation. I just think that this example isn't a great use case of that, since there are better ways to solve this particular problem, and that a lot of the use cases are already covered by DTOs.

Goldziher commented 1 year ago

Gave it you @gsakkis , go ahead. Performance thought is paramount

litestar-org / litestar