Thank you @raderio for opening this discussion.
Here are some questions and thoughts regarding your suggestion:
Structural validation vs. domain validation
It makes sense to check for the presence and correct type of information before object creation. That is the current approach, but it currently has to be done manually through JSONCodecs. With annotation processing (#16) this would become much easier, as you've also suggested, but it's still work in progress and has a long way to go. Annotations would then be used to test for the correct structure and types of all information during parsing, before object creation, and can help with useful context-aware error messages. The result is type-safe values based on Kotlin and custom types.
Domain validation like min/max, length, regex, etc., on the other hand, is not just about the structure but about the domain-specific traits every single property has. JSON parsing, in my opinion, shouldn't perform any domain-specific validation but only help transport a data model using a defined type-safe structure from one location to another, i.e. stay as close to POJOs as possible. The model can then be used by domain-specific code to apply whatever rules it deems important depending on the context in which the model is being used. This provides a better separation between data model, data (de)serialization and data validation rather than mixing up the latter two.
Domain validation is complex
Take Twitter for example. There are complex rules involved in calculating how long a tweet actually is, including information about URL shortening and photo uploads. Calculating that length within the model or the deserialization layer in order to validate against the maximum length would put a lot of domain logic with a lot of dependencies into these layers. Also, there may be older tweets in the database which followed different validation rules and would no longer be valid today. They may still be valid in the model but invalid only when creating new tweets, hence it depends on the context.
Should a developer still want to perform such validation then it can easily be done directly in the constructors or initializers of model classes and completely independent of the deserialization. This allows the full use of Kotlin's extensive standard library as well as third-party (validation) libraries. Replicating validation with some special annotations will likely just reinvent part of the wheel. The overhead of implementing and maintaining such a validation system will likely outweigh the benefits by far.
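For illustration, a minimal sketch of validation living directly in the model's initializer (the Tweet class, its properties and the 280-character limit are invented for this example):

```kotlin
// Invented example class; property names and the 280-character limit are
// only placeholders to illustrate init-block validation.
class Tweet(val text: String, val authorId: Long) {

    init {
        // Domain validation lives with the model, independent of how the
        // object was created (JSON deserialization, database mapping, tests, ...).
        require(text.isNotBlank()) { "text must not be blank" }
        require(text.length <= 280) { "text must not exceed 280 characters" }
        require(authorId > 0) { "authorId must be positive" }
    }
}
```

Every code path that constructs such an object runs the same checks, no matter which (de)serialization library produced the values.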
In the majority of scenarios clients send valid JSON
If you have only one version of the API and only one client, then yes, but we have 3 clients (web, Android, iOS) and have to support several versions of each.
Take Twitter for example. There are complex rules involved in calculating how long a tweet actually is
This is more of an exception to the rule; usually you just have an input with a max length.
In the majority of scenarios clients send valid JSON
If you have only one version of the API and only one client, then yes, but we have 3 clients (web, Android, iOS) and have to support several versions of each.
How can validation which goes beyond structure help you in this situation? From the client's perspective there will be useful error messages in any case. Or is it about the documentation? I'm also used to having multiple clients, each with widely different versions, as typically happens as you release more and more app updates. I've stopped versioning APIs a long time ago and instead implement capabilities. Each client tells in the request what API capabilities they support and the API server will adjust their functionality and response format accordingly to consider and/or compensate for missing capabilities.
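Roughly, the idea looks like this: each request carries a list of capabilities the client supports (e.g. in a header), and the server branches on it. This is only a sketch; the capability identifier and response fields below are invented for illustration:

```kotlin
// Sketch of capability-based responses instead of API versioning.
// "structured-avatars" and all field names are invented for this example.
data class User(val id: Long, val name: String, val avatarSmallUrl: String, val avatarLargeUrl: String)

fun buildUserResponse(clientCapabilities: Set<String>, user: User): Map<String, Any?> {
    val response = mutableMapOf<String, Any?>("id" to user.id, "name" to user.name)
    if ("structured-avatars" in clientCapabilities) {
        // Newer clients understand a structured avatar object.
        response["avatar"] = mapOf("small" to user.avatarSmallUrl, "large" to user.avatarLargeUrl)
    }
    else {
        // Compensate for the missing capability with the old flat field.
        response["avatarUrl"] = user.avatarLargeUrl
    }
    return response
}
```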
Take Twitter for example. There are complex rules involved in calculating how long a tweet actually is
This is more of an exception to the rule; usually you just have an input with a max length.
It's a more extreme one, yes, but there are many more. In some cases the values may depend on the client, on the database or on other external or complex information. My suggestion is, instead of having part of the validation in the parsing or data layer and part of it in the business logic layer, to have it all in one place so there is no guessing at runtime about what data has been validated and what hasn't. Since not all business logic can live in the data/parsing layer (esp. not in micro-architecture scenarios) it makes more sense to have as much as possible close to the business logic. This also reduces the potential for error because data will be validated very close to its actual use rather than once and then travel through the system until the consumer can no longer be certain that it was validated as expected.
Each client tells in the request what API capabilities they support and the API server will adjust their functionality and response format accordingly to consider and/or compensate for missing capabilities.
Does this technique have a name, or can you maybe provide links to some articles?
it makes more sense to have as much as possible close to the business logic
Yes, but in this case the validation is not as declarative. Also, if it is not done with annotations it is harder to produce the documentation, because with annotations you can generate it.
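For example, with JSR 303 (Bean Validation) annotations the constraints become part of the declaration and a documentation generator can read them directly (the class below is only an illustration using the standard javax.validation.constraints annotations):

```kotlin
import javax.validation.constraints.Email
import javax.validation.constraints.NotNull
import javax.validation.constraints.Size

// Constraints are declared right on the properties; a documentation generator
// can pick them up without having to understand any init-block logic.
data class Data(
    @field:Size(min = 2) val name: String,
    @field:NotNull val age: Int?,
    @field:Email val email: String
)
```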
Each client tells in the request what API capabilities they support and the API server will adjust their functionality and response format accordingly to consider and/or compensate for missing capabilities.
Does this technique have a name, or can you maybe provide links to some articles?
I've done that without checking if anyone else is doing that already, so if there is a name then I don't know it, nor do I know any articles. It was working very well though, so maybe I should write an article about it :)
it makes more sense to have as much as possible close to the business logic
Yes, but in this case the validation is not as declarative. Also, if it is not done with annotations it is harder to produce the documentation, because with annotations you can generate it.
I agree that documentation-wise it would be a lot simpler. If annotation-based validation works for you then that is totally fine to use. I'd do it directly in the data model rather than in the parsing layer though. The codecs used for parsing would then simply call the generated constructors or factory methods, which in turn implement the validation. Wouldn't that work for you?
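To make that concrete, here is a simplified sketch of the split; the factory method is hand-written instead of generated, the validation rules are invented, and the decode function only stands in for a codec rather than showing the actual JSONCodec API:

```kotlin
// The model owns its validation via a factory method and collects all
// violations before failing, so the caller can report them at once.
class Data private constructor(val name: String, val age: Int, val email: String) {

    companion object {

        fun of(name: String, age: Int, email: String): Data {
            val errors = mutableListOf<String>()
            if (name.length < 2) errors += "name: size must be at least 2"
            if (age < 0) errors += "age: must not be negative"
            if ("@" !in email) errors += "email: must be a valid e-mail address"
            require(errors.isEmpty()) { errors.joinToString("; ") }
            return Data(name, age, email)
        }
    }
}

// A codec (represented here by a plain function, not the library's API) stays
// a thin mapping layer and simply delegates to the validating factory:
fun decodeData(json: Map<String, Any?>) = Data.of(
    name = json["name"] as String,
    age = (json["age"] as Number).toInt(),
    email = json["email"] as String
)
```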
I've done that without checking if anyone else is doing that already, so if there is a name then I don't know it, nor do I know any articles.
Is it something like https://www.youtube.com/watch?v=M2KCu0Oq3JE ?
It was working very well though, so maybe I should write an article about it
That would be great!
I've done that without checking if anyone else is doing that already, so if there is a name then I don't know it, nor do I know any articles.
Is it something like https://www.youtube.com/watch?v=M2KCu0Oq3JE ?
Exactly like that! Very interesting video with good ideas on pushing this forward. Thank you for sharing :)
Usually on deserialization we want to collect all validation errors and return them in the response to the client. In order to catch all errors, the validation should be done before object creation.
If it is done after object creation, in the init block, then the validation will be split into two phases. The first phase validates non-nullable fields, because we cannot create an object if a non-nullable property is missing. The second phase does the structural validation: min/max values, length, regexp pattern, etc.

Example:

class Data(val name: String, val age: Int, val email: String)

When we send a JSON like {"email": "xyz@xyz.com", "name": "A"}, we should get a response indicating that age cannot be null and that the size of name must be greater than 2, i.e. all the errors at once.

Also, structural validation would be great to do based on annotations and not in the init block, something like JSR 303. This would be useful for documentation generation: it is far easier to generate documentation based on annotations.

How I see the flow: