Allow documentation for individual case class properties

leonardehrenfried commented 5 years ago

Right now it is not possible to add descriptions to individual fields of JSON body responses. To us that would be highly desirable as it would allow us to explain to the reader the meaning of the field and not just the OpenAPI type and format.

If you are not sure what I'm talking about, here is a screenshot of a Redoc-powered rendering of the OpenAPI YAML.

Screenshot from 2019-10-04 12-04-00

The red boxes show where the property's description would be rendered.

I can imagine a couple of places were this information could be retrieved from:

scaladoc @param tags (probably not available but would nevertheless be great)
a custom annotation
existing Schema implicit instance, but would have to add a data structure to put the information in
another optional implicit

It's quite likely that if an implementation strategy is agreed, I can contribute this.

joroKr21 commented 5 years ago

I think an optional format field in Schema is needed anyway, because sometimes it's tied to the type (e.g. I may have a Country type which is a string, but has to be an ISO 2-letter country code, and I would like this to be specified in the implicit SchemaFor[Country]).

Then you could add a custom @format annotation that can be picked up by magnolia derivation to add this optional format to the field schema. This would be more useful in cases where the fields have primitive types, but within the context of a case class have a more specific meaning.

lukoyanov commented 5 years ago

I'd love to see this feature implemented too.

guersam commented 5 years ago

I want a description field as well as format.

Regarding description, I was trying something similar in endpoints with annotation and shapeless, but hasn't been able to finish it.

jan0sch commented 5 years ago

It would also make sense to support several concrete types more like uuid for example.

Furthermore it would be cool to support extracting information from possibly used refined types to fill the format field. This could be done by using an implicit Witness like in the following code: https://github.com/fthomas/refined/blob/9f43b59f530cd35501ec131b6d78adf45349454c/modules/core/shared/src/main/scala/eu/timepit/refined/string.scala#L148-L154

lukoyanov commented 5 years ago

@leonardehrenfried did you start implementing this (it would greatly improve documentation quality for frontend devs)? @adamw would you please advice on the implementation strategy mentioned in the first message.

leonardehrenfried commented 5 years ago

I have not.

As there are many ways in which this could be implemented, with a wide variety of compatibility and complexity concerns, I'd like to get a rough guide from the maintainers before I start an implementation.

@lukoyanov I guess you can give this a go and see what the response to your suggestion is. A PR often stimulates the decision making progress.

adamw commented 5 years ago

I don't have a good idea how to implement this yet.

I'm sure it would bring in larger changes to the current Schema/SchemaFor structure. As it is now, a Schema value isn't in any way connected to the type it describes. Once we start adding meta-data, such as description, this becomes type-specific. So we would probably need to remove SchemaFor, and instead have a Schema[T] which describes the schema for the object. Then, a SString[String] could be distinct from SString[String @@ Name], and we could have multiple SString instances with different descriptions for different types.

(we could add meta-data to SchemaFor[T], but this would mean that this would stop being just a "container for a schema", and would become something more).

Going even further, we could experiment with enriching schemas even more, so that we are able to generate whole codecs out of them. But that's more of a research topic :)

adamw commented 5 years ago

Of course, would be great to hear other ideas! In the form of comments or PRs, though I can't promise they'll get merged ;)

guersam commented 5 years ago

Caliban utilizes magnolia for schema derivation with support for class/field description via custom annotations:

https://github.com/ghostdogpr/caliban/blob/v0.1.0/core/src/main/scala/caliban/schema/Schema.scala#L382-L392

adamw commented 5 years ago

Here's my proposition. First, a definition:

Schema[T] describes the shape of the low-level, "raw" representation of type T

A schema is one of the following: a basic schema (string, int, number, boolean, date, ...), an array, binary, object (product / coproduct / open product) or a reference to an object (for recursive schemas). The representation isn't far away from what we have today, with the main difference being that we get rid of SchemaFor and parametrise Schema instead with the target T type.

Apart from the information that schema carries with its type, it contains:

a description
a format, giving more low-level details about the representation of the type

So we would have, for example, a case class SString[T](description: Option[String], format: Option[String]), with a default implicit instance implicit val stringSchema: Schema[String] = SString[String](None, None).

In case of custom wrapper types, e.g. case class Wrapper(s: String), you would have to define implicit val wrapperScheam: Schema[Wrapper] = SString[Wrapper](None, None).

Customising schemas

To customise schemas, all of the built-in implicits (including the ones generated by magnolia) would be wrapped in a Derived[T](v: T) class, with a low-priority implicit conversion from Dervied[T] to T (I think there's a similar trick with Exported in cats).

So to customise a schema for a given type, you could do:

implicit val mySchema: Schema[Person] = customise(implicitly[Derived[Person]].v)

While you could add field-level customisations this way, it could be cumbersome. That's why we would also introduce a quicklens/diffx-style API for modifying a nested schema, e.g.:

implicit val personSchnea: Schema[Person] = implicitly[Derived[Schema[Person]]]
    .modify(_.name.first)(_.description("XYZ")) // the second param is a fn Schema => Schema
    .set(_.address.each.zipcode)(zipCodeSchema) // the second param is a Schema

Specifying the path would be a macro, which would translate this into the appropriate operations on the parent Schema. Maybe we could re-use quicklens here somehow?

Relationship with validator

One of the philosophical questions is, should the validator be part of the schema? Should we derive json encoders/decoders from the schema? Both sound tempting.

But I think it's worth keeping the divide. On the "cons" side, we have the duplicate recursive logic of building up validators/schemas/encoders for composite types. On the "pro" side however, we have a clear division of responsibility:

schemas describe the low-level shape of the value; however, they don't specify the final format of the values, and are not concerned with constructing/deconstructing objects from raw value tuples, or decode failures
validators describe the subset of values for potentially higher-level types. They are context-free, that is they only constraint values based on the value itself, without knowing the parent object
encoders/decoders translate the value to the low-level format

Hence, a validator can influence the schema; the high-level description of the possible values can narrow down the format of the low-level representation, for example. However, not the other way round - it's not generally possible to derive a validator from a schema. Let's consider the uuid format and the UUID high-level type. In this case, the format is an intrinsic property of the high-level type, no validation apart from decoding is required.

Why not annotations

Adding @Description or @Format annotations could also be a viable solution, however it has one major drawback: it requires modifying the target datatype. So while this could be an addition to the mechanism described above, it's not a real alternative; we would need a way to change the schemas "by hand" anyway, to be able to describe existing datatypes, which we cannot annotate.

adamw commented 5 years ago

Before we start implementing the above, would be great to hear if you see any show-stoppers in the above!

guersam commented 5 years ago

I usually like annotations to add a description to each product member for ergonomics, but your proposal sounds great to provide a solid, composable ground first.

+1 on the removal of SchemaFor that has been confusing.
+1 on the separation of schema and validator. It won't be too late to merge them after we get more real use cases.
The suggested API to modify nested schema for field-level customization looks fine at first glance.

jan0sch commented 5 years ago

On a first glance I have no objections. I really like the idea of providing a "customisation api". On the other hand it might be sufficient (and may reduce implementation cost) to just provide documentation/instructions/examples on how to tackle such problems using optics. Currently I'm leveraging monocle for this but it ain't easy.

adamw commented 5 years ago

@jan0sch yes, the path-based api would be a nicer syntax for optics. The problem here is that at the level of schemas we are dealing with string-based representation of the class (field maps: Iterable[(String, Schema)], collections as SArray etc.)

So the path would translate to navigating these structures and updating them appropriately. We can probably also provide a mid-level API as in diffx:https://github.com/softwaremill/diffx/blob/master/core/src/main/scala/com/softwaremill/diffx/Diff.scala#L14

jan0sch commented 5 years ago

It never ceases to amaze me how many really cool projects you softwaremill people have in the wild. :-)

adamw commented 5 years ago

I've started working on this, see https://github.com/softwaremill/tapir/compare/schema2?expand=1

In the end the main abstraction is now Schema[T], which wraps a SchemaType value (which can be a string/array/etc.). The two-level representation is needed as there's no point in duplicating all the metadata in each schema type.

adamw commented 5 years ago

Released in 0.12.0 (without the macro, yet - #295) See: https://tapir-scala.readthedocs.io/en/latest/endpoint/customtypes.html#customising-derived-schemas

softwaremill / tapir