adamw opened 1 year ago
Here are some notes after my initial analysis:
Some of our requirements can be addressed with the `@upickle.implicits.key` annotation. I don't know if we can add annotations using macros; here's a thread where I'm asking for advice to figure this out. In cases where that's the only viable possibility, I've put a 🔑 icon to emphasize this.
- Renaming individual fields: the `@upickle.implicits.key` annotation set on a field.
- Transforming all field names: `objectAttributeKeyReadMap` and `objectAttributeKeyWriteMap` in our custom pickler which extends `AttributeTagged`. This method is recommended for customizing field name transformations like snake_case, but it can also be leveraged for other kinds of transformations; extending `AttributeTagged` should be enough to achieve this.
- Renaming enumeration values: the `@upickle.implicits.key` annotation on the enum can be used to rename the value (yes, it's called "key", but in this case it's used by uPickle to transform values) 🔑. This gives us `{ "customerStatus": "5" }`, but not `{ "customerStatus": 5 }`.
- The discriminator field name: by default it's `$type`, but it can be changed if `tagName` is overridden in a custom pickler, or with `@upickle.implicits.key` on the enum 🔑.
- Apart from `tagName`, the discriminator value can be set by putting `@upickle.implicits.key` on the class 🔑.
- Customizing the representation of tagged objects would require `AttributeTagged.taggedObjectContext` to return a custom `ObjectVisitor` with only some of the logic changed. Sounds like tricky ground.
- Default values: handled by the `CaseClassReader` via `reader.storeDefaults` (example here).
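As a reference point, a minimal sketch of such a custom pickler (upickle 2.x assumed; the snake_case mapping and the `kind` tag name are illustrative choices, not part of the proposal):

```scala
object CustomPickler extends upickle.AttributeTagged:
  private def camelToSnake(s: String): String =
    s.replaceAll("([a-z0-9])([A-Z])", "$1_$2").toLowerCase

  private def snakeToCamel(s: String): String =
    val parts = s.split("_")
    parts.head + parts.tail.map(_.capitalize).mkString

  // applied to every field name when writing / reading JSON
  override def objectAttributeKeyWriteMap(s: CharSequence): CharSequence =
    camelToSnake(s.toString)
  override def objectAttributeKeyReadMap(s: CharSequence): CharSequence =
    snakeToCamel(s.toString)

  // rename the discriminator field from the default "$type"
  override def tagName: String = "kind"
```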
First, a side note - if you're not lucky on the scala-users forum, you can also try the dotty discussions in the metaprogramming section: https://github.com/lampepfl/dotty/discussions/categories/metaprogramming
Second side note: I think a good "terminology fix" might be to use the term enumerations only for "true" enumerations, that is, Scala 3 `enum`s where all cases are parameterless. If the cases have parameters, that's only sugar for a sealed trait.
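To make the distinction concrete (the types are invented for the example):

```scala
// A "true" enumeration - all cases parameterless
enum CustomerStatus:
  case Active, Suspended, Closed

// Cases with parameters - really just sugar for a sealed trait
enum PaymentMethod:
  case Card(number: String)
  case Transfer(account: String)
```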
What is kind of worrying is that some cases can only be handled with 🔑. So either we find a way to add annotations to a type using macros, or ... ? I guess there's no alternative really.
Well, except rewriting the pickler derivation. After reading the upickle code, is that even feasible?
I see - thanks for the explanation regarding enumerations; let's use the terminology you suggested. The discussion board you posted looks promising. I was able to find a fresh thread on refining types, which may be helpful to deal with annotations. Working on this now.
I was thinking about a possible implementation strategy, and here's what I came up with.
The first constraint is that we should honor existing `ReadWriter` instances when they exist - either for the built-in types, or some esoteric ones.

The second constraint is that derivation should follow standard Scala practice, that is, be recursive - so that the derived typeclass for a product/coproduct is created using implicitly available typeclass instances for its children. This rules out `Codec` as the typeclass, as it's not recursive - only the top-level instance for a type is available.
Still, we need to derive both the `ReadWriter` instance and the `Schema` instance. So maybe we should do just that: derive the pair, with an option to convert to a `Codec`. E.g.:
```scala
case class Pickler[T](rw: ReadWriter[T], schema: Schema[T]):
  def toCodec: JsonCodec[T] = ??? // combine the ReadWriter and the Schema

implicit def picklerToCodec[T](implicit p: Pickler[T]): JsonCodec[T] = p.toCodec
```
The `Pickler` name is quite random, but it's the best I came up with so far ;)
Another design decision is what means of configuration to provide for the derived schemas/picklers. We already have two ways of customising schemas: using annotations, and modifying the implicit values. Originally I suggested adding a third one (explicitly providing an override for annotations), but maybe that's not necessary and we can use what's already available.
That is, the implicitly available `Schema` for a type could be used to guide the derivation of the `ReadWriter` - if it's missing. The schema already has all that we need: user-defined field names and default values. Btw., here #2943 would be most useful, to be able to externally provide alternate field names.
This also means that the `Pickler` derivation would have to assume that the schema's structure follows the type's structure (when it's a product/coproduct), and report an error otherwise.
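For example, using tapir's existing schema annotations (the `Person` type is just an illustration), a user-customised schema that the derivation could pick up might look like this:

```scala
import sttp.tapir.Schema
import sttp.tapir.Schema.annotations.encodedName

case class Person(@encodedName("full_name") name: String, age: Int)

// The implicit schema carries the alternate field name, which the
// Pickler derivation could then honour when building the ReadWriter.
given Schema[Person] = Schema.derived[Person]
```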
Now the main complication is implementing `Pickler.derived[T]`. I think it should follow more or less these rules:

- if both a `Schema` and a `ReadWriter` are already implicitly available in the scope, use them to create a `Pickler`
- if the `Schema` is missing, use `SchemaMagnoliaDerivation` to create the new typeclass instance. Side note: we could simply do `Schema.derived[T]`, but that could have negative performance implications, as it would do the nested lookups once again. So it could be slow.
- if the `ReadWriter` is missing (i.e., not implicitly available), create one for a product/coproduct, using what's available in the `Schema`
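A rough sketch of these lookup rules - not tapir's actual code; `deriveSchemaFor` and `readWriterFromSchema` are hypothetical helpers standing in for the Magnolia-based and schema-driven derivation steps:

```scala
import scala.compiletime.summonFrom

inline def derived[T]: Pickler[T] =
  summonFrom {
    case rw: ReadWriter[T] =>
      summonFrom {
        case s: Schema[T] => Pickler(rw, s)                  // both already in scope
        case _            => Pickler(rw, deriveSchemaFor[T]) // schema missing: derive it
      }
    case _ =>
      val s = summonFrom {
        case s: Schema[T] => s
        case _            => deriveSchemaFor[T]
      }
      Pickler(readWriterFromSchema(s), s) // build the ReadWriter from the schema
  }
```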
To support special cases, such as various enumerations or inheritance strategies, we can use a similar approach as currently, that is, provide methods on `Pickler` to create the instances: `Pickler.derivedEnumeration` (similar to the method on `Schema` and `Codec`), `Pickler.oneOfUsingField`, `Pickler.oneOfWrapped` (similar to those on `Schema`).
That way we would use the "standard" Scala way of configuring generic derivation - specifying the non-standard instances by hand - instead of inventing our own.
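Hypothetical usage - the signatures are assumed to mirror the existing `Schema.derivedEnumeration` / `Schema.oneOfWrapped` helpers, and the types are invented for the example:

```scala
enum CustomerStatus:
  case Active, Suspended

given Pickler[CustomerStatus] = Pickler.derivedEnumeration[CustomerStatus]

sealed trait Vehicle
case class Car(seats: Int) extends Vehicle
case class Bike() extends Vehicle

given Pickler[Vehicle] = Pickler.oneOfWrapped[Vehicle]
```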
Using the schema to create the `ReadWriter` instance means that it would be created at run-time - as only then do we have access to the specific `Schema` instance (which might be user-provided and computed arbitrarily). So at compile-time, we would only generate code which would do the necessary lookups / create the computation.
Of course, there might be a hole in the scheme above, and it might soon turn out that it's unimplementable ;) WDYT @kciesielski ?
Leaving some notes after our recent discussion with @adamw:

- Schemas are derived as part of the `Pickler`, and we want to allow deriving picklers without users providing schemas.
- If we allowed constructing a `Pickler[T]` with a user-provided `Schema[T]`, we would break the mechanism of the Pickler creating its own schema out of the child schemas from summoned child picklers. That's why we emit a compilation error when a `Schema` is in scope, but no `Reader`/`Writer`: either both the `Schema` and the `ReadWriter` are provided, or the Pickler takes care of deriving them.
- For per-field customisation, the sketched API looks like this (pseudocode):

```scala
Pickler.derivedCustomise[Person](
  _.age -> List(@EncodedName("x")),
  _.name -> List(@EncodedName("y"), @Default("adam")),
  _.address.street -> ...
)
```
Yes, looks correct :) In the future we might also want to add `Schema.derivedCustomise` for consistency, and maybe deprecate the `.modify` variant of schema customisation then?
Reopening for possible jsoniter work
Currently, to create a json body input/output, both a `Schema` and json-library-specific encoders/decoders are needed. This means that generic derivation is typically done twice (once for the json encoders/decoders, once for the schemas). Moreover, any customisations as to the naming strategy etc. need to be duplicated, often using different APIs, both for the json library and for the schemas.

It would be great to do the configuration and derivation once - but to do that, we would need to provide a module with joint json encoder/decoder + tapir schema derivation. In other words, we would need to write code which derives a `JsonCodec[T]` (this includes the `encode`, `decode` and `schema`).

Doing this for all json libraries would be highly impractical, and a ton of work for which we don't have resources. That's why I'd like to approach this using the json library that will be included in the Scala toolkit - that is, uPickle. uPickle can use a better derivation mechanism anyway (as our blogs have described), so it might be an additional win for our users.
Such a derivation would have to be written using a macro - and as we know, these are different in Scala 2/3. I think we should target Scala 3.
So summing up, the goal of the new module is to:

- derive a `JsonCodec[T]` for supported `T` types

While it might seem that the derivation could be implemented using Magnolia, I think writing a dedicated macro, which could utilize Scala 3's `Mirror`s, would actually be better. First, we would directly generate the code, instead of generating an intermediate representation which is only converted to the final codec at run-time. That's a small performance win. But furthermore, we can provide better, contextual error reporting. And excellent errors are something I'd like to be a priority for this task. I've done some experiments with deriving `Schema` using a macro directly here, but the work there has unfortunately stalled.
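A sketch of the `Mirror`-based approach - not the module's actual code; `fromProduct` and `fromSum` are hypothetical helpers that would assemble the final codec from the per-field/per-subtype codecs:

```scala
import scala.deriving.Mirror
import scala.compiletime.{erasedValue, summonInline}

// Summon a codec for every element type of a product/sum at compile time
inline def summonElemCodecs[Elems <: Tuple]: List[JsonCodec[?]] =
  inline erasedValue[Elems] match
    case _: EmptyTuple     => Nil
    case _: (head *: tail) => summonInline[JsonCodec[head]] :: summonElemCodecs[tail]

inline def deriveCodec[T](using m: Mirror.Of[T]): JsonCodec[T] =
  lazy val elemCodecs = summonElemCodecs[m.MirroredElemTypes]
  inline m match
    case p: Mirror.ProductOf[T] => fromProduct(p, elemCodecs) // hypothetical helper
    case s: Mirror.SumOf[T]     => fromSum(s, elemCodecs)     // hypothetical helper
```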
As for configuring the derivation, we should take into account the following:

- `Schema.annotations` on a per-field/per-type basis - e.g. `@encodedName` should influence both the schema and the generated json encoder/decoder
- a `Configuration` (global field name transformers etc.)
- extensions to `Configuration` to configure inheritance hierarchy serialization. This should include, among others, the wrapped representation (`Schema.oneOfWrapped`).

In the end, the user should get an alternative to the current `import sttp.tapir.json.upickle.*` + optional imports for auto-deriving uPickle's `Reader`/`Writer` & tapir's `Schema`; the alternative would define `jsonBody` etc. as the integration does today, plus the macro to derive the `JsonCodec`.
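Hypothetically, the end result could look as follows - the package and method names are assumed, to show the intended single-derivation experience:

```scala
import sttp.tapir.*
import sttp.tapir.json.pickler.*

case class Person(name: String, age: Int)
given Pickler[Person] = Pickler.derived

// jsonBody resolves the JsonCodec from the Pickler - one derivation,
// one place for customisations:
val body = jsonBody[Person]
```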
Summing up, the top-level requirements for the macro are: