sirthias / borer

Efficient CBOR and JSON (de)serialization in Scala
https://sirthias.github.io/borer/
Mozilla Public License 2.0
225 stars 16 forks source link

Add fully automatic codec derivation #36

Closed sirthias closed 4 years ago

sirthias commented 5 years ago

Currently borer only implements "semi-automatic" codec derivation, which means that it can construct codecs for case classes and ADT super-types with a single deriveCodec line, but requires that codecs for all members (case classes) or sub-types (ADTs) are already implicitly available.

Fully automatic derivation means, that a single deriveCodec[T] will suffice to have borer construct a codec for T and all member-/sub-types.

Among other things this requires

plokhotnyuk commented 5 years ago

Additional points would be:

mushtaq commented 4 years ago

Now that we have fully adopted borer everywhere (akka-remoting, http, redis), we are trying to evangelise it to other teams.

Upvoting this issue once again because I feel that fixing this will really help me in making that pitch.

Thanks for a wonderful library @sirthias !

sirthias commented 4 years ago

Great to hear that borer works nicely for you, @mushtaq! I'll see what I can do wrt to this ticket but I'd rather not hold my breath in anticipation of a quick fix. Fully automatic derivation really is full of traps...

sirthias commented 4 years ago

I had an idea of how fully automatic derivation could be implemented without a lot of macro complexity and it turns out that it's indeed possible, without sacrificing flexibility, derivation or runtime performance or maintainability.

1bdfdfa02c5b7757e85bffe554d4a400b4c2a520 does it. I'll add another commit with the documentation and then release the whole thing with 1.3.0 as a christmas present... :)

mushtaq commented 4 years ago

@sirthias what a great news! Can't wait to try it out. I have comment on this test:

"simple" - {
    implicit val codec = ArrayBasedCodecs.deriveAllCodecs[AnimalX]
    roundTrip("""["Labrador","Lolle"]""", Labrador("Lolle"): AnimalX)
    roundTrip("""["TheLastTiger",[]]""", TheLastTiger: AnimalX)
}

To avoid type ascription (Labrador("Lolle"): AnimalX), I almost always end up doing this:

"simple" - {
    val codecValue = ArrayBasedCodecs.deriveAllCodecs[AnimalX]
    implicit val codec[T <: AnimalX]: Codec[T] = codecValue.asInstanceOf[Codec[T]]
    roundTrip("""["Labrador","Lolle"]""", Labrador("Lolle"))
    roundTrip("""["TheLastTiger",[]]""", TheLastTiger)
}

Is there a way, it can be addressed directly while calling deriveAllCodecs?

mushtaq commented 4 years ago

Also, I notice this comment in MapBasedCodecs.scala:

NOTE: If `T` is unary (i.e. only has a single member) then the member value is written in an unwrapped form,
    * i.e. without the array container.
def deriveEncoder[T]: Encoder[T] = macro Macros.encoder[T]

If I understand this right, if an Adt has unary case classes, they will get the equivalent of ArrayBasedCodecs.deriveUnaryCodec even for the MapBasedCodecs which is a fantastic default!

sirthias commented 4 years ago

The Codec typeclass (just like Encoder and Decoder) is intentionally invariant. This is because, in most cases, you actually want an Animal to be encoded differently than a Tiger for very good reasons. The encoding of an Animal needs to transport the information what type of animal the encoding represents (the typeId) whereas the encoding of a Tiger doesn't, because we already know that it's a Tiger and the Tiger type is concrete and doesn't have several possible sub-types.

So, it really makes a difference whether you call

Cbor.encode(tiger).toByteArray

or

Cbor.encode(tiger: Animal).toByteArray

Therefore I don't think it makes sense for borer to contain any further conversion logic. What you do with the codecs and whether you really want to cast things the way you show is your responsibility. I personally don't think this implicit is a good idea.

I'd rather do something like this:

def serializeAnimal(animal: Animal) = Cbor.encode(animal).toByteArray

serializeAnimal(tiger)

No casts required.

sirthias commented 4 years ago

Also, I notice this comment in MapBasedCodecs.scala...

Ah, thank you for spotting this! This comment is actually incorrect and mere copy-pasta from the ArrayBasedCodecs. I'll get this fixed.

The MapBasedCodecs do not have special handling for unary case classes for the simple fact that that is already provided via the ArrayBasedCodecs. So, if you want a case class to be encoded without wrapping you can use ArrayBasedCodecs.deriveUnaryCodec even if all its sibling types are encoded based on maps.

However, what makes sense is to add MapBasedCodecs.deriveNonUnaryXXX macros that complement the ArrayBasedCodecs.deriveUnaryXXX ones. This way you can mix the two and be notified by the compiler if the type arity changed and thus a different encoding strategie needs to be used. I've just added #130 to track this...