spotify / elitzur

Apache License 2.0
22 stars 20 forks source link

Failing AvroConverter conversion for optional-wrapped validation type #75

Closed kellen closed 4 years ago

kellen commented 4 years ago

Upgrading to scio 0.95 and running into a previously-worked-around AvroConverter issue.

I have a validation type:

  case class Country(data: String) extends BaseValidationType[String] {
    override def checkValid: Boolean = ...
  }

And a case class that uses it:

case class MyClass(
  ...
  country: Option[Country],
  ...
)

And an attempt to convert:

implicitly[AvroConverter[MyClass]].toAvro(a, AvroMyClass.getClassSchema).asInstanceOf[AvroMyClass]

To this schema:


{
    {
      "name": "country",
      "type": [
        "null",
        "string"
      ],
      "default": null
    },

And deep in the internals of DerivedConverter I’m seeing Country being evaluated against a schema of "string", which should work but which instead is passed to GenericRecordBuilder and results in:

[info]   org.apache.avro.AvroRuntimeException: Not a record: "string"
[info]   at org.apache.avro.Schema.getFields(Schema.java:220)
[info]   at org.apache.avro.data.RecordBuilderBase.<init>(RecordBuilderBase.java:54)
[info]   at org.apache.avro.generic.GenericRecordBuilder.<init>(GenericRecordBuilder.java:38)
[info]   at com.spotify.elitzur.converters.avro.DerivedConverter.toAvro(AvroConverter.scala:319)
[info]   at com.spotify.elitzur.converters.avro.AvroOptionConverter.toAvro(AvroConverter.scala:138)
[info]   at com.spotify.elitzur.converters.avro.AvroOptionConverter.toAvro(AvroConverter.scala:125)
[info]   at com.spotify.elitzur.converters.avro.DerivedConverter.toAvro(AvroConverter.scala:333)

I had previously worked around this by round-tripping to bytes which no longer works:

    def toAvro[To <: SpecificRecordBase](toSchema: Schema)(implicit ac: AvroConverter[From]): To = {
      val decoder = new BinaryMessageDecoder[To](new SpecificData(), toSchema)

      val rec: GenericData.Record = ac.toAvro(instance, toSchema).asInstanceOf[GenericData.Record]
      val bytes: ByteBuffer = new BinaryMessageEncoder[GenericData.Record](new GenericData(), toSchema).encode(rec)
      val ret: To = decoder.decode(bytes)
      ret
    }

My suspicion is that the Option wrapper is breaking an otherwise working path

anne-decusatis commented 4 years ago

Idrees looked into this a little and found that this was an implicit resolution issue where the Option converter was being bypassed entirely due to the location of the validation type definition.

There is a workaround, I confirmed that defining the implicit manually like this worked for Kellen:

implicit val simpleConverter = new AvroSimpleTypeConverter[Country, String] (this is private scoped to com.spotify.elitzur, so you'll have to use that package; once you define this you also have to exclude the default implicit simpleConverter for BaseValidationTypes from the avro._ import)

I am going to continue to leave this open for now to try and figure out (1) a way of changing packages/imports/etc so the implicits resolve in the correct order in this case or at least (2) a way of simplifying the workaround

anne-decusatis commented 4 years ago

I looked into this more with Kellen. In that pipeline, the error was conflated with an issue where a companion object needed to be defined for the BaseValidationType with SimpleCompanionImplicit in order for avro conversions to be correctly generated. Closing this issue as it's now been resolved in Kellen's pipeline. Feel free to open a new issue if you see something that confuses you!