avro-kotlin / avro4k

Avro format support for Kotlin
Apache License 2.0

Support For JSON type for BigQuery #216

Open NZNathan opened 1 month ago

NZNathan commented 1 month ago

I am generating an Avro file to upload data to Google's BigQuery, and would love to have support for JSON types.

To get BigQuery to recognize that a value in the Avro file is JSON, it requires the "sqlType": "JSON" annotation inside the type definition in the Avro schema, as below:

{ "type": {"type": "string", "sqlType": "JSON"}, "name": "json_field" }

I can't seem to do this currently. Please let me know whether this is already possible, or whether it could be added in a future enhancement.

Chuckame commented 1 month ago

Assuming this is the overall schema rather than the schema of a data class field, your string value should be wrapped in a value class whose field is annotated with AvroProp("sqlType", "JSON"). For that, you need to wait for value class and primitive support in avro4k v2, which should be released in the next few days.

Chuckame commented 1 month ago

But if it is the schema of the field named json_field, then just annotate the string field with AvroProp("sqlType", "JSON").

NZNathan commented 1 month ago

Hey, thanks for the response. I've tried using AvroProp, but it just adds the property at the field level, like so:

{
  "name" : "json_field",
  "type" : "string",
  "sqlType" : "JSON"
}

But it needs to be part of the existing type property, as it appears in my original post, so AvroProp won't work.

I feel like it will require a new schema type or perhaps a new annotation to be able to adjust the type property?

For some more context, I'm just using a data class with a string field in it. The field json_field is populated from another data class that I serialize to JSON.

@Serializable
data class DetailRecord(
    val json_field: String
) { ... }

Chuckame commented 1 month ago

Oh, I misread: you need this property inside the type, not on the field. That's not possible at the moment, only in v2, whose RC was released yesterday!

With v1, the only way is to create the schema yourself using Schema.create (which may be enough for you) or SchemaBuilder.
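
As a rough illustration of the manual approach, here is a minimal sketch that uses the Apache Avro Java API directly from Kotlin. It assumes org.apache.avro:avro is on the classpath; the function name buildDetailRecordSchema is hypothetical, and the record and field names are taken from the data class above. How the resulting schema is then handed to avro4k is not covered here.

```kotlin
import org.apache.avro.Schema
import org.apache.avro.SchemaBuilder

// Hypothetical helper: builds the record schema by hand so that the custom
// property ends up on the string type itself rather than on the field.
fun buildDetailRecordSchema(): Schema {
    // Plain string schema carrying the property BigQuery looks for.
    val jsonString = Schema.create(Schema.Type.STRING)
    jsonString.addProp("sqlType", "JSON")

    // Record with a single field whose type is the annotated string schema.
    return SchemaBuilder.record("DetailRecord")
        .fields()
        .name("json_field").type(jsonString).noDefault()
        .endRecord()
}

fun main() {
    // Pretty-print the schema to check where "sqlType" appears.
    println(buildDetailRecordSchema().toString(true))
}
```

In the printed schema, the type of json_field should be an object that contains both "type": "string" and "sqlType": "JSON", which is the shape from the original post.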

With v2, I first suggested wrapping the string field in a value class and annotating the value class's field with AvroProp, expecting the prop to be applied to the type. Correction: that is not possible with v2 yet, so for the moment do the same as with v1. I'm working on something to copy the AvroProp annotations from the value class to the underlying type.

Chuckame commented 1 month ago

Once the next RC (2.0.0-RC3) is released, you should be able to add your custom property this way:

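import com.github.avrokotlin.avro4k.AvroProp
import kotlinx.serialization.Serializable

// @JvmInline is required for value classes when targeting the JVM.
@JvmInline
@Serializable
value class BigQueryJson(@AvroProp("sqlType", "JSON") val value: String)

@Serializable
data class DetailRecord(
    val json_field: BigQueryJson
)
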
NZNathan commented 1 month ago

Sweet, thanks! I'll try that out once 2.0.0-RC3 is released.

Chuckame commented 3 weeks ago

v2.0.0-RC3 has been released. Let me know if it works for you!

Chuckame commented 2 weeks ago

Hello @NZNathan, does it work for you?

NZNathan commented 1 week ago

Hey @Chuckame, sorry, I'm currently out of the country and won't get a chance to check until July. I'll let you know once I'm back!