Closed potfur closed 10 months ago
Hey @thake & @sksamuel 👋
Is there any chance this could be reviewed/approved at some point?
Hey, I'll take a look
Thanks @Chuckame, much appreciated!
@potfur, thank you very much for the PR and sorry for the delay.
Wouldn't it be enough to just change @AvroProp
to be @Repeatable
? This would raise the minimal kotlin level to 1.6 which is acceptable. Code:
@SerialInfo
@Target(AnnotationTarget.PROPERTY, AnnotationTarget.CLASS)
@Repeatable
annotation class AvroProp(val key: String, val value: String)
//Example usage
@Serializable
@AvroProp("cold", "play")
@AvroProp("some", "thing")
data class TypeAnnotated(val str: String)
Another option which does not require kotlin 1.6 is to define the new annotation @AvroProp
like this:
@SerialInfo
@Target(AnnotationTarget.PROPERTY, AnnotationTarget.CLASS)
annotation class AvroProps(vararg val props: AvroProp)
//Example usage
@Serializable
@AvroProps(
AvroProp("cold", "play"),
AvroProp("some", "thing")
)
data class TypeAnnotated(val str: String)
I'm in favor of the first solution as this is a classic example of why repeatable annotations have been introduced. But I would like to hear your opinion on it.
The problem that I'm trying to solve is to be able to define
"tags": ["val1", "val2"]
From what I know, repeating annotation will result in something like:
"tag1": "val1"
"tag2": "val2"
So, it's about being able to have list of values for key instead multiple keys with single value.
With the comment of @thake it would result to something like:
@AvroProp("cold", "play")
@AvroProp("cold", "some")
@AvroProp("cold", "thing")
data class TypeAnnotated(val str: String)
Personally I don't really like it as it feels really verbose.
Or, we could take into the middle: Just modify AvroProp
to accept multiple values using vararg or array. (breaking change, but cleaner)
@potfur, thanks for fixing my confusion ;)
I'm totally with @Chuckame, the breaking change seems to be legit. @potfur can you also add the repeatable annotation?
@SerialInfo
@Target(AnnotationTarget.PROPERTY, AnnotationTarget.CLASS)
@Repeatable
annotation class AvroProp(val key: String, vararg val values: String)
If breaking change is way to go - even better.
But an near invisible breaking change since it's not changing the type but only the name (I think it's not common to call this annotation with class' property names 🤞 ) so users will continue to call like this AvroProp("key", "value")
.
But let's make a major increase when releasing this PR.
Oh we missed something important @thake , that prevent us to have like a common AvroProp
with single value and list value: We can have additional props like this "key": "value"
or "key": ["value"]
. How we can differentiate list and non-list added props ?
So new idea, we keep AvroProp
having a single value (no vararg so it remains untouched) and another AvroListProp
(same as the previously AvroProps
) :
annotation class AvroListProp(val key: String, values: Array<String>)
Something like this (below) will work nicely, but has it's own limitations.
@SerialInfo
@Target(AnnotationTarget.PROPERTY, AnnotationTarget.CLASS)
annotation class AvroProp(val key: String, val value: String = "", val values: Array<String> = [])
Mainly non-blank value
must take precendence over values
, otherwise it will end up in situation where same prop is declared twice and AvroRuntimeException: Can't overwrite property: cold
is being thrown.
Also is slightly error prone, as the annotation allows for putting both value
and values
at once without any complaint.
So maybe, let's go back to my initial proposal with minor improvements.
@SerialInfo
@Target(AnnotationTarget.PROPERTY, AnnotationTarget.CLASS)
annotation class AvroProp(val key: String, val value: String)
@SerialInfo
@Target(AnnotationTarget.PROPERTY, AnnotationTarget.CLASS)
annotation class AvroProps(val key: String, val values: Array<String>)
As for AvroListProp
vs AvroProps
, I think AvroProps
is consistent with already existing annotations - AvroAlias
& AvroAliases
Let's go to your initial proposal yes. Just the name, I would prefer AvroListProp
against AvroProps
, since AvroProps
is ambiguous, like "is it a list of props". Aliases
is different where we should fix it and remove it prior to a AvroAlias
with a vararg value. I'll create an issue to track it !
I'll have a look into it tomorrow.
@thake it has been changed to AvroListProp
👌
@thake, @potfur thanks for your patience. I rechecked the avro spec on this topic and while the current proposal would solve the discrepancy for array values, the specification explicitly allows additional properties of any type (i.e., boolean, number, object, array, string, null). Maybe we should consider this for this PR.
We have a similar situation with the @AvroDefault
annotation. Maybe we can apply the same solution so that the API feels coherent.
Proposal:
@SerialInfo
@Target(AnnotationTarget.PROPERTY, AnnotationTarget.CLASS)
@Repeatable
annotation class AvroJsonProp(
val key: String,
val jsonValue: String)
//Usage
data class TypeAnnotated(
@AvroJsonProp("array", """["play","blub"]""")
@AvroJsonProp("obj", """{"javascript":"object"}""")
val str: String
)
@thake I'm not really a fan of this, since json in strings are source of errors and ambiguous, and regarding performances it requires parsing the value each time with ser/deser. We will have to make a cache etc...
Why not just having a supplier class, to have a typesafe implementation ? See the following :
annotation class AvroGenericProp(
val key: String,
val valueSupplier: KClass<out AvroPropSupplier<*>>,
)
interface AvroPropSupplier<T> {
val serializer: SerializationStrategy<T>
fun getValue(): T
}
//usage
data class TypeAnnotated(
@AvroGenericProp("array", CustomPropSupplier::class)
val str: String
)
@Serializable
data class CustomObject(val field1: String)
object CustomPropSupplier : AvroPropSupplier<CustomObject> {
override val serializer = CustomObject.serializer()
override fun getValue() = CustomObject("value")
}
That way we are sure that we can put whatever we want as inside the field value, it can be static or we can pass some context to the getValue (like the annotated field's value, the parent class, etc...). And it's not so complex to implement, what do you think ?
I followed @thake suggestion, as it is more as Avro actually suggests with handling this case, and it also brings bit more flexibility.
About @Repeatable
and type safety - I think for those two separate PR would be better, as those would affect also other annotations - just to be consistent, otherwise it would be weird.
I don't understand, avro only suggest to handle all the types in additional props. Why we continue using json while we have pure kotlin to write objects ? 🤔
@Chuckame I get your point, and for sure it would be better to use proper types.
On the other hand, the AvroDefault
annotation already uses JSON strings, so AvroJsonProp
is aligned and consistent what's already there (I like being consistent, even when it is suboptimal - gives some predictability and less cognitive load).
IMHO, doing improvements on types, and adding Repeatable
, would be separate work - especially that forcing types would be a breaking change. So far, all is still backwards compatible - introduces new capability while preserving present.
I have exactly the same concern about AvroDefault
btw 🥲
Let's do AvroJsonProp, but it won't be that performant.
I'll create an issue to discuss about "type-safe" AvroDefault, AvroJsonProp and other jsons annotations.
Given the backward-compatible nature of these changes, could we get a new release out once @thake has approved this PR?
I have exactly the same concern about
AvroDefault
btw 🥲Let's do AvroJsonProp, but it won't be that performant.
I'll create an issue to discuss about "type-safe" AvroDefault, AvroJsonProp and other jsons annotations.
It is a kind of trade-off between "ease of use" in combination with pre-existing schemas and type safety. Using JSON in the annotations makes it easy to model your kotlin code so that it matches an existing schema; you just copy and paste the value from the asvc schema file.
Performance-wise, it shouldn't impact serialization and deserialization. The annotations' JSON is only interpreted whenever a schema is derived from the kotlin class. The schema is usually only derived once for every serialized class (responsibility of the user) and not on every serialization/deserialization.
@thake Since we'll have the @Language
annotation to validate the content, it's an acceptable trade-off 🚀
For the performances, If we use the Avro
method toRecord(serializer, obj): GenericRecord
then we regenerate the schema each time we serialize. Maybe we could add some doc to advice the use of toRecord(serializer, schema, obj)
in an intensive serialization of the same schema.
FYI, I'm playing with @Repeatable
with kotlinx serialization, and it's only taking the latest annotated value, even if it should be already fixed. See https://github.com/Kotlin/kotlinx.serialization/issues/2099, it works only with kotlin 1.9 I don't know why
@Chuckame / @thake Alas @Language("json")
applied to @AvroJsonProp
appears to only take effect in IntelliJ IDEA within the avro-kotlin/avro4k
project itself, not when it's consumed from another project.
There are a number of open issues such as KTIJ-16874 and IDEA-173477 related to the processing of @Language
as a meta-annotation. You may also want to vote them up.
@franckrasolo as I can read, both are related not to our case: we are directly using this language annotation on the value parameter. Those issues are related to meta annotations, implying that annotations (like @Language
) are annotating an annotation A
, where this annotation A
is used on a parameter, so all the annotations on A
will be applied to the parameter.
Also tested AvroJsonProp
on my computer and the json language injection is not working. After a quick check, tldr: replace @Language("json")
by @Language("JSON")
.
Longer story: language injection is done by intellij, and we can find all the languages in Parameters > Editor > Language injections
. We should set a value found inside the Language
column, where we can only find JSON
and not json
, maybe it's case sensitive! I'll do the pr
Yes, not an exact match, but the number of issues makes me wonder how stable the processing of @Language
is across a wide range of use cases: host languages x injected languages x IDE flavours, and so forth.
Avro AVSC allows for custom annotations as shown here.
Same functionality can be partially achieved by using
@AvroProp
annotation, but only for single value annotations.For cases - like the linked issue, where a list of values is needed, existing annotation is not enough.
Therefore I'm suggesting adding
@AvropProps("name", "valueA", "valueB", "etc")