twitter / finatra

Fast, testable, Scala services built on TwitterServer and Finagle
https://twitter.github.io/finatra/
Apache License 2.0
2.27k stars 405 forks source link

finatra-jackson: add @NullValueAllowed annotation and functionality #562

Closed cerveada closed 3 years ago

cerveada commented 3 years ago

Problem

In some cases null is a valida value for a field. Currently there is no way how to allow nulls just for some specific field.

Solution

Add field annotation and accept json null for all annotated fields.

CLAassistant commented 3 years ago

CLA assistant check
All committers have signed the CLA.

cacoco commented 3 years ago

Hi @cerveada thanks for the PR! Can you expound on the cases where null would be an acceptable value for a JSON field? If I understand from extrapolating, the idea is that the field is required (so use of an Option is not desired) but the value of a null type is allowable? I'm having a hard time understanding cases where this is desirable vs the risk for NPEs it introduces.

In general, in Scala nulls are a bit of a persona non grata and I don't think we want to break with being idiomatic about nulls here. A little more context may help us understand. Thanks!

cerveada commented 3 years ago

Hello, You understand it correctly. In our case we are storing a Spark job lineage including constants used in spark expressions. Such a constant can be a null. So for us the null is just another value. Using Option would make the field optional, which is not the case. We want to know what value was used.

Generally we avoid nulls as well, but sometimes the data is coming from somewhere out of your control. That's why I think the annotation solves this well. The default behaviour is unchanged and in rare cases like this the annotation can be used.

cacoco commented 3 years ago

@cerveada I see. I'm curious why a custom deserializer would not work here? That is create a custom "NullableDeserializer" and then annotate the field with @JsonDeserialize(using = classOf[NullableDeserializer])

Or register it for a specific type if that is desired: https://twitter.github.io/finatra/user-guide/json/index.html#adding-a-custom-serializer-or-deserializer

cerveada commented 3 years ago

If you look at CaseClassField's parse method, there is a check for null. The desserializer is not even called in case of null, instead defaultValueOrException is called and that throws the exception.

Unless I am missing something in the code, custom deserializer will not help here.

And using default value would make the json field optional.

cacoco commented 3 years ago

@cerveada if you use a custom deserializer then the Finatra CaseClassDeserializer should not get invoked for the type and the CaseClassField code isn't then hit -- at least that is the intention. If there is a bug here it'd be great to address that.

cacoco commented 3 years ago

@cerveada looking at it, I think it just means we need to resolve any deserializer before attempting to assert the JSON value is not null.

cerveada commented 3 years ago

Well I could use a custom deserializer for the whole case class, but that would mean implementing again all the null checking logic and more.

Using custom deserializer for just the caseclasses' field seems to be not possible.

looking at it, I think it just means we need to resolve any deserializer before attempting to assert the JSON value is not null.

Will it help though? The value will be null before and also after deserialization so the check will fail no matter when that happens.

I think the root of the issue is that the null check is done in the case class deserializer and not in the field deserializer. As a result this makes it impossible to override the null behaviour by setting your own field deserializer.

cacoco commented 3 years ago

I think the root of the issue is that the null check is done in the case class deserializer and not in the field deserializer. As a result this makes it impossible to override the null behaviour by setting your own field deserializer.

Right. What I'm pointing out is the more correct behavior should be that we do not assert anything on the field before resolving and then using any custom deserializer. The point of specifying a custom deserializer is to opt-out of the functionality of the current deserializer. And yes, it's a little bit of work to move resolution of the deserializer a little earlier but I think that is more correct a bit more palatable than trying to add any explicit null support into the framework. It should be such that you can opt into how you want to deserialize a field and the CaseClassDeserializer should not necessarily interfere with that, even if it means you want to create a deserializer that explicitly accepts a null value for a field. It doesn't mean you need to create a whole new deserializer for the class, just the constant field(s) you mentioned that you cared about being null. You would annotate these case class fields with @JsonDeserialize specifying your custom deserializer which allowed for a null to be used as the value. The CaseClassDeserializer would be patched to properly defer validation to either the resolved custom deserializer or until after resolving no custom deserializer exists.

I've filed a ticket for us to take a look at this internally. Thanks.

cacoco commented 3 years ago

@cerveada I have a patch which should allow for you to do something like this:

public enum TheEnum {
    ONE,
    TWO;
}

class NullableCarMakeDeserializer extends StdDeserializer[TheEnum](classOf[TheEnum]) {
  override def deserialize(jp: JsonParser, ctxt: DeserializationContext): TheEnum = {
    val jsonNode: ValueNode = jp.getCodec.readTree(jp)
    if (jsonNode.isNull) {
      null
    } else {
      TheEnum.valueOf(jsonNode.asText())
    }
  }
}

case class WithNullableEnum(
  nonNullableValue: TheEnum,
  @JsonDeserialize(using = classOf[NullableTheEnumDeserializer]) nullableValue: TheEnum)

val json =
      """
        |{
        |  "non_nullable_value": "ONE",
        |  "nullable_value": null
        |}
        |""".stripMargin

val expected = WithNullableEnum(TheEnum.ONE, null)

mapper.parse[WithNullableEnum](json) should equal(expected)
cerveada commented 3 years ago

That sounds great. I will try it out when it's available. Thanks.

I guess this PR can be closed now.

cacoco commented 3 years ago

@cerveada the changes have landed in 715890a4715a0f58448de0a77a4c7b085449baa5. I'm going to close this PR now. Thanks!