twitter / finatra

Fast, testable, Scala services built on TwitterServer and Finagle
https://twitter.github.io/finatra/
Apache License 2.0
2.27k stars 405 forks source link

Allow for deserializing Optional fields differently when absent vs null #581

Closed MrGigabit closed 2 years ago

MrGigabit commented 2 years ago

Is your feature request related to a problem? Please describe.

I work in a code base where we have fields like Option[Name] in our case classes. The object mapper parses absent fields and null fields the same.

For example on the following case class

case class Profile(
    id: String,
    name: Option[Name]
)

The example JSON will deserialize to the same thing, the name field will resolve to None.

{
    "id": "123abc",
    "name": null
}

Describe the solution you'd like

I would like for Example 1 to deserialize name to Some(null), and for Example 2 to deserialize name to None.

Describe alternatives you've considered

I would think that I would be able to specify an annotation on the field that would differentiate them, but I haven't found anything in the docs for either Finatra or Jackson to do so.

We have built a custom deserializer like

class NameDeserializer extends JsonDeserializer[Name]

that we then include in a module that gets configured on the object mapper with additionalMapperConfiguration, and this does allow us to access the inner field. The problem is, that because the field is wrapped in an Optional, it will always resolve to None before I can determine how I want to handle it for example 1 and 2.

Additional context

This matters to us because we would like to differentiate the requests sent to update endpoints, where a null field indicates to set the value to null in the underlying database, and an absent field indicates to keep the current value in the database.

cacoco commented 2 years ago

@MrGigabit hey, thanks for the feature request.

I believe this is actually a Jackson feature that can be tweaked since it is parsing the null value from the JSON and hands it to our deserializer to convert into an object. With the jackson-module-scala, I think we may see it as a None before we can deserialize but would need to check. I believe you want to set the Jackson DeserializationFeature.ACCEPT_EMPTY_STRING_AS_NULL_OBJECT to true when you configure the object mapper.

With that, this is possibly something that is a code smell. Some(null) isn't idiomatic in Scala and I think it is very hard to reason about what that means, FWIW, and is probably something to avoid if possible. Secondly, Option[T] literally signifies that the T value does not have to appear in the JSON. If a field will appear and appear as null, then it isn't an optional value. If you expect null values then don't declare the field as optional (as potentially not existing in the incoming JSON).

For your examples, I would argue these are different deserialization cases that then require different object representations.

Case 1: null

{
    "id": "123abc",
    "name": null
}

translates to:

case class Profile(
    id: String,
    name: Name
)

Case 2: absent

{
    "id": "123abc"
}

translates to:

case class Profile(
    id: String,
    name: Option[Name]
)

If you have the two cases being conflated in the JSON being sent it will be hard to model this appropriately but if you tweak the Jackson deserialization configuration that may lead you to what you want.

MrGigabit commented 2 years ago

@cacoco Wow, I can't tell you how much I appreciate the prompt response! My team and I are having a real head scratcher moment on this, and I posted here somewhat as a last resort.

I see your perspective, your argument being: if a field needs to be accepted as null in the request, fundamentally that field isn't an Optional field, and thus the field representing it should not be wrapped in one.

If I were to follow that line of thought, what then is the idiomatic way to process such objects in update requests? We feel like we can't be the first ones to bump into this problem, but the lack of specific guidance on this issue in documentation (both Finatra and Jackson) leads us to believe that we must just be going about it the wrong way.

Let's say that the field truly must remain an Optional field for our use case. What's the best way to interpret that we want to set the underlying database representation to null, vs not to require the field to be passed by the client if they don't intend to update the field, i.e. they want to leave whatever the current value of the field may be as is?

If the answer is "don't do that" haha, at least we will know! We are just looking for the correct way.

One thought we had was that we would interpret PUT requests with null or absent fields to mean that the requester wants said field set to null, and have PATCH requests with null or absent fields to mean that the requester doesn't want to update that field. This doesn't sit perfectly with us though, as that means the only way you can null out a field is via PUT, which requires the entire object and the object must be in sync with the current database representation (sans the nulled field of course)

cacoco commented 2 years ago

Let's say that the field truly must remain an Optional field for our use case. What's the best way to interpret that we want to set the underlying database representation to null, vs not requiring the field to be passed by the client if they don't intend to update the field, i.e. they want to leave whatever the current value of the field may be as is?

Option is basically the superset of null and absent. I don't think there is a way to distinguish an Option field between null and absent as I don't believe this is supported by Jackson: https://github.com/FasterXML/jackson-databind/issues/578#issuecomment-58443025 and more recently: https://stackoverflow.com/questions/70771210/jackson-deserializes-absent-optionals-as-nulls-instead-of-optional-empty. I may be reading that wrong.

I don't have a good idea for a workaround that doesn't involve changes to the case class definitions. This case is interesting and it is possible we may be able to patch it in a way to properly distinguish absent from null. This is the line that returns null in both cases: https://github.com/twitter/util/blob/ce78eeb4eaed4408f1e0eff8de605a1927335395/util-jackson/src/main/scala/com/twitter/util/jackson/caseclass/CaseClassField.scala#L229. It's null if the key doesn't exist and it's null if it does and has a null value so for Scala idiomacy, we collapse those to the same result, returning the default value for the field type (or an exception) which for Option is a None.

Let me see if there's something we can do that doesn't completely break everything.

cacoco commented 2 years ago

@MrGigabit I thought of a possible workaround if you can update the case class definition. You should be able to customize the deserialization for the given field to allow you to differentiate. Something like this:

You'll need to create a "sentinel" Name instance which represents a null Name, e.g.,

object NullName extends Name

Then given this case class:

case class Profile(
    id: String,
    @JsonDeserialize(using = classOf[NullableNameDeserializer]) name: Option[Name]
)

Then define the NullableNameDeserializer:

class NullableNameDeserializer extends StdDeserializer[Name](classOf[Name]) {
  override def deserialize(jp: JsonParser, ctxt: DeserializationContext): Name = {
    val jsonNode: ValueNode = jp.getCodec.readTree(jp)
    if (jsonNode.isNull) {
      NullName
    } else {
      new Name...
    }
  }
}

The deserializer is only invoked when a value is passed, otherwise, it will be skipped, so it is only hit in the explicitly passed null case which you'll represent with the "sentinel" instance of a Name instance which represents to set a null in the database.

Examples of using @JsonDeserialize are in the test: https://github.com/twitter/util/blob/develop/util-jackson/src/test/scala/com/twitter/util/jackson/ScalaObjectMapperTest.scala#L1795

Looking into it, I don't think there's much we can change in the framework to deal with this case. Hope this helps.

mosesn commented 2 years ago

@MrGigabit I'm going to close this issue for now, but please feel free to reopen or file a new issue if something related to this comes up. Thanks!