FasterXML / jackson-future-ideas

Repository for SOLE PURPOSE of issue tracker and Wiki for NEW IDEAS. Please: NO BUG REPORTS.

Support parameterized type deserialization by passing JavaType to deserializer #57

Open pgoodwin opened 3 years ago

pgoodwin commented 3 years ago

Is your feature request related to a problem? Please describe.

I want to be able to (de)serialize objects of type com.github.michaelbull.result.Result, which is a type I don't own and which has no Jackson annotations. I want to do custom serialization so I can communicate with services I control, written in Ruby, that have JSON support for equivalent Result types, which means that I'd like to keep Java class data off the wire. I know the type parameters of the Result object I want to deserialize at the point when I call ObjectMapper.readValue(), and I pass a JavaType with that information into that call.

Because I want to deserialize multiple flavors of Result, I instantiate multiple instances of its deserializer, passing the type information for each flavor into the constructor. However, I cannot register these different instances with the same ObjectMapper, because SimpleDeserializers registers deserializers based on Java Class objects instead of JavaTypes.

This is a Spring application (written in Kotlin) and I would like to be able to have a singleton ObjectMapper that is configured for use throughout the entire application. As it stands I have no way to configure one that can deserialize multiple flavors of the same parameterized type.

Describe the solution you'd like

I want to receive the JavaType of the object to be deserialized in the deserialize() call, either as a new parameter or as part of the DeserializationContext parameter. That way I can use the same instance of the deserializer to deserialize multiple flavors of the same parameterized type by examining the reified values of those parameters at runtime.

Usage example

I have a toy project that illustrates what I'm trying to do, outside the context of Spring: https://github.com/pgoodwin/serialization-experiment


cowtowncoder commented 3 years ago

I am not sure I really understand what is being requested here, so I will start by noting that what SimpleDeserializers does is not based on a fundamental limitation: the methods of Deserializers do take a JavaType for locating the type for which a deserializer is requested (except in the case of tree/node types).

Since the type is used for locating the initial deserializer (possibly modified by resolution via resolve(), and by contextual changes via createContextual()), this piece of metadata is not considered necessary to carry through after construction. The internal design is based on a deserializer knowing "its type" (the type passed as the target type when getting the instance), and the type is consciously not passed to deserialize(), with the exception of polymorphic handling.

Sometimes developers wish the type were available, to allow for general-purpose multi-type deserializers. This is not provided, and I do not see the benefit of adding machinery to store and pass type information (especially via a new argument; passing it contextually requires book-keeping as well). If deserializers want to know the type, they need to retain it when first requested.

I'll leave this issue open, but I do not see much benefit in trying to do this, since it adds complexity and overhead.

cowtowncoder commented 3 years ago

(transferred to "jackson-future-ideas" since this would be a major rewriting of much of databind internals, touching every existing deserializer implementation as well as most abstractions)

pgoodwin commented 3 years ago

The context of this is parameterized types. For example, the type I ran into it with is Result<V, E>. To complicate matters, Result is a sealed Kotlin class, which for our purposes is equivalent to an abstract class, so the only object instances of Result type are subtypes (either an Ok or an Err). I don't own the class, so I can't add annotations to it and need to write my own serializer/deserializer for it. Ideally I would like to write one deserializer that works for all Result objects, but I can't, because Result isn't really a single type -- it's a family of types -- and I can't deserialize it unless I know exactly which member of that family I'm deserializing to. I don't need a separate deserializer for every possible member of this family; a single deserializer will do, but only if I have access to that type information.

In the example I pointed you to I wrote a single deserializer, but I had to register a separate instance of it for every combination of types I used with Result. It would have been better if I could have instantiated a single deserializer that would receive the type information it needs at run time.

I don't think the change I'm asking for needs to be as large as you've said. If the type data were put into the context then no existing deserializers would need to change -- only future deserializers would (optionally) use that data. If I'm not mistaken each deserialization gets its own context so there wouldn't be concurrency issues. In the code I've traced through the JavaType object was present through almost the entire call chain so it seems like there's ample opportunity to add it to the context just before the deserializer is called. I agree that adding it to the deserialize() call would be very disruptive. You'd almost want to add an entirely new deserializer type to the API to accommodate that, and it would still be a big change.

cowtowncoder commented 3 years ago

I am confused by "I had to register a separate instance of it for every combination of types I used with Result" -- this is not necessary when implementing Deserializers, for two distinct reasons:

  1. Deserializers are located by callbacks, so you get to (and have to) implement the logic: you construct the deserializer as needed.
  2. In the case of composite types, which Result may be (I am not sure based on the above), the main type would recursively look up deserializers for its content types (the values contained within).

As for the JavaType being available, it is sort of available already from

DeserializationContext.getContextualType()

but only during construction of the deserializer; this may or may not be useful in your situation.
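A minimal sketch of how getContextualType() could be used for this, under stated assumptions: Outcome, Success and Failure are hypothetical stand-ins for the third-party Result, Ok and Err types, the wire format ({"ok": ...} or {"err": ...}) is invented for illustration, and readTreeAsValue() needs Jackson 2.13 or later. A single prototype instance is registered by raw class; createContextual() then returns a copy that remembers the type parameters of whatever target type was requested.

```kotlin
import com.fasterxml.jackson.core.JsonParser
import com.fasterxml.jackson.databind.BeanProperty
import com.fasterxml.jackson.databind.DeserializationContext
import com.fasterxml.jackson.databind.JavaType
import com.fasterxml.jackson.databind.JsonDeserializer
import com.fasterxml.jackson.databind.JsonMappingException
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.databind.deser.ContextualDeserializer
import com.fasterxml.jackson.databind.module.SimpleModule
import com.fasterxml.jackson.databind.type.TypeFactory

// Hypothetical stand-in for the third-party Result<V, E> family.
sealed class Outcome<out V, out E>
data class Success<V>(val value: V) : Outcome<V, Nothing>()
data class Failure<E>(val error: E) : Outcome<Nothing, E>()

// Assumed wire format (not from the thread): {"ok": <value>} or {"err": <error>}.
class ContextualOutcomeDeserializer(
    private val valueType: JavaType = TypeFactory.unknownType(),
    private val errorType: JavaType = TypeFactory.unknownType()
) : JsonDeserializer<Outcome<*, *>>(), ContextualDeserializer {

    // Invoked while the deserializer for a concrete target type is being set up.
    // The contextual type is only available here, so it is captured in a new instance.
    override fun createContextual(ctxt: DeserializationContext, property: BeanProperty?): JsonDeserializer<*> {
        val target: JavaType = ctxt.contextualType ?: return this
        return ContextualOutcomeDeserializer(
            target.containedTypeOrUnknown(0), // V
            target.containedTypeOrUnknown(1)  // E
        )
    }

    override fun deserialize(p: JsonParser, ctxt: DeserializationContext): Outcome<*, *> {
        val node = ctxt.readTree(p)
        return when {
            // Bind the payload using the type parameter retained at construction time
            node.has("ok") -> Success(ctxt.readTreeAsValue<Any?>(node.get("ok"), valueType))
            node.has("err") -> Failure(ctxt.readTreeAsValue<Any?>(node.get("err"), errorType))
            else -> throw JsonMappingException.from(p, "expected an \"ok\" or \"err\" field")
        }
    }
}

// One prototype, registered once by raw class; typed copies are created on demand.
fun contextualMapper(): ObjectMapper = ObjectMapper().registerModule(
    SimpleModule().addDeserializer(Outcome::class.java, ContextualOutcomeDeserializer())
)
```

With this in place, calling readValue() with a JavaType built by constructParametricType(Outcome::class.java, ...) should bind the payload to the requested parameter types, without a separate registration per combination.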

pgoodwin commented 3 years ago

Here's the code I'm working with. Ok<V> and Err<E> are subtypes of Result<V,E>. The project, a toy focusing on just this issue, is here: https://github.com/pgoodwin/serialization-experiment/tree/java-type-deserializers

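import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.kotlin.KotlinModule
import com.github.michaelbull.result.Err
import com.github.michaelbull.result.Ok
import com.github.michaelbull.result.Result
// ResultSerializer, ResultDeserializer and JavaTypeDeserializers are defined in the linked project

data class MultiValuePayload(val first: String = "First value", val second: String = "Second value")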
data class MultiValueError(val code: String, val reason: String, val status: String, val message: String)

fun main() {
    // The values we'll be serializing
    val successObject = Ok(MultiValuePayload(second = "some other value"))
    val errorObject = Err("No bueno")
    val multiValueErrorObject = Err(MultiValueError("invalidState", "insufficientdata", "418", "I'm goin' home"))

    val resultArray = arrayOf(successObject, errorObject)

    // Initialize the mapper with our serializers
    val mapper = ObjectMapper()
    // Define types in a way that Jackson understands. This is a downside of Java's type erasure
    val multivaluePayloadResultType = mapper.typeFactory.constructParametricType(
        Result::class.java,
        MultiValuePayload::class.java,
        String::class.java
    )
    val multiValueErrorResultType = mapper.typeFactory.constructParametricType(
        Result::class.java,
        String::class.java,
        MultiValueError::class.java
    )

    val resultSerializer = ResultSerializer()
    val deserializers = JavaTypeDeserializers(mapper.typeFactory)
    deserializers.addDeserializer(
        multivaluePayloadResultType,
        ResultDeserializer(
            MultiValuePayload::class.java,
            String::class.java
        )
    )
    deserializers.addDeserializer(
        multiValueErrorResultType,
        ResultDeserializer(
            String::class.java,
            MultiValueError::class.java
        )
    )
    val kotlinModule = KotlinModule()
    kotlinModule.setDeserializers(deserializers)
    kotlinModule.addSerializer(Result::class.java, resultSerializer)
    mapper.registerModule(kotlinModule)

    // Serialize and deserialize the array
    val arrayAsJson = mapper.writeValueAsString(resultArray)
    println("Array of results as Json: $arrayAsJson")

    val resultArrayType = mapper.typeFactory.constructArrayType(multivaluePayloadResultType)
    val reconstitutedArray = mapper.readValue<Array<Result<MultiValuePayload, String>>>(arrayAsJson, resultArrayType)
    println("reconstitution successful: " + (reconstitutedArray[0] == successObject && reconstitutedArray[1] == errorObject))

    // Serialize and deserialize the other error object
    val anotherErrorJson = mapper.writeValueAsString(multiValueErrorObject)
    println("A different error json: $anotherErrorJson")

    val anotherReconstitutedError =
        mapper.readValue<Result<String, MultiValueError>>(anotherErrorJson, multiValueErrorResultType)
    println("reconstitution successful: " + (anotherReconstitutedError == multiValueErrorObject))
}

cowtowncoder commented 3 years ago

Looks like KotlinModule adds a convenience method, addSerializer(), that uses SimpleModule under the hood.

But as I said a few times, SimpleModule is not designed to work with parametric types.

Please instead implement a Module that registers custom Deserializers, wherein the callback gets the full JavaType and you can extract the type information you need. This works better than trying to add a static mapping.
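A minimal sketch of that approach, under stated assumptions: Outcome, Success and Failure are hypothetical stand-ins for Result, Ok and Err, the {"ok": ...} / {"err": ...} wire format is invented for illustration, and readTreeAsValue() needs Jackson 2.13 or later. The Deserializers callback receives the full JavaType and builds a deserializer that retains the type parameters, so nothing is registered per combination.

```kotlin
import com.fasterxml.jackson.core.JsonParser
import com.fasterxml.jackson.databind.BeanDescription
import com.fasterxml.jackson.databind.DeserializationConfig
import com.fasterxml.jackson.databind.DeserializationContext
import com.fasterxml.jackson.databind.JavaType
import com.fasterxml.jackson.databind.JsonDeserializer
import com.fasterxml.jackson.databind.JsonMappingException
import com.fasterxml.jackson.databind.Module
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.databind.deser.Deserializers
import com.fasterxml.jackson.databind.module.SimpleModule

// Hypothetical stand-in for the third-party Result<V, E> family.
sealed class Outcome<out V, out E>
data class Success<V>(val value: V) : Outcome<V, Nothing>()
data class Failure<E>(val error: E) : Outcome<Nothing, E>()

// Knows "its type" from construction, as the databind design expects.
// Assumed wire format (not from the thread): {"ok": <value>} or {"err": <error>}.
class TypedOutcomeDeserializer(
    private val valueType: JavaType,
    private val errorType: JavaType
) : JsonDeserializer<Outcome<*, *>>() {
    override fun deserialize(p: JsonParser, ctxt: DeserializationContext): Outcome<*, *> {
        val node = ctxt.readTree(p)
        return when {
            node.has("ok") -> Success(ctxt.readTreeAsValue<Any?>(node.get("ok"), valueType))
            node.has("err") -> Failure(ctxt.readTreeAsValue<Any?>(node.get("err"), errorType))
            else -> throw JsonMappingException.from(p, "expected an \"ok\" or \"err\" field")
        }
    }
}

// Callback-based lookup: called with the full JavaType whenever a deserializer is needed.
class OutcomeDeserializers : Deserializers.Base() {
    override fun findBeanDeserializer(
        type: JavaType,
        config: DeserializationConfig,
        beanDesc: BeanDescription
    ): JsonDeserializer<*>? {
        if (!type.hasRawClass(Outcome::class.java)) return null // not ours; let others handle it
        return TypedOutcomeDeserializer(
            type.containedTypeOrUnknown(0), // V
            type.containedTypeOrUnknown(1)  // E
        )
    }
}

// Module that plugs the callback in; the module name is arbitrary.
class OutcomeModule : SimpleModule("OutcomeModule") {
    override fun setupModule(context: Module.SetupContext) {
        super.setupModule(context)
        context.addDeserializers(OutcomeDeserializers())
    }
}
```

Registering it once with ObjectMapper().registerModule(OutcomeModule()) should be enough for a singleton mapper; each distinct parameterization then gets its own deserializer instance, created on demand by the callback rather than listed up front.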

pgoodwin commented 3 years ago

That's what I did: I instrumented the KotlinModule with my own JavaTypeDeserializers object which uses JavaType, but there's no way to pass the JavaType to the Deserializer at run time, so I still have to register a different Deserializer for each unique combination of type parameters. It's this support for parameterized types that I'm asking for with this feature.

cowtowncoder commented 3 years ago

No, I don't think that makes sense: you should construct the deserializer with the JavaType passed when asking for one to be constructed. It is simple to do and avoids the rest of the system having to carry through information that may or (more often) may not be needed -- most deserializers are not registered for different types.

pgoodwin commented 3 years ago

Let me see if I understand what you're saying: if I have a parameterized type like Result<V,E> and I want to be able to use it with a bunch of separate types, then it makes sense for me to register a separate deserializer instance for each one. So maybe I'd have some different kinds of results like Person, Location, Product, Price. And I might have some different kinds of errors that I want to report, like NotFound, Forbidden, Timeout, InternalError. So then I'd register deserializers for every combination: Result<Person, NotFound>, Result<Person, Forbidden>, Result<Person, Timeout>, Result<Person, InternalError>, Result<Location, NotFound>, etc., for all 16 possibilities? It's definitely possible to do; I just thought it would be valuable to be able to register a single deserializer that could handle every possible combination, no matter how many there are.

cowtowncoder commented 3 years ago

The way callbacks work is that you create specific deserializers on demand -- and yes, you will end up with all permutations, potentially. This is how the system handles CollectionDeserializer, MapDeserializer and array deserializers (and similarly for serializers).

That is the fundamental design of the system: it is expected that a (de)serializer knows the type it is being used for when it is constructed and initialized.