Closed jingibus closed 6 years ago
cc: @kangzhang @tkieft
Almost forgot. Here's a link to example generated source: https://gist.github.com/jingibus/b877e28c376cc64b5b77ac7328e8214c
hey @jingibus, thanks for putting up the PR!
The logic looks solid, but the API diverges a bit from our current design: we are trying to have more work done at compile time and to avoid a centralized bottleneck that may not scale well as the number of types increases (e.g. the registerHandler logic).
May I know more about the problem you are trying to solve, so that we can explore the best solution for it? E.g. is it mostly for heterogeneous types, or do you have some other constraint?
If it's just for heterogeneous types, I'm wondering if you have tried using a custom serializer and deserializer (which the library already supports) and how that works for your use case. E.g.
interface Bar {
}

@JsonType
class Foo {
    @JsonField(fieldName = "bar", valueExtractFormatter = "BarDeserializer", serializeCodeFormatter = "BarSerializer")
    Bar bar;
}
Thanks for taking the time to consider these changes! Here's a tl;dr version of the problem:
I work on a module that needs to be able to serialize objects that are passed in from client modules. These objects need to store the client module's data, so our module can't refer to their serializers statically.
I'll go into more detail on the problem in a second, but first, to address your concerns:
On the concern that a HashMap with a String key will not scale: maybe I have more to learn about the kinds of scale folks are putting this library through. If this is a serious concern, though, we can talk about addressing it.

Okay, now the longer explanation. I work on a module in an Android app, call it Arthur, that has a serialized object store at its heart. A client makes a request, the request goes into Arthur's store, and at some later date the request is serviced. This is currently based on ig-json-parser.
Each client that depends on Arthur needs to stash their own client-specific data in the store. The way we handle this right now is that each client adds a reference to the data they need inside the store's main model object.
This is causing two problems:
To cut Arthur away from the clients, clients have to be able to build their own objects and save them in our store. Right now, that's not possible in a graceful way.
Cool, yeah, if you need to avoid the dependency from the model interface to the model implementation, some runtime registration is necessary.
The current solution is tied a bit too closely to your current approach (the idea to use runtime binding, the idea to separate the model interface and the model implementation, the idea to do lookahead parsing). I hope the framework can stay flexible on these decisions and allow people to make their own choices. Different approaches could be provided as patterns (e.g. we can provide documentation and examples following your approach) instead of features.
It looks like a custom serializer and deserializer can get you almost there, except that we'd have to duplicate the annotations (and properties) at every reference point of ModelInterface.
To fix that, I'm wondering if we could just support interfaces as JsonType and allow (or require) people to provide a custom serializer and deserializer for them. E.g.
@JsonType(
    valueExtractFormatter = "BarHelper.read(jsonParser)",
    serializeCodeFormatter = "BarHelper.write(jsonParser)"
)
interface Bar {
}
This way the serializer and deserializer values don't have to be repeated in different @JsonField annotations.
The runtime binding logic moves into BarHelper. If people can afford a hard reference to the implementation class, they can avoid the runtime binding as well.
Similarly, the lookahead parsing will also be optional/configurable.
The other unspoken benefit is performance and cleanliness. The Bar__JsonHelper we generate will be virtually a pointer to the actual implementation. Sophisticated post-processing tools (ProGuard/DexGuard/ReDex) can likely remove these classes and functions, which means almost zero overhead introduced.
Thoughts?
You have a good point about configurability. I think that proposal should be easy enough to add to the PR. Would it make sense to respect that attribute everywhere, not just in interfaces? At a glance, I don't see why not.
Now: should there be an out-of-the-box implementation, or not? If the library shouldn't have one, then the quality of this implementation is beside the point. Here's why I think it would be good to have one:
This is a non-trivial integration. No programmer can solve this scenario without, at the very least, serializing out the type name and reading it back in from the JSON. They don't have to do it this way, of course, but they must decide on an object format, figure out how to use Jackson's JsonGenerator to save it out, figure out how to use JsonParser and the JsonHelper API style to read it back in, and implement some kind of type dispatch.
So this is an integration that requires reaching much further under the hood and building more moving pieces than you need for, say, enums. The API will be useless without an implementation, and the example code will have a lot of boilerplate. The implementation can always be omitted if valueExtractFormatter etc. are provided on @JsonType, of course.
Re: this implementation, one minor point: I think this is just a terminological issue, but for the sake of clarity: there is no lookahead in this implementation. Dispatch is implemented the exact same way the current parser handles field parsing: by reading in a string from the JSON stream, and then using that string to choose the parsing code to switch over to. The difference is that this format is implementation-specific, while the field format is a JSON standard.
Ah, sorry that I overlooked the sample JSON payload and thought you were using lookahead parsing. You are right, it's still streaming!
I might be biased since I know how the code generation works internally, but it does feel like something that can be reused through patterns and helper classes easily. I might be missing something, but I'm not sure code generation logic is necessary to make all this possible. E.g. could the following code almost achieve the same thing?
// In a private module, which we don't want people to reference.
@JsonType
class BarA implements Bar {
}

@JsonType
class BarB implements Bar {
}

@JsonType
class BarWrapper {
    @JsonField(fieldName = "fooBar")
    BarA mBarA;

    @JsonField(fieldName = "barFoo")
    BarB mBarB;

    Bar getBar() {
        if (mBarA != null) {
            return mBarA;
        }
        return mBarB;
    }
}
// In the public module.
@JsonType(
    valueExtractFormatter = "BarWrapper__Helper.read(jsonParser).getBar()",
    serializeCodeFormatter = "BarWrapper__Helper.write(jsonParser)"
)
interface Bar {
}
The other feedback is that I still think the proposed approach is too tied to your specific use case. Maybe a few examples (in my mind) could illustrate things better:
1) When there is a 1-to-1 mapping between interface and model, and people just want to hide the model implementation.
// Package visibility; external classes can't access it.
@JsonType
class BarImpl {}

@JsonType(
    valueExtractFormatter = "BarImpl__Helper.read(jsonParser)",
    serializeCodeFormatter = "BarImpl__Helper.write(jsonParser)"
)
interface Bar {
}
2) When people want to use lookahead parsing to handle heterogeneous types.
In these cases, the payload structure will be different from your proposal here.
I think supporting interfaces is going to be useful and will unblock all these use cases. Instead of supporting all of them in ig-json-parser, I'm leaning towards letting developers decide for now. If we see one or more of these use cases appear a lot, we can revisit and build support into ig-json-parser. Happy to discuss more.
Interesting. From a data standpoint, for my scenario, this does work:
package com.uber.android;

@JsonType
class CarRequest implements Request {
}

package com.foursquare.android;

@JsonType
class RestaurantRequest implements Request {
}
package com.requestqueue;

@JsonType
class RequestWrapper {
    @JsonField(fieldName = "carRequest")
    CarRequest mCarRequest;

    @JsonField(fieldName = "restaurantRequest")
    RestaurantRequest mRestaurantRequest;

    Request getRequest() {
        if (mCarRequest != null) {
            return mCarRequest;
        }
        return mRestaurantRequest;
    }
}

package com.requestqueue;

@JsonType(
    valueExtractFormatter = "RequestWrapper__Helper.read(jsonParser).getRequest()",
    serializeCodeFormatter = "RequestWrapper__Helper.write(jsonParser)"
)
interface Request {
}
If this were a public library, then obviously this would break. But it's not, so it can probably work.
There's one more thing the dynamic implementation enables that this does not, though: freedom for callbacks. CarRequest cannot have any code here that calls back into the requestqueue system. This would be very nice to have for my scenario, as it opens up the API design a bit.
I'm sold that your proposed API is useful enough on its own. One last counterproposal:
My implementation can be modified to avoid code generation and hook into your proposed mechanism. The result would be two files: a DynamicDispatch&lt;T&gt; class that would contain the code that's currently generated, and a TypeNameProvider interface with a getTypeName method that would take the place of the JsonTypeName attribute. And then it could be wired up like so:
public class RequestHelper {
    public static final DynamicDispatch<Request> DISPATCH = new DynamicDispatch<>();
    ...
}

@JsonType(
    valueExtractFormatter = "RequestHelper.DISPATCH.read(jsonParser)",
    serializeCodeFormatter = "RequestHelper.DISPATCH.write(jsonParser)"
)
interface Request {
}
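To make the counterproposal concrete, here is a rough sketch of what DynamicDispatch&lt;T&gt; could look like. This is a pure-Java stand-in of my own: the real class would pull the type name from Jackson's JsonParser and write through JsonGenerator, which I've reduced to plain strings here so the sketch is dependency-free.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch of the runtime-dispatch helper from the counterproposal.
// TypeNameProvider supplies the name a concrete type registers under;
// read() uses the name pulled from the stream to pick the parser.
// Jackson streaming details are deliberately elided.
interface TypeNameProvider {
    String getTypeName();
}

class DynamicDispatch<T extends TypeNameProvider> {
    private final Map<String, Function<String, T>> readers = new HashMap<>();

    // Client modules register a reader for each concrete type.
    void register(String typeName, Function<String, T> reader) {
        readers.put(typeName, reader);
    }

    // Dispatch: the type name read from the JSON selects the parser.
    T read(String typeName, String payload) {
        Function<String, T> reader = readers.get(typeName);
        if (reader == null) {
            throw new IllegalArgumentException("No handler registered for: " + typeName);
        }
        return reader.apply(payload);
    }
}
```

The dispatch itself is just a HashMap lookup keyed by the type name, which is the registerHandler logic discussed earlier in this thread moved behind a reusable class.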
I'll go ahead and make the change to strip out the existing generated code in favor of the valueExtractFormatter/serializeCodeFormatter API.
The new approach is sufficiently different that I'm going to close this PR. I'll have a new one up soon.
This PR adds support for serializing objects from an interface reference. Using this, you can serialize out e.g. a list of heterogeneous object types.
Using the Public API
Say you have a field that refers to an instance of MyInterface. The first step is to add a method to the interface to return the type name:
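It might look something like this (a sketch; the method name follows the getTypeName mentioned elsewhere in this thread, and the actual PR code is in the gist linked at the top):

```java
interface MyInterface {
    // Each implementation returns a stable name identifying its
    // concrete type; this name is written into the serialized JSON.
    String getTypeName();
}
```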
Then in your implementation of that interface, you yield the type name:
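For example (a sketch; the interface is repeated so the snippet stands alone, and the type-name string is an assumption of mine):

```java
// Repeated from the previous step so this snippet stands alone.
interface MyInterface {
    String getTypeName();
}

class MySerializableObject implements MyInterface {
    @Override
    public String getTypeName() {
        // A stable, implementation-specific identifier for this type.
        return "my_serializable_object";
    }
}
```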
Then before you serialize, you register a serialization handler with the generated code:
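The registration might look roughly like this. MyInterface__JsonHelper and the untyped handler are stand-ins I'm inventing to show the shape; the actual generated source is in the gist linked at the top.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the generated helper. The key point is the registry:
// handlers are keyed by type name in a HashMap, so parsing can
// dispatch to the right concrete type with a single lookup.
class MyInterface__JsonHelper {
    private static final Map<String, Object> HANDLERS = new HashMap<>();

    static void registerHandler(String typeName, Object handler) {
        HANDLERS.put(typeName, handler);
    }

    static Object handlerFor(String typeName) {
        return HANDLERS.get(typeName);
    }
}
```

Registration happens once, before anything is parsed, pairing each type name with the handler for that concrete type.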
With that, MySerializableObject can successfully serialize and deserialize.
Under the Hood
Not seen above is the JSON representation. For this to work, the type has to be represented in the serialized JSON. Not only that, but some ordering must be guaranteed if we want to go with the existing single-pass parse style. This is because we must read in the type first before we dispatch to that specific type's parser.
The only JSON aggregate that guarantees ordering is the list. So interfaces are represented as lists, where the first entry is the type information, and the second entry is the object data:
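For example, an interface-typed value might serialize like this (the type name and field are illustrative, not taken from the actual PR):

```json
["my_serializable_object", { "someField": "some value" }]
```

The parser reads the first list entry, looks up the registered handler for that type name, and hands the second entry to that handler.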