Any way to globally opt-out of JSONB's "Must-Ignore" policy?

jsonbrobot commented 7 years ago

I am attempting to change the implementation of some code to use JSON-B instead of Jackson. In the process of doing so, I would like to match behavior so existing users are not impacted. One key behavior difference I have found is during deserialization Jackson will reject unrecognized properties, whereas JSON-B will simply ignore unrecognized properties.

I was looking around the API for a way to reject unrecognized properties (ideally via JsonbConfig). I posted this question on SO, but eventually I ran across this section of the JSON-B spec:

3.18 Must-Ignore policy When JSON Binding implementation during deserialization encounters key in key/value pair that it does not recognize, it should treat the rest of the JSON document as if the element simply did not appear, and in particular, the implementation MUST NOT treat this as an error condition.

So this explains why the default behavior is the way it is... but is there any way to opt-out of this? It would be a key part of migrating existing applications from Jackson to JSON-B.

jsonbrobot commented 6 years ago

Issue Imported From: https://github.com/javaee/jsonb-spec/issues/56
Original Issue Raised By:@aguibert
Original Issue Assigned To: Unassigned

jsonbrobot commented 6 years ago

@m0mus Commented Adding it to version 2.0 proposed features.

aguibert commented 5 years ago

Proposed API addition:

public class JsonbConfig {

    public static final String ALLOW_UNKNOWN_PROPERTIES = "jsonb.allow-unknown-properties";

    /**
     * Property used to specify whether unknown properties should be skipped or 
     * rejected during deserialization.
     *
     * Configures value of {@code ALLOW_UNKNOWN_PROPERTIES} property.
     *
     * The default behavior is `ALLOW_UNKNOWN_PROPERTIES=true`.
     *
     * @param allowUnknownProperties
     *      True means unknown properties will be skipped during deserialization.
     *      False means unknown properties encountered during deserialization will cause 
     *      deserialization to fail with a JsonbException.
     *
     * @return This JsonbConfig instance.
     */
    public final JsonbConfig withAllowUnknownProperties(final Boolean allowUnknownProperties) {
        return setProperty(ALLOW_UNKNOWN_PROPERTIES, allowUnknownProperties);
    }

    // ....

}

rmannibucau commented 5 years ago

Doesnt it miss a fallback setter as well? (Kind of map api)

aguibert commented 5 years ago

by fallback setter do you mean something like Jackson's @JsonAnyGetter/Setter? If so, I agree we should consider that feature for the next version of JSON-B, but it should be tracked separately from this issue.

rmannibucau commented 5 years ago

@aguibert yes, ok to be split but IMHO it must be both issues or none to make sense.

aguibert commented 4 years ago

@rmannibucau Recently came across this use case again and I agree with your original stance that we should consider both issues together.

What do you think about the following:

Add a new @JsonbLeftovers annotation that can be applied to a field or getter/setter method where the property is of type Map<String,Object> for example:

public class Customer {
  public long id;
  @JsonbLeftovers
  public Map<String,Object> leftoverProps;
}

Have a global config option as described above with JsonbConfig.withAllowUnknownProperties(true|false)
If a class has @JsonbLeftovers, leftover properties are always serialized/deserialized using that field, regardless of the global ALLOW_UNKNOWN_PROPERTIES setting.
If a class does NOT have @JsonbLeftovers, the behavior is determined by the value of ALLOW_UNKNOWN_PROPERTIES, namely:
- ALLOW_UNKNOWN_PROPERTIES=true (default) leftover properties are ignored
- ALLOW_UNKNOWN_PROPERTIES=false serialization/deserialization fails if leftover properties are found.

rmannibucau commented 4 years ago

@aguibert overall I agree, i'd just like something else than "leftovers" wich sounds like something else (maybe cause i'm not native but it sounds like undesired bindings instead of not explicitly bound). I'm used to "Any" or "Unbounded" wording but i'm sure something better can be found if not liked.

Also note that the binding does not have to be Map<String, Object> but can be anything (I'm thinking to JsonObject, JsonValue, JsonStructure very strongly but I guess a bean could be bound too as map value), so maybe just reuse jsonb (the class) "behavior" to bind it to the explicit type of the field saying "any other types behavior are undefined", no?. Would be the setter can be set(String,Object) set(JsonObject) set(String,Foo). Not sure it makes sense but avoiding Object can be neat after all.

aguibert commented 4 years ago

For the name, how about @JsonbExtras or @JsonbExtraProperties?

I think it's fine to have a few options on what to bind to like Map and JsonObject, but it needs to be a type that can contain an arbitrary amount of properties. I don't think JsonValue would work because it can only hold a single value. What if multiple extra properties are found? Is that an error? I'd prefer to keep the feature as simple as possible and not introduce extra error path complexities.

Regarding setters, I would like to avoid things like set(String,Object) because that is not consistent with how "known" properties of type Map are bound.

rmannibucau commented 4 years ago

@aguibert @JsonbExtraProperties sounds good to me.

About JsonValue the point is that we (as impl) own the creation of this instance so we can bind it directly in the form we want, it is just about the exposed type in the user object but ok to refine the supported type, can we just keep "other types behavior is undefined" instead of failling in the spec (impl can fail)?

How do you see the setter? A global setter (@JsonbExtraProperties foo(Map)?)? The advantage of the unitary setter is to avoid a temporary structure (map) and potentially objects which can't be in a map to be set (advanced domain objects). Guess we can support both, on an impl perspective it does not change much, if there are 2 params -> unitary, if one -> global, else fail.

wdyt?

aguibert commented 4 years ago

About JsonValue the point is that we (as impl) own the creation of this instance so we can bind it directly in the form we want, it is just about the exposed type in the user object but ok to refine the supported type, can we just keep "other types behavior is undefined" instead of failling in the spec (impl can fail)?

I'd rather have the spec be as explicit as possible, and not intentionally leave things undefined like this. The point of this feature is that it will be a "catch all" for extra unknown properties. The only valid use case I see for JsonValue field type is if the developer knows there will only ever be 1 unknown property. If that's the case, why use @JsonbExtraProperties at all? They could just bind it as a regular property.

How do you see the setter? A global setter (@JsonbExtraProperties foo(Map)?)? The advantage of the unitary setter is to avoid a temporary structure (map) and potentially objects which can't be in a map to be set (advanced domain objects). Guess we can support both, on an impl perspective it does not change much, if there are 2 params -> unitary, if one -> global, else fail.

I like the global setter -- it's simple and doesn't give the user too many different options.

rmannibucau commented 4 years ago

They could just bind it as a regular property.

If you assume you only have global binding it is likely true but if you don't and embrass per key binder it is very useful. Think about openapi and x-yyyyy properties, their type is undefined globally but can be defined per case so having the JsonValue enable to handle primitives but also objects.

I like the global setter

I don't dislike it but all impl use per key binders, mainly because it is way more performant on some payloads where you filter through a pattern/predicate the properties you let go through. You can't do it with a globalObject.xxx().stream() due to the memory limitations in some cases (saw it filtering gigs on dump imports). So I don't think we can bypass the per key binder but I agree the global one can be convenient.

So to summarize I think we should:

add @JsonbExtraProperties (with field and method support)
enable to deserialize or serialize (both) any value decorated with @JsonbExtraProperties
for method binding, method return type is ignored (i guess there is no reason to prevent fluent API there), method parameters are either one and all ignored values are bound if any or 2 and the first one is a String which is the attribute key and second can be any unwrapped type supported by Jsonb (unwrapped=guess there is no point supporting Optional since we'll never call it with Optional).
Subclass method extra properties method wins over parent method ones, and method wins over field (we don't want to bind both I guess)

Wdyt?

maald commented 4 years ago

Just some thoughts.

The original issue was "Is there a way to fail on unknown properties?"

If the payload includes an unknown property that simply means you don't even know about it in the first place and it is irrelevant to you model; if that property mattered to your model you would have coded it. what would you do with an unknown property? like your first question will be "what's its name, type etc"? like you can't write a legit code if you know nothing about your input. As far as you're concerned it could be a large and nested structures that just consume your memory space.

Unlike in javascript where you can dynamically build your model/object properties, in java your model properties are fixed at compile time. So if you want a "dynamic" model with unknown properties, why are you using classes in the first place! just stick to a plain old map of key-value pairs and avoid classes all together.

IMO the framework should be conservative about de/serializing models. It should have a room for corner cases where it allows you to deserialize an entire payload into a map (or better some data structure where it's easy to query about the value types). That way your model is KISS and for the corner cases, you basically fall back to explicit/coded de serialization. Again, if you have unknown properties, it means they are not in your model and therefore they are absolutely useless to your model regardless of whether a framework instantiated your model or you did it yourself.

rmannibucau commented 4 years ago

@maald I strongly disagree with that statement (even if I went through this thought in early times ;))

if you have unknown properties, it means they are not in your model and therefore they are absolutely useless to your model regardless of whether a framework instantiated your model or you did it yourself

Take OpenAPI case for example, it abuses of this exact pattern for all extensions in all levels of the object graph and you care about the extensions but you don't know what it is until you get them. Their processing can be - is ? - also dynamic (API policies) so at the end you must passthrough them and you handle them or not but it is not static in general. Indeed, this pattern is regularly used to communicate between services, gateways, handling version upgrades and feature pluggability so OpenAPI is not a particular case (or corner case) IMHO. Finally, loosing the typing of most of the model really feels like a regression from my point of view and to have suffered from that in some apps I really hope we support that (in particular when you see the needed code to support this feature and the fact most mappers have it, there is no real reason to not make it smooth).

maald commented 4 years ago

tbh I'm not familiar with OpenAPI and the object graph.

Maybe I'm missing something but if you get some dynamic json payload and its content is unknown to you at compile time then you can't query that unknown properties and the only thing you can do with it is log it or pass it along to someone else to get a known payload. Now in that case you don't really need a model at all and what you want to do is read the payload into a map (nothing wrong with that) if you want to access one or two known properties or just read it as a String and pass it along (not even worry about deserializing the payload in the first place)

When I think of a model I think of something that matters to correctness of my code. If I don't even know what I'm getting and it's just something I need to pass along to someone else then String or map will do.

With endless frameworks and their different configurations and rules and apis, simplicity and extensibility is KING. less is more. in other words, "I'll deserialize according to your model definition. If you don't expected a static payload/model then what you are really looking for is a map and here is one so you can do with it whatever you want". framework size, implementation, rules, and even learning for new comers become a breeze.

rmannibucau commented 4 years ago

@maald nop, this is true in static applications. Can you have an UI defining you must read x-vendor-config1 and x-vendor-config2 to do Y? Or more concretely, can you say that the rate limiting type is x-ratelimiting-type and that the ratelimiting window duration is x-rate-limiting-window-duration etc? Yes you can and a serious number of configurations are done this way. So all the built-in config is strongly typed, all the extensions are loosely typed. It does not remove any benefit from jsonb. It is also impossible to do it with a (de)serializer without re-abstracting the impl with a chain pattern which makes things complex compared to using a fallback attribute.

To summarize: do you always strongly know what you manipulate? No, this is likely only true for not extensible applications.

I also suspect you design your model from code whereas a mapper must enable to map any model so your statement

what you are really looking for is a map

just does not enable to handle several use cases as already mentionned.

To complete on your last point:

framework size: no real change (1 annotation),
implementation: a few dozen line of code,
rules: aligned on most used mappers,
learning for new comers become a breeze: have to admit I never saw anyone complaining of this feature or its complexity in jackson or other impls so why would it be for JSON-B? Strictly speaking it is way easier than adapters or serializers.

Don't get me wrong, I'm happy to not have it if you can still handle any payload and the use cases mentionned as smoothly as this proposal but all solutions with 1.0 are very verbose and likely fragile compared to that small standardization investment - we already have it in a serious number of JSON-B impl BTW, so it just enables to keep the code standard and not have to require any vendor specific API which is the goal of jakarta at the end ;).

jakartaee / jsonb-api

Any way to globally opt-out of JSONB's "Must-Ignore" policy? #56