FasterXML / jackson-datatypes-collections

Jackson project that contains various collection-oriented datatype libraries: Eclipse Collections, Guava, HPPC, PCollections
Apache License 2.0
79 stars 53 forks source link

`Cache` deserialization fails with NPE for `null` valued entries #140

Closed cowtowncoder closed 11 months ago

cowtowncoder commented 11 months ago

Since most Guava containers are null-averse, looks like deserializing Cache with null values fails with NPE.

Longer term it may make sense to add more options for handling wrt Guava module, but for now:

  1. Users can use @JsonSetter annotation on field (or equivalent Config Overrides) to "skip nulls"
  2. We should throw JsonMappingException (specifically, MismatchedInputException) instead of NPE in failure case
arthurscchan commented 11 months ago

@cowtowncoder You are acting fast. I just received a new issue earlier today from OSS-Fuzz (https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=65183), which is exactly an NPE from GuavaCacheDeserializer. Since you have already fixed that. I would wait for this issue to be resolved.

cowtowncoder commented 11 months ago

@arthurscchan Yeah in this case it was just so similar to other Guava NPEs, easy to reproduce.

Also wondering if this (unrelated) Fuzz issue:

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=65181

might have been fixed by one of your PRs. At least I seem unable to reproduce it.

ben-manes commented 11 months ago

I'm not sure if this matters but fyi wrt the contract for native serialization of Cache.

 * <p>The caches produced by {@code CacheBuilder} are serializable, and the deserialized caches
 * retain all the configuration properties of the original cache. Note that the serialized form does
 * <i>not</i> include cache contents, but only configuration.
arthurscchan commented 11 months ago

@arthurscchan Yeah in this case it was just so similar to other Guava NPEs, easy to reproduce.

Also wondering if this (unrelated) Fuzz issue:

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=65181

might have been fixed by one of your PRs. At least I seem unable to reproduce it. @cowtowncoder Yes. That should be fixed and will need some time to resolve.

cowtowncoder commented 11 months ago

@ben-manes Good point. I don't know if configuration may be retained in JSON serialization, but the point about dropping contents is ... interesting. Unfortunately implementation was for #90 where content serialization was explicitly desired. So change at this point would probably need configuration for opt-in.

So if anyone would want "drop the contents" on serialization, they should file a new issue as RFE.

ben-manes commented 11 months ago

I think it was for FlumeJava (Spark, Apache Beam predecessor) so the computations could be sent to the data, caching as it was processed on the local node. A very different usage and since a cache is transient data, perhaps the only reasonable expectation. Just an fyi, this seems fine for Jackson

cowtowncoder commented 11 months ago

@ben-manes Yeah it is what it is. But conceptually I think it was wrong thing to do -- I wish I had thought it through. Not worth worrying too much about but just one of those live-and-learn cases.