metosin / jsonista

Clojure library for fast JSON encoding and decoding.
https://cljdoc.org/d/metosin/jsonista
Eclipse Public License 2.0

Custom decoder #35

Closed: jeaye closed this issue 3 years ago

jeaye commented 4 years ago

I'm currently replacing transit+json with jsonista for some performance-critical areas of a Clojure server. The only hitch has been supporting a lossless round trip for keyword values. By default, with jsonista, we get this:

user=> (-> (j/write-value-as-string {:system/status :status/good} json-mapper) (j/read-value json-mapper))
#:system{:status "status/good"}

But this JSON is only used internally, so I can take some liberties and use a custom encoding for keywords. For example, I've been benchmarking with this (strcat is like str, but from stringer.core):

(jsonista/object-mapper {:encode-key-fn true
                         :decode-key-fn true
                         :encoders {clojure.lang.Keyword (fn [^clojure.lang.Keyword kw
                                                              ^com.fasterxml.jackson.core.JsonGenerator gen]
                                                           (.writeRawValue gen (strcat "[\"!kw\",\"" kw "\"]")))}})

This works well for encoding, but jsonista doesn't have a way to specify custom decoders. What I'd really love is a custom decoder for JSON arrays so I can detect this custom ["!kw", "status/good"] scenario and generate the correct keyword from it.

I have tried to solve this after the fact, using specter. It's fast, but the optimized specter transformation to replace those specific vectors doubles the benchmark time for an encode/decode round trip. So I think it would be much more efficient to do this as the JSON is being decoded, rather than after.
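For reference, the second pass described here can be sketched with clojure.walk in place of specter. This is an illustrative stand-in, not the transformation that was benchmarked. The helper names `tagged-keyword?` and `restore-keywords` are hypothetical, and the payload is assumed to be the `ns/name` string exactly as written above, with no leading colon.

```clojure
(require '[clojure.walk :as walk])

(defn tagged-keyword?
  "True for a two-element vector of the form [\"!kw\" \"ns/name\"].
  Map entries also satisfy vector?, so they are excluded explicitly."
  [x]
  (and (vector? x)
       (not (map-entry? x))
       (= 2 (count x))
       (= "!kw" (first x))
       (string? (second x))))

(defn restore-keywords
  "Second pass over already-decoded data: replaces each tagged vector
  with the keyword it stands for and leaves everything else untouched."
  [data]
  (walk/postwalk (fn [x]
                   (if (tagged-keyword? x)
                     (keyword (second x))
                     x))
                 data))

(restore-keywords {:system/status ["!kw" "status/good"]})
;; => {:system/status :status/good}
```

Because this walks the whole decoded structure a second time, it shows where the extra cost comes from: every node is visited once by the JSON parser and once more by the walk.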

So, is this something you guys think can work with jsonista? Support for a custom decoder to turn ["!kw", "status/good"] into :status/good?

Thanks for your time and for the awesome library.

Side note: By using jsonista + the custom encoder, criterium says my encoding speed has improved 10x, compared to transit+json. But, due to the second pass needed for specter to correct the keywords, the decoding speed has slowed by 50%, compared to transit+json. If I can get the decoding speeds back to at least as good as transit+json, this will be a huge win.

ikitommi commented 4 years ago

I think all you need to do is rebind the list decoder to your own version, one that peeks at the first value and looks for a suitable parser for the tag, and otherwise falls back to the default decoder. See https://github.com/metosin/jsonista/blob/master/src/clj/jsonista/core.clj#L79

ikitommi commented 4 years ago

But if you know your data model in advance, e.g. you have a Malli definition, you could use it to compile optimized transformers without any tagging. I'll try to cook up a working example of this.
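A minimal sketch of the schema-driven idea, assuming Malli is on the classpath. The schema `Status` and the name `decode-status` are hypothetical and only model the single key used in this issue. The JSON is assumed to have been decoded already with keyword keys and plain string values, so no `["!kw", ...]` tagging is involved.

```clojure
(require '[malli.core :as m]
         '[malli.transform :as mt])

;; Hypothetical schema: the value under :system/status should be a keyword.
(def Status
  [:map [:system/status keyword?]])

;; Compile the decoder once and reuse it for every decoded message.
(def decode-status
  (m/decoder Status (mt/json-transformer)))

(decode-status {:system/status "status/good"})
;; => {:system/status :status/good}
```

The trade-off is that this only converts values the schema knows about, whereas the tagged encoding is self-describing.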

jeaye commented 4 years ago

Thanks so much for the quick and helpful reply. :)

> I think all you need to do is rebind the list decoder to your own version, one that peeks at the first value and looks for a suitable parser for the tag, and otherwise falls back to the default decoder. See https://github.com/metosin/jsonista/blob/master/src/clj/jsonista/core.clj#L79

Ok, got it. I've been through that source and also the PersistentVectorDeserializer source. My only remaining question is this:

Can this be done as a module I can pass in through :modules, to overwrite the clojure-module deserializer for List, or does this need to be done through a fork?

ikitommi commented 4 years ago

I believe you can do this using a module. If something is needed in jsonista itself, please open a PR. Also, a PR adding a new TaggedValueOrPersistentVectorDeserializer is welcome :)
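The module approach might look roughly like the sketch below. It is untested against any particular jsonista or Jackson version and rests on several assumptions: that modules passed through `:modules` are registered after jsonista's own Clojure module, that a later-registered `List` deserializer takes precedence, and that the tagged payload is the plain `ns/name` string. The names `tagged-list-deserializer`, `tagged-module`, and `mapper` are hypothetical. For simplicity it reads the whole array and then checks for the tag, rather than peeking at the first element as suggested earlier in the thread.

```clojure
(require '[jsonista.core :as j])
(import '(com.fasterxml.jackson.core JsonParser JsonToken)
        '(com.fasterxml.jackson.databind DeserializationContext JsonDeserializer)
        '(com.fasterxml.jackson.databind.deser.std StdDeserializer)
        '(com.fasterxml.jackson.databind.module SimpleModule)
        '(java.util List))

(defn tagged-list-deserializer
  "Reads a JSON array into a persistent vector, then returns a keyword
  if the vector has the shape [\"!kw\" \"ns/name\"], else the vector."
  []
  (proxy [StdDeserializer] [List]
    (deserialize [^JsonParser p ^DeserializationContext ctxt]
      ;; Delegate element decoding to the mapper's generic Object deserializer.
      (let [^JsonDeserializer deser (.findNonContextualValueDeserializer
                                      ctxt (.constructType ctxt Object))]
        (loop [acc (transient [])]
          (if (= JsonToken/END_ARRAY (.nextValue p))
            (let [v (persistent! acc)]
              (if (and (= 2 (count v))
                       (= "!kw" (nth v 0))
                       (string? (nth v 1)))
                (keyword (nth v 1))
                v))
            (recur (conj! acc (.deserialize deser p ctxt)))))))))

;; A separately named module so it is not treated as a duplicate registration.
(def tagged-module
  (doto (SimpleModule. "TaggedKeywords")
    (.addDeserializer List (tagged-list-deserializer))))

(def mapper
  (j/object-mapper {:decode-key-fn true
                    :modules [tagged-module]}))

;; Intended result, if the module overrides the built-in List deserializer:
;; a map whose :system/status value is the keyword :status/good.
(j/read-value "{\"system/status\":[\"!kw\",\"status/good\"]}" mapper)
```

If the override does not take effect in a given version, that would be the point where a change in jsonista itself, such as the deserializer mentioned above, becomes necessary.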