taoensso / nippy

The fastest serialization library for Clojure
https://www.taoensso.com/nippy
Eclipse Public License 1.0

Question on *freeze-fallback* and self-referencing object #181

Open jpmonettas opened 2 weeks ago

jpmonettas commented 2 weeks ago

Hi Peter! Thanks for your amazing tooling and documentation!

I was trying Nippy's freeze on a self-referencing object like this:

$ clj
Clojure 1.12.0
user=> (add-lib 'com.taoensso/nippy {:mvn/version "3.5.0-RC1"})
user=> (require '[taoensso.nippy :as nippy])
user=> (defprotocol PSetGetRef
                 (set-ref [_ v])
                 (get-ref [_]))

user=> (deftype SelfRef [^:unsynchronized-mutable o]
                 PSetGetRef
                 (set-ref [_ v] (set! o v))
                 (get-ref [_] o))

user=> (def sr (SelfRef. nil))
user=> (set-ref sr sr)
user=> (binding [nippy/*freeze-fallback* :write-unfreezable]
                (nippy/freeze sr))
Execution error (NoSuchFieldException) at java.lang.Class/getField (Class.java:2117).
o

What is the correct way of handling this kind of unfreezable object?

ptaoussanis commented 2 weeks ago

Hi Juan, you're very welcome, thanks for saying so!

It looks like the problem in this case can be reduced to a simpler example:

(deftype MyType1 [o])
(thaw (freeze (MyType1. nil))) ; Succeeds

(deftype MyType2 [^:unsynchronized-mutable o])
(thaw (freeze (MyType2. nil))) ; Fails (NoSuchFieldException "o")

I.e. it looks like the current freeze implementation for ITypes doesn't support ^:unsynchronized-mutable fields.

Will dig a little further now...

jpmonettas commented 2 weeks ago

Nice, thanks!

I'm experimenting with Nippy and trying to save/restore FlowStorm recordings to files. The recordings contain pointers to whatever the user has recorded, so I'm trying to see how Nippy behaves with different kinds of objects and how to catch things that can't be serialized.

ptaoussanis commented 2 weeks ago

So on first inspection, it looks like ^:unsynchronized-mutable causes deftype fields to be stored separately from standard fields.

The problem seems to already be apparent here:

(.-o (MyType1. :foo)) ; => :foo
(.-o (MyType2. :foo)) ; => Throws (No matching field found)

This starts going pretty deeply into the bowels of Clojure implementation details, but my first guess from the fact that .-o access fails is that these fields might actually not be intended to be readable from outside the deftype methods?

jpmonettas commented 2 weeks ago

Yeah, you are right. I think this is the code that makes mutable fields non-public: https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/Compiler.java#L4875-L4877

Also, if you decompile both, you get this for the immutable one:

(deftype T [o])
public final class T implements IType
{
    public final Object o;

    ...
}

while the one with the mutable field gives you:

(deftype TM [^:unsynchronized-mutable o])
public final class TM implements IType
{
    Object o;

    ...
}
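
The same visibility difference can be checked from the REPL with plain Java reflection. This is only a rough sketch: `field-public?` and `read-declared-field` are hypothetical helper names, and whether `setAccessible` succeeds is assumed to depend on the JVM's module and security settings.

```clojure
(import 'java.lang.reflect.Modifier)

(deftype T  [o])
(deftype TM [^:unsynchronized-mutable o])

;; Hypothetical helper: is the declared field public?
(defn field-public? [^Class c ^String field-name]
  (Modifier/isPublic (.getModifiers (.getDeclaredField c field-name))))

(field-public? T "o")  ; => true
(field-public? TM "o") ; => false

;; Hypothetical helper: read a non-public field reflectively.
;; Class/getField only sees public fields, so getDeclaredField
;; plus setAccessible is used instead.
(defn read-declared-field [obj ^String field-name]
  (let [f (.getDeclaredField (class obj) field-name)]
    (.setAccessible f true)
    (.get f obj)))

(read-declared-field (TM. :foo) "o")
```

This would be consistent with the NoSuchFieldException coming from `Class/getField`, which does not return non-public fields.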

ptaoussanis commented 2 weeks ago

Ah, nice find 👍 I poked around trying to look for the relevant code but wasn't successful.

So it basically looks like you'll need to export the internal unsynchronized state if you want to serialize it. Hopefully that's not a problem?
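
One way "exporting the state" might look, as a minimal sketch: the `PState` protocol, the `Counter` type, and the `:user/counter` type id are made up for illustration, and the example assumes Nippy's `extend-freeze` / `extend-thaw` macros behave as documented. It does not attempt to handle cyclic references.

```clojure
(require '[taoensso.nippy :as nippy])

;; Hypothetical protocol that exposes the otherwise hidden mutable field
(defprotocol PState
  (get-state [_]))

(deftype Counter [^:unsynchronized-mutable n]
  PState
  (get-state [_] n))

;; Freeze only the exported state, under a custom type id
(nippy/extend-freeze Counter :user/counter
  [x data-output]
  (nippy/freeze-to-out! data-output (get-state x)))

;; Rebuild the object from the thawed state
(nippy/extend-thaw :user/counter
  [data-input]
  (Counter. (nippy/thaw-from-in! data-input)))

(get-state (nippy/thaw (nippy/freeze (Counter. 42))))
```

The custom freezer goes through the protocol method rather than reflection, so it does not depend on the field being public.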

Separately, I'm not sure if your initial question re: freezing self-referencing objects is still independently relevant? I don't know how something like that might behave (or what an example would look like), so I would basically need to try it and see what happens ^^

If you do have another example that fails, please feel free to share and I'll happily take a look.

jpmonettas commented 2 weeks ago

Sure, I'm just trying to see what kinds of objects are serializable by Nippy and how freeze behaves in the different cases where they are not, and to figure out what I can do in those cases. Since I'll be serializing arbitrary objects not under my control, it needs to handle every case. The self-referencing thing was just an example of an object graph with cycles.

jpmonettas commented 2 weeks ago

I also tried this self-referencing object and it seems to be working fine, but I'm not sure what this content is about 🤔:

(import 'java.util.ArrayList)

(def a (ArrayList.))
(.add a a)

(nippy/thaw (nippy/freeze a))

#:nippy{:unthawable
        {:type :serializable,
         :cause :quarantined,
         :class-name "java.util.ArrayList",
         :content
         [-84, -19, 0, 5, 115, 114, 0, 19, 106, 97, 118, 97, 46, 117,
          116, 105, 108, 46, 65, 114, 114, 97, 121, 76, 105, 115,
          116, 120, -127, -46, 29, -103, -57, 97, -99, 3, 0, 1, 73,
          0, 4, 115, 105, 122, 101, 120, 112, 0, 0, 0, ...]}}

(-> (nippy/thaw (nippy/freeze a)) :nippy/unthawable :content count)
63
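
A possible reading of that output, offered as an assumption rather than a confirmed answer: the object was frozen through Java's Serializable support, and on thaw Nippy quarantined the raw bytes because `java.util.ArrayList` is not on its thaw allowlist. If so, a sketch like the following, which assumes `*thaw-serializable-allowlist*` accepts a set of class-name strings, might thaw the real object:

```clojure
(require '[taoensso.nippy :as nippy])
(import 'java.util.ArrayList)

(def a (ArrayList.))
(.add a a)

(def frozen (nippy/freeze a))

;; Assumption: allowing this one class at thaw time lifts the quarantine
(def thawed
  (binding [nippy/*thaw-serializable-allowlist* #{"java.util.ArrayList"}]
    (nippy/thaw frozen)))

(type thawed)
```

Allowlisting classes for Java deserialization has security implications, so this would only make sense for data from a trusted source.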

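Also interesting: this seems to be serializable with the Java serialization system:

(import '[java.io FileInputStream FileOutputStream ObjectInputStream ObjectOutputStream]
        'java.util.ArrayList)

(def a (ArrayList.))
(.add a a)

(def java-ser-file "./java-ser-bytes")

;; Write the self-referencing list with Java serialization
(with-open [oos (ObjectOutputStream. (FileOutputStream. java-ser-file))]
  (.writeObject oos a)
  (.flush oos))

;; Read it back
(def data-in
  (with-open [ois (ObjectInputStream. (FileInputStream. java-ser-file))]
    (.readObject ois)))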

(type data-in) ;; => java.util.ArrayList
(type (first data-in)) ;; => java.util.ArrayList