taoensso / carmine

Redis client + message queue for Clojure
https://www.taoensso.com/carmine
Eclipse Public License 1.0
1.15k stars 131 forks source link

Breaking change in byte array serialization #83

Closed mishok13 closed 10 years ago

mishok13 commented 10 years ago

There was an incompatible change introduced between Carmine versions 2.4.6 and 2.6.0 (probably introduced in d07dd22c783022a0c85e7bc51a1bf4e19020a5f3). To illustrate, here's setting a key with Carmine 2.4.6:

> (require '[taoensso.carmine :as r])
nil
> (r/wcar {:pool nil :spec {:host "localhost"}}
          (r/set (byte-array (map byte (range 5))) 42))
nil

And Redis MONITOR shows this:

1398847371.194334 "SET" "\x00<\x00\x01\x02\x03\x04" "42"

When trying to fetch that same key from Carmine 2.6.0, I get nil:

> (require '[taoensso.carmine :as r])
nil
>  (r/wcar {:pool nil :spec {:host "localhost"}}
           (r/get (byte-array (map byte (range 5)))))
nil

And this is what Redis MONITOR shows:

1398847677.699256 "GET" "\x00>NPY\x01\x1ad\x18\x00\x00\x00\\\x00\x00\\x00<\\x00\x00\\x00\x01\\x00\x02\\x00\x03\\x00\x04"

There's no workaround for this issue, since trying to send the serialized key as a string is impossible:

> (r/wcar {:pool nil :spec {:host "localhost"}}
          (r/get "\u0000<\u0000\u0001\u0002\u0003\u0004"))
Exception Args can't begin with null terminator  taoensso.carmine.protocol/ensure-reserved-first-byte (protocol.clj:56)

I know it was fairly stupid to use byte arrays as keys in the first place, but since we ran into this situation, maybe there's a way out of it other than sticking to Carmine 2.4.6 until we fix all the places that use the database?

mishok13 commented 10 years ago

Oh, there's a workaround using taoensso.carmine.protocol/raw:

> (r/wcar {:pool nil :spec {:host "localhost"}}
          (r/get (taoensso.carmine.protocol/raw (byte-array (map byte (concat [0 60] (range 0 5)))))))
"42"

Seems a bit hackish but at least it's a start.

ptaoussanis commented 10 years ago

Hi Andrii,

Sorry was out all of today, didn't have a chance to look at this. Will check tomorrow and get back to you ASAP!

Thanks for the report.

ptaoussanis commented 10 years ago

Okay, following up:

I know it was fairly stupid to use byte arrays as keys in the first place

Not at all, that's a perfectly reasonable approach. You're running into trouble due to a small but serious bug I introduced in v2.6.0. Appreciate the detailed error report btw, that was helpful.

The correct behaviour (and behaviour prior to v2.6.0) is for binary args to be sent to Redis unmodified. v2.6.0, however, was serializing binary args (your key in this case), which was adding an unnecessary header.

I've just pushed v2.6.1 which restores the correct behaviour.

If you never wrote data to binary keys with v2.6.0

v2.6.1 should work out-the-box, though please confirm.

If you wrote data to binary keys with v2.6.0

In this case you'll have an issue since the keys will have been written incorrectly (with the unnecessary header). Here's a util that you can use to migrate/rewrite any faulty keys (please backup your data before use, and test after):

(require '[taoensso.encore  :as encore])
(require '[taoensso.carmine :as car])
(defn fix-carmine-v2-6-0-binary-keys [conn-opts binary-keys]
  {:pre [(vector? binary-keys)
         (every? encore/bytes? binary-keys)]}
  (doseq [k binary-keys]
    (let [unserialized-k (car/raw k)                ; Correct key
          serialized-k   (car/raw (nippy/freeze k)) ; Incorrect key
          [serialized-k-exists? unserialized-k-exists?]
          (car/wcar conn-opts (car/parse-bool
                                (car/exists serialized-k)
                                (car/exists unserialized-k)))]
      (when serialized-k-exists?
        (println)
        (println "Serialized key found:" (seq (:ba serialized-k)))
        (if-not unserialized-k-exists?
          (do
            (car/wcar conn-opts (car/rename serialized-k unserialized-k))
            (assert (not (wcar conn-opts (car/parse-bool (car/exists serialized-k)))))
            (assert      (wcar conn-opts (car/parse-bool (car/exists unserialized-k))))
            (println "Moved to:" (seq (:ba unserialized-k))))
          (do
            (println "*** WARNING ***")
            (println "Coudln't move," (seq (:ba unserialized-k))
              " already exists."))))))
  (println)
  (println "Finished!"))

(comment (fix-carmine-v2-6-0-binary-keys {}
           [(byte-array [(byte 1) (byte 2) (byte 3)])]))

Usage instructions

Call the above util with all binary keys that may have been written with Carmine v2.6.0. The utility will check each key to see if it may have been written incorrectly. When written incorrectly and when no other key would be overwritten, the util will move the binary key to the correct location.

See your console output for a summary of changes.


Does that help? Again, sorry for the trouble.

mishok13 commented 10 years ago

@ptaoussanis Whoa, that's a quick turnaround time. Thanks!

Fortunately, we didn't write keys with 2.6.0, thus the regression was fairly minimal. I've ran our tests with 2.6.1 and the issue is not reproducible anymore. Once again, thanks for the quick fix!

ptaoussanis commented 10 years ago

Okay, great. Closing this for now, but please feel free to reopen if you run into any further issues. Cheers :-)

danoyoung commented 10 years ago

Hello, I'm new to clojure, so I might be doing something bonehead, but I have hash-map that I'm trying to save to redis like this:

(defn index-into-redis [^clojure.lang.PersistentArrayMap d] "index into redis the guid as the hash key, and then the hh-id and individual-id (if defined" (cond (> (:individual-id d) 0) (wcar* (car/hmset (:guid d) "h" (:household-id d) "i" (:individual-id d))) :else (wcar* (car/hmset (:guid d) "h" (:household-id d)))))

with redis monitor I see:

"HMSET" "2014050713161675967810227973" "h" "\x00>NPY\x01\t -\x00\x00\x00\x04\x05\xf3\x89=" "HMSET" "2013090208002802711520731100" "h" "\x00>NPY\x01\t -\x00\x00\x00\x04\x05\xf3\x89\x19" "i" "\x00>NPY\x01\t -\x00\x00\x00\x04\t1n%" "HMSET" "2013090611333036575610322477" "h" "\x00>NPY\x01\t -\x00\x00\x00\x04\x05\xf3\x87\xcc" "HMSET" "2014010722111992654120600998" "h" "\x00>NPY\x01\t -\x00\x00\x00\x04\x05\xf3\x87\xcc" "HMSET" "2014021507385346247020706002" "h" "\x00>NPY\x01\t -\x00\x00\x00\x04\x05\xf3\x86\xac"

If I call pr-str on the map key, it all looks correct EXCEPT the key, ie.

(defn index-into-redis [^clojure.lang.PersistentArrayMap d] "index into redis the guid as the hash key, and then the hh-id and individual-id (if defined" (cond (> (:individual-id d) 0) (wcar* (car/hmset (pr-str (:guid d)) "h" (pr-str (:household-id d)) "i" (pr-str (:individual-id d)) )) :else (wcar* (car/hmset (pr-str (:guid d)) "h" (pr-str (:household-id d)))) ))

Also, if I do the above, I notice some of the keys are coming thru like: "HMSET" "\"2012011411523302578121206317\"" "h" "99848315" "i" "154236420" "HMSET" "\"2014010619212835348111016683\"" "h" "99848252" "HMSET" "\"2011013021242711818921323986\"" "h" "99848225" "i" "450870108"

The way that works is pr-str around the hash values and not the key, ie.

"HMSET" "2012011411523302578121206317" "h" "99848315" "i" "154236420" "HMSET" "2014010619212835348111016683" "h" "99848252" "HMSET" "2011013021242711818921323986" "h" "99848225" "i" "450870108"

This is what the cascalog returns:

{:guid "2012011411523302578121206317", :household-id 99848315, :individual-id 154236420} OK {:guid "2014010619212835348111016683", :household-id 99848252, :individual-id -1} OK {:guid "2011013021242711818921323986", :household-id 99848225, :individual-id 450870108} OK

I'm using v2.6.2 w/n cascalog and Redis version 2.8.9....

ptaoussanis commented 10 years ago

Hi Dan!

Sorry, could you clarify what your question is exactly - not quite sure what you're asking? :-)

Some random comments in the meantime in case they're helpful:

Otherwise just shout if you still have any questions - preferably under a new issue so we can keep the discussion going there :-)

Cheers!

danoyoung commented 10 years ago

@ptaoussanis thank you for the feedback and insight. I'm starting to understand this a bit more.