taoensso / carmine

Redis client + message queue for Clojure
https://www.taoensso.com/carmine
Eclipse Public License 1.0

Does Carmine support DragonflyDB? #304

Closed: tom-adsfund closed this issue 1 month ago

tom-adsfund commented 2 months ago

DragonflyDB is an efficient multicore alternative to Redis, which recently got vector search.

Presumably you can use this with Carmine, but maybe it could be documented?

ptaoussanis commented 2 months ago

Hi Tom, I've never tried Dragonfly with Carmine so don't know, sorry.

tom-adsfund commented 2 months ago

The mention of DragonflyDB was really just a motivation. Redis has also supported vector search for a long time. I'm just not sure how you use this basic Redis feature with Carmine.

ptaoussanis commented 2 months ago

Sorry, it wasn't clear what you were asking. Carmine can execute any commands against Redis that your version of Redis supports.

The wiki has info on how to call Redis commands from Carmine.

Hope that helps!

tom-adsfund commented 2 months ago

Thanks! Yeah, not ideal wording etc on my part.

I can see that redis-call can be used easily with strings, and so there's at least one method I could do using that. But is there a way I can use Clojure vectors? This is I guess the reason I'm asking: in a Clojure world it would be nice to directly supply a vector.

Obviously not a major issue at all to go the string route, but maybe there's something obvious from your side. If not, I'm happy for this to be closed.

ptaoussanis commented 2 months ago

> is there a way I can use Clojure vectors? This is I guess the reason I'm asking: in a Clojure world it would be nice to directly supply a vector.

I don't know what you mean by "use" or "supply" Clojure vectors, sorry.

It would help if you could link to the docs for the particular command you want to run, and show an example of the input argument/s you'd like to use, and what result you'd like to see.

tom-adsfund commented 2 months ago

https://redis.io/docs/latest/develop/interact/search-and-query/advanced-concepts/vectors/

On this page, they use:

np_vector = np.random.rand(dim).astype(np.float32)
redis_conn.hset('key', mapping = {vector_field: np_vector.tobytes()})

And so in Clojure you might expect something like:

(carmine/hset "key" [1.0 2.0 3.0])

The use of the vector index is more involved, also on that page.

Presumably you could have a vhset, to disambiguate away from the helpful Nippy support.

Ideally you could then have a helper function for similarity search.

Something like:

(carmine/vhsimilarity [1.1 2.2 3.3] "index" 5)

Or whatever is realistic (supporting parameters).

ptaoussanis commented 2 months ago

I'm not familiar with this particular API, and you've linked here to a multi-page document which I'm not going to have the opportunity to read in detail right now.

> On this page, they use:

This appears to be Python? Again, this isn't something I can easily parse without spending 15 minutes trying to grok unfamiliar documentation with an unfamiliar client.

I presume the client's ultimately just issuing commands over the standard Redis client protocol, which means it's just a sequence of string arguments that should in principle be similar/identical via Carmine's redis-call.

Have you tried playing with these commands via redis-call?
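
As a rough illustration of the "sequence of string arguments" point, here is a minimal sketch of how a top-k similarity query could be assembled as a plain argument vector. The FT.SEARCH KNN syntax follows the Redis vector search docs linked in this thread; the function name knn-request, the index name "idx", the field name "vector_field", and the parameter name "vec" are all assumptions made for illustration, not part of Carmine.

```clojure
;; Hypothetical helper: builds the argument vector for an FT.SEARCH KNN query.
;; Nothing here talks to Redis; it only shows the shape of the request.
(defn knn-request
  [index field k query-blob]
  ["FT.SEARCH" index
   (str "*=>[KNN " k " @" field " $vec AS score]")
   "PARAMS" "2" "vec" query-blob   ; one name/value pair, so the count is 2
   "DIALECT" "2"])                 ; the docs run vector queries with dialect 2

;; Untested sketch of sending it, assuming taoensso.carmine is required as
;; `car`, a server with the search module is running, and `blob` holds the
;; query vector as a byte array:
;; (car/wcar {} (car/redis-call (knn-request "idx" "vector_field" 5 (car/raw blob))))
```

The query vector is passed as a parameter rather than spliced into the query string, so the only non-string argument is the byte blob itself.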

> (carmine/hset "key" [1.0 2.0 3.0])

It's not clear exactly what you're trying to achieve here though? In what way would the syntax of vhset be preferable to just calling (carmine/hset "key" 1.0 2.0 3.0), etc.?

tom-adsfund commented 2 months ago

So the Python is taking a float32 type array, and getting the bytes, and using the client to manage whatever conversion is necessary (I don't know what this is, but see below*) to put it as an argument to HSET.

The reason is that the index expects a vector of n float values to be in this format in an HSET entry.

(*) The example on the page of using the index uses a byte encoding like "\x00".

I'm not familiar enough with the whole problem either to make more progress. I'm coming at this more as someone wanting to quickly store vectors and retrieve the top-k similar. So hopefully this explains the API I'm requesting.
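
To make the "\x00"-style encoding concrete, the sketch below shows what a float32 vector looks like once packed into bytes on the JVM. It assumes little-endian byte order, which is what NumPy's tobytes() produces on common x86 and ARM machines; the function names float32-le-bytes and hex-escaped are made up for this example.

```clojure
(import '(java.nio ByteBuffer ByteOrder))

;; Packs a seq of numbers into consecutive 4-byte IEEE 754 float32 values.
;; Little-endian order is an assumption, chosen to match typical NumPy output.
(defn float32-le-bytes
  ^bytes [xs]
  (let [^ByteBuffer buf (doto (ByteBuffer/allocate (* Float/BYTES (count xs)))
                          (.order ByteOrder/LITTLE_ENDIAN))]
    (doseq [x xs]
      (.putFloat buf (float x)))
    (.array buf)))

;; Renders a byte array in the \xNN notation used in the Redis docs.
(defn hex-escaped
  [^bytes bs]
  (apply str (map #(format "\\x%02x" (bit-and % 0xff)) bs)))

(println (hex-escaped (float32-le-bytes [1.0])))
;; prints \x00\x00\x80\x3f
```

Each element takes exactly four bytes, so a vector of n floats becomes a blob of 4n bytes.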

zerg000000 commented 2 months ago

This is unrelated to the topic of the issue. If you want to ask questions, you might be better off going to the Clojure Slack.

In case you still need it,

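(import '(java.nio ByteBuffer ByteOrder))

;; Packs a float array into raw bytes, one IEEE 754 float32 per element.
;; ByteBuffer defaults to big-endian, while NumPy's tobytes() on common
;; hardware yields little-endian, so the byte order is set explicitly.
(defn floats->bytes [^floats xs]
  (let [buffer (doto (ByteBuffer/allocate (* (alength xs) Float/BYTES))
                 (.order ByteOrder/LITTLE_ENDIAN))]
    (areduce xs i ret buffer
             (.putFloat ret (aget xs i)))
    (.array buffer)))

;; Assumes (require '[taoensso.carmine :as car]) and a reachable Redis server.
;; car/raw sends the byte array as-is instead of serializing it.
(car/wcar {} (car/hset "key" "vector_field"
                       (car/raw
                        (floats->bytes
                         (float-array [1.0 2.0 3.0])))))

;; Reads the stored field back to check the write.
(car/wcar {} (car/hget "key" "vector_field"))

tom-adsfund commented 2 months ago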

Thanks! That's basically an implementation of vhset. Have you tried this in practice? Are the float types the same?
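
On the question of whether the float types match: a Java float and NumPy's float32 are both 32-bit IEEE 754 values, so the part that can differ is byte order. The sketch below is a local round trip with no Redis involved; it suggests that bytes written in one order and read in the other do not decode to the original values. The names encode-floats and decode-floats are hypothetical helpers for this example only.

```clojure
(import '(java.nio ByteBuffer ByteOrder))

;; Writes each number as a 4-byte float32 using the given byte order.
(defn encode-floats
  ^bytes [^ByteOrder order xs]
  (let [^ByteBuffer buf (.order (ByteBuffer/allocate (* Float/BYTES (count xs))) order)]
    (doseq [x xs]
      (.putFloat buf (float x)))
    (.array buf)))

;; Reads a byte array back into a vector of floats using the given byte order.
(defn decode-floats
  [^ByteOrder order ^bytes bs]
  (let [^ByteBuffer buf (.order (ByteBuffer/wrap bs) order)]
    (vec (repeatedly (quot (alength bs) Float/BYTES) #(.getFloat buf)))))

(def v [1.0 2.0 3.0])

;; Same order on both sides: the values survive the round trip.
(println (decode-floats ByteOrder/LITTLE_ENDIAN
                        (encode-floats ByteOrder/LITTLE_ENDIAN v)))

;; Mixed orders: the decoded numbers no longer match the input.
(println (decode-floats ByteOrder/LITTLE_ENDIAN
                        (encode-floats ByteOrder/BIG_ENDIAN v)))
```

If the index returns unexpected distances, comparing the stored bytes against what a Python client writes for the same vector would be a reasonable first check.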