taoensso / faraday

Amazon DynamoDB client for Clojure
https://www.taoensso.com/faraday
Eclipse Public License 1.0
238 stars 84 forks source link

Problems with LazySeqs #99

Closed f-f closed 4 years ago

f-f commented 7 years ago

When an empty sequence is put into Dynamo, an exception is thrown: Exception Unknown DynamoDB value type: class clojure.lang.LazySeq. Seefreezefor serialization. taoensso.faraday/clj-val->db-val (faraday.clj:276)

However, when debugging it would be useful to have some more detail on which value it is - e.g. when working with nested maps/vectors.

ptaoussanis commented 7 years ago

PR welcome if you'd like me to consider a specific implementation, thanks.

f-f commented 7 years ago

I'm still trying to figure out exactly when the lazyseqs get refused. PR coming as soon as I understand what's going on :)

f-f commented 7 years ago

Ok apparently the lazyseqs get refused even after realizing, e.g. (doall (map somefunc mycoll) doesn't work. @ptaoussanis what would be your take on casting the lists into vectors before feeding them to Dynamo?

ptaoussanis commented 7 years ago

Hi there, I'm sorry- don't really understand from the details provided what you're actually trying to do here.

f-f commented 7 years ago

Oh sorry, my fault for not explaining in detail what's happening. Background is that I'm porting a codebase from Mongo to Dynamo. Objects thrown into the db are maps that have sequences as values to some keys. Those seqs could be vectors, lists, lazyseqs. Now, in monger everything was fine, as apparently the db wrapper was doing all the work of magically casting the seqs. Here on Faraday if I put a vector or a list everything is fine, but if it's a lazysequence (even a realized one, e.g. (doall (map ...) Faraday panics, so as a bandaid I'm wrapping everything in a (into [] myseq) and it works, but it's not nice. Additionally, I noticed that in the case the sequence I map over it's empty, a (doall (map f emptyseq)) still returns a lazyseq, is it a Clojure bug?

Hope this gives you a bit of context about my question "why not realize every lazyseq that comes in before putting it in Dynamo?"

ricardojmendez commented 7 years ago

@ff- Can you provide a small example project?

f-f commented 7 years ago

@ricardojmendez yep, here follows a minimal working example, where Faraday complains about the LazySeq in mylist:

(ns myns.db
  (:require [taoensso.faraday :as far])
  (:gen-class))

(def opts {:access-key "your-access-key"
           :secret-key "your-secret-key"
           :endpoint   "http://localhost:8000"})

(far/ensure-table opts :mytable [:id :s]
                  {:throughput {:read 1 :write 1}
                   :block? true})

(far/update-item opts :mytable
                 {:id "test"}
                 {:update-expr "SET mylist = :mylist"
                 :expr-attr-vals 
                  {":mylist" (->> (range 10)
                                  (map identity)
                                  ;(into []) 
                                  ;; ^ if you uncomment this it works fine
                                  (doall))}
                 :return :all-new})
f-f commented 7 years ago

@ricardojmendez I updated the example above to be more self-contained ^

singen commented 7 years ago

I've run into this issue with faraday as well, many times. I think it would make sense for faraday to coerce the LazySeq type into something that the Dynamodb client understands.

f-f commented 7 years ago

@singen I thought a lot about this, and I went as far as writing a little wrapper over Faraday that takes care of this and other things, and switched to using mapv by default over map in my programming (this alone fixed all my pain with lazyseqs).

Now, IMHO it would be useful to have this "coercing" behavior in Faraday by default - to avoid running into this issue and spend hours hunting down the bug if the data structure is complex enough like it was in my case - but the easy objection is "what if we have an infinite LazySeq?", and I don't have an answer for it yet.

singen commented 7 years ago

In the case of an infinite LazySeq, we would have to cut it to some maximum length anyway. I'm pretty sure Faraday could throw an exception in case the maximum length is exceeded.

ptaoussanis commented 7 years ago

Thanks for the example @f-f. Sorry for the delay replying.

From the added context, would like to suggest that there's no need for you to be reaching for laziness here and that it'd be better (for a few reasons including performance and intention/readability) to avoid it altogether.

(mapv identity (range 10)) or equivalent would be more efficient and (notably) less complex.

Re: the general question of adding laziness support to Faraday-

Think this'd be a mistake since:

I.e. would suggest that using laziness when one doesn't specifically need/want laziness is an anti-pattern. I will note that it's unfortunately still a fairly common pattern in Clojure due to historical reasons (lack of good non-lazy alternatives in original core API, and some older books encouraging laziness by default).

ptaoussanis commented 7 years ago

To clarify: would still be up for seeing PRs that help identify accidental laziness, etc. for better error reporting.

kipz commented 4 years ago

So let's keep this open, update the readme and improve the error message.