AppsFlyer / pronto

Clojure support for protocol buffers
110 stars 8 forks source link

`pcond->` issue with unknown keywords in `assoc-in` #12

Closed evg-tso closed 2 years ago

evg-tso commented 2 years ago

Hi,

When using pcond->, and the form expression contains an unknown keyword, the macro fails even if the conditional expression is evaluated as false.

  ; works
  (pronto/pcond-> (pronto/proto-map my-mapper
                                    PersonOuterClass$Person
                                    :name "Name")
                  (some? nil) (assoc-in [:address :city] "Tel-Aviv"))

  ; don't work
  (pronto/pcond-> (pronto/proto-map my-mapper
                                    PersonOuterClass$Person
                                    :name "Name")
                  (some? nil) (assoc-in [:b :city] "Tel-Aviv"))

Link to a minimal git repository that reproduces the issue

nenorbot commented 2 years ago

Thanks for the detailed issue :)

You've hit an undocumented quirk of pcond-> -- I'll try to explain what's happening before discussing what we can do.

To explain, I'd like to start with p-> because it's simpler and pcond-> works similarly.

When you write:

(p-> person
       (assoc-in [:address :city] "Tel Aviv")
       (assoc-in [:address :street] "whatever")
       (assoc-in [:address :house_num] 11))

what actually gets expanded is something like this:

(let [address (transient (:address person))]
   ;; now we have a mutable address proto-map that's cheap to mutate
   (assoc! address :city "Tel Aviv")
   (assoc! address :street "whatever")
   (assoc! address :house_num 11)
   (assoc person :address (persistent! address)))

So what we've done is use a single transient address and pipeline (hence the p in p-> :slightly_smiling_face:) all assoc operations on it.

With pcond->, a similar transformation happens, it generates a read-modify-write on a transient. Going back to your example:

(pronto/pcond-> (pronto/proto-map my-mapper
                                    PersonOuterClass$Person
                                    :name "Name")
                (some? nil) (assoc-in [:b :city] "Tel-Aviv"))

becomes:

(let [b (transient (:b person))]
   (when (some? nil)
      (assoc! b :city "Tel Aviv"))
   (assoc person :b (persistent! b)))

So even though the operation is never applied, we still try to read :b at first, at which point this fails because you cannot read a key not in the schema from a proto-map, by design.

The take-away is that p*-> macros generate read-modify-write code whose structure is taken from the keys used by assoc/assoc-in/etc operations that are requested. In the case of pcond-> this becomes problematic if one of these keys is wrong because it leads to surprising behavior when the predicate is falsey as in your example.

For now, I'm leaning towards keeping this as a known limitation, because I'm not happy with any of the solutions that come to mind at the moment:

  1. We could "lift up" the predicates:

    (when (some? nil)
    (let [b (transient (:b person))]
    (assoc! b :city "Tel Aviv"))
    (assoc person :b (persistent! b)))

    This fixes this particular issue, but misses the point of pipelining in pcond-> making it pretty pointless.

  2. Use something like a delay to defer reading the field only if really needed (if the predicate was truthy):

    (let [b (delay (transient (:b person)))]
    (when (some? nil)
      (assoc! @b :city "Tel Aviv"))
    (when (realized? b)
      (assoc person :b (persistent! @b))))

    This approach might work but adds complexity for what is a narrow use-case. It will also adds allocations and volatile reads, which I'd prefer to avoid introducing into these macros, if possible.

-- neither options is great. I'll give it some thought and see if I can come up with any alternatives. But otherwise this might become a known-issue (or perhaps label it as a feature that ensures your code is schema-compliant?? :wink:).

If you have any input, I'd love to hear.

evg-tso commented 2 years ago

Thanks for the detailed response!

I don't have additional input, I tend to agree with your suggestion. I agree that documenting this behavior in any way is good practice (either as a feature or a known issue).