redplanetlabs / specter

Clojure(Script)'s missing piece
Apache License 2.0
2.51k stars 102 forks source link

remove first matching element from a vector #300

Closed xificurC closed 3 years ago

xificurC commented 3 years ago

I found this strange behavior today. The question was "given this vector [...] how to remove the first occurrence of 2?".

(def v [8 2 5 2 9])
(require '[com.rpl.specter :as $])
($/setval [($/filterer ($/pred= 2)) $/FIRST] 'X v) ; indeed, [8 X 5 2 9]
($/setval [($/filterer ($/pred= 2)) $/FIRST] $/NONE v) ; oh? [8 2 5 9]

From some docstrings i guess what is happening is that filterer removes NONEs from the filtered seq and reattaches the rest into their places, leaving a hole in the last position. Not very intuitive!

What would be a correct solution?

xificurC commented 3 years ago

This is the best I managed so far

($/transform [($/subselect $/ALL ($/pred= 2))] #(cons $/NONE (rest %)) v)
xificurC commented 3 years ago

Ah, filterer is like subselect with ALL, so this returns the same

($/transform [($/filterer ($/pred= 2))] #(cons $/NONE (rest %)) v)

The tricky thing is that calling FIRST and setting that to NONE does something else. I guess the filterer step doesn't see the NONE anymore, it sees a shortened sequence and extends it at the end, deleting the last element.

xificurC commented 3 years ago

I don't see an easier way, feel free to give one though. From my limited knowledge I would implement $/first

($/setval ($/first #(= 2 %)) $/NONE v)

Maybe the parameter(s) could be a path as well? :)

xificurC commented 3 years ago

The fact that setval 0 and setval NONE behave differently on [(filterer (pred= 2)) FIRST] feels like a bug. The problem is that FIRST eats up the NONE whereas we want filterer to intrepret the NONE.

xificurC commented 3 years ago

Indeed it is FIRST eating the NONE, here is another variant that doesn't interpret NONE:

($/defnav NFIRST []
        (select* [this structure next-fn] (next-fn (first structure)))
        (transform* [this structure next-fn]
                    (let [rt (next-fn (first structure))]
                      (into (conj (empty structure) rt) (rest structure)))))

With this the example works:

($/setval [($/filterer ($/pred= 2)) NFIRST] $/NONE v)

Still, the path definition isn't as declarative as one would like. For that a first or nth that takes a function/path would be required.

nathanmarz commented 3 years ago

This is just how the navigators work. The way subselect / filterer work when transformed to a smaller sequence is really the only way that makes sense. Your NFIRST navigator is a fine way to get the extra control. You could also have a non-removing analogue to nthpath to accomplish the same thing.

Closing because not a bug.

xificurC commented 3 years ago

Hi @nathanmarz , thanks for the quick response!

Would a set of functions like first, nth, and last interest you in form of a PR? Here is a quick impl for first. For simplification I only considered vectors in transform for a POC. I also wasn't sure what would be correct behavior when nothing is found, so I just left the structure untouched.

($/defdynamicnav first [& path]
  ($/late-bound-nav [late ($/late-path path)]
    (select* [this structure next-fn]
      (if-some [v (reduce (fn [_ nx] (when ($/select-first late nx) (reduced nx))) nil structure)]
        (next-fn v) $/NONE))
    (transform* [this structure next-fn]
      (let [idx (reduce (fn [i nx] (if ($/select-first late nx) (reduced i) (unchecked-inc i))) 0 structure)]
        (if (= idx (count structure))
          structure
          (assoc structure idx (next-fn (structure idx))))))))
nathanmarz commented 3 years ago

I would accept something like first-match that takes in an arbitrary path and only selects/transforms the first navigated element. However, I wouldn't accept something like this just for sequences as it's too specific.

xificurC commented 3 years ago

I was looking for something like that in the wiki and didn't find anything. After thinking about it I thought it's missing because it's not possible to build such a navigator. Can you give a hint on how it would be implemented?

To be clear, you mean this would work:

($/setval ($/first-match $/ALL ($/pred= 2)) $/NONE v)
nathanmarz commented 3 years ago

Yea, so (setval (first-match ALL even?) NONE [1 3 4 6 8 9]) would return [1 3 6 8 9]. I think it could be built doing a transform with the subpath that uses a volatile flag to determine whether to apply the update function or return the value unchanged.

It would still end up doing a full traversal, so performance-wise it would be far from optimal. Specter's abstractions as-is would need to be extended for transform to be able to terminate early.

xificurC commented 3 years ago

Yeah I just whipped up this

(defn one []
  (let [s (volatile! true), t (volatile! true)]
    ($/nav []
     (select* [this structure next-fn] (if @s (do (vreset! s false) (next-fn structure)) $/NONE))
     (transform* [this structure next-fn] (if @t (do (vreset! t false) (next-fn structure)) structure)))))
xificurC commented 3 years ago

Just to be clear, (setval! [ALL even? (one)] ...) works with this.

I'm not sure what you mean by it being sub-optimal though

xificurC commented 3 years ago

I'm not sure why I used 2 volatiles :) Anyway, is this what you meant?

nathanmarz commented 3 years ago

That's an even better way to do it. If you open up a PR with some tests we can iterate on it.

By sub-optimal performance I mean the path will continue to traverse the entire path even after it's matched the first one.

xificurC commented 3 years ago

I understand what you mean by suboptimal now and that's definitely at least documentation noteworthy. I just realized something more problematic:

(def p [$/ALL ($/pred= 2) (one)])
(let [v [2 5 2 8 9]] [($/select p v) ($/select p v)])

This returns [[2] []] on the first run and [[] []] on subsequent ones. Not exactly transparent! Since there's no completion call it's impossible to correctly clean up the volatile. So the only viable alternative is the one I posted, which is a one-shot navigator, which I dislike.

nathanmarz commented 3 years ago

Ah, that's because the inline compiler is too aggressive about evaluating and caching navigator forms with known arguments. We have that fixed in our own private fork of Specter, but now it seems like a good idea to fix that in the open-source version as well. I'll accept a patch for that as well.

xificurC commented 3 years ago

How can that be fixed? The only way I see it is by reinterpreting the entire path, which will be suboptimal. I guess if I could mark the path as non-cachable? Something like ($/fresh-nav (one)) where fresh-nav returns a new navigator on each select/transform call?

xificurC commented 3 years ago

I'll accept a patch for that as well.

For a moment I thought you'd merge your changes back into the open source version :)

nathanmarz commented 3 years ago

Should just be a matter of removing the all-static? check in magic-precompilation* that eagerly resolves the form with an apply.

xificurC commented 3 years ago

Sorry, went on to do other things. I tried the navigator after removing the all-static? check and the result is the same. After printing some debug information it seems magic-precompilation* doesn't enter that path at all, it goes the VarUse path in this case..

xificurC commented 2 years ago

Bumping this. Also, any chance of open sourcing some of your specter improvements?

nathanmarz commented 2 years ago

Our internal fork has diverged too much to practically merge changes back. Happy to accept a PR that solves this from scratch though.

xificurC commented 2 years ago

You can release it as a new lib then?

I'll gladly patch this if you guide me, your last tip was a dead end.