Closed tomdl89 closed 1 year ago
I gave it the name partition-upto
for consistency with take-upto
and drop-upto
but I think partition-after
or somesuch may make more sense.
I think this seems reasonable. I seem to recall I've run into a use-case for this, myself. I do think that partition-after
might be more accurate.
I'd also like to get some specific use cases mentioned in this PR as well. Some examples of when this would be useful would help justify it.
Cool, I'll take a look at the comment(s) and listing some use cases. I did have a thought that partition-after
might (rightfully) imply the need for a partition-before
which would use take-while
to generate run
in the lazy form. Not sure what you think about having both?
Having both is reasonable as long as there's no easy existing way to achieve either. I don't think there is, but I'd like to double check to ensure we're not reimplementing something that can be done conventionally.
I've changed the name and fixed the redundant let binding. lmk if you think it needs anything else to be merged 🙏.
Sorry for the noise. I've added partition-before
because I don't think it can be easily done conventionally. Ready to review.
In response to
I'd also like to get some specific use cases mentioned in this PR as well.
I grepped for questions on the clojurians slack. There's plenty:
For example, let’s say I have the sequence: '("BEGIN" "a" "b" "c" "d" "END" "BEGIN" "foo" "bar" "END" "BEGIN" "c" "l" "j" "cool" "END") And I would, ideally, group this into a sequence of three sequences: '(("BEGIN" "a" "b" "c" "d" "END") ("BEGIN" "foo" "bar" "END") ("BEGIN" "c" "l" "j" "cool" "END"))
Would be solved by either (partition-after #{"END"} coll)
or (partition-before #{"BEGIN"} coll)
Dumb question but what would be the Clojure way for transforming a list of words into a list of sentences? E.g. [ “Hello.” “How” “are” “you?” “I” “am” “fine.” ] => [ “Hello.” “How are you?” “I am fine.“] I’m thinking it’ll involve a reduce.
Would be solved by (map #(clojure.string/join " " %) (m/partition-after #(re-find #"[.?!]$" %) coll))
I’ve tried partition-by but groups items together. I’ve a collection of has-map ordered and I’ve to split them when a value of an item is null... I was looking more for
[{:k 1} {:k 2} {:k nil} {:k 3} {:k 4}] =>(({:k 1} {:k 2}) ({:k nil} {:k 3} {:k 4}))
something like(split-when f coll)
Would be solved by (m/partition-before #(= [:k nil] (find % :k)) coll)
Is there a way to do something like partition-by but where it creates a new group whenever the value is true, rather than when it changes? I've written a reduce version, but I assume I'm missing a trick with partition-by or another function
(->> [1 3 2 4 1 2 3] (reduce (fn [acc n] (if (odd? n) (conj acc [n]) (update acc (dec (count acc)) #((fnil conj []) % n)))) [])) ;; => [[1] [3 2 4] [1 2] [3]]
Would be solved by (m/partition-before odd? coll)
Trying to figure out a little data transformation puzzle: • I have a list of strings with a start 0 and end 1 marker • Sometimes I get them split up by the API I’m working with, e.g. ["0a1" "0b1" "0c" "c1" "0d1"] • And I want to transform this list so that I get ["0a1" "0b1" "0cc1" "0d1"]
Would be solved by (map clojure.string/join (m/partition-before #(re-find #"^0" %) coll))
I hope these examples are sufficient. There's plenty more but I think they make a decent enough case. FWIW, most answers either suggested using spec, or did an explicitly stateful partition-by
(by using an atom to track need to split).
Thank you for researching up those real-world examples. I think they more than justify the addition of these two functions.
Thanks for the updates! Does inlining the functions lead to a measurable performance difference when benchmarked with something like Criterium?
At the time, I quickly tested with time
and noticed a (probably statistically insignificant) benefit, but having just tested with criterium, the non-inlined version is consistently (but still, probably insignificantly) faster. It's a bit confusing, but anyway, I've removed the inlining commit.
The JVM's JIT compiler can make performance difficult to predict. What may be happening in this case is that Clojure's inlining may be interfering with the JVM's attempts to optimize, so once the JVM is warmed up its ever so slightly slower. This is why I'm often cautious about performance optimizations done in absence of benchmarking - I've been too often bitten by this in the past.
Could you squash your commits down into one? Then I can merge.
That's all squashed now. Thanks for all of the really helpful feedback.
Anything outstanding I can help with?
Thank you for your work on this, and apologies for the delay. This all looks good, so I'll merge it in and look into cutting a new release.
Any plan to announce on clojurians slack #announce ? I can if not.
You can if you want, I wasn't going to as it didn't seem like a major enough change, but maybe I'm being too cautious.
Every few months I see a question on the clojurians slack about how to partition a collection by splitting after an element matches a condition. All solutions I see are either less performant than the solution in this PR or much less readable. This is also something I have written in some form many times before, and I've seen in a couple of code bases I've worked in.
It seems to complement the
take-upto
anddrop-upto
functions nicely too. I mostly followed the source code for these, andpartition-by
in clojure.core, but used volatile vectors to store the partitions in the transducer form rather thanjava.util.ArrayList
so as to make it more cross-platform friendly. All tests pass.