weavejester / medley

A lightweight library of useful Clojure functions
Eclipse Public License 1.0

Proposal: re-find-groups-seq #84

Open hlship opened 9 months ago

hlship commented 9 months ago

For Advent of Code I found I needed a function I'm calling re-find-groups-seq:

(re-find-groups-seq #"(\d+) (red|green|blue)"
                    "5 red, 2 green, 1 blue")
=>
([{:match "5", :within "5 red", :start 0, :end 1} {:match "red", :within "5 red", :start 2, :end 5}]
 [{:match "2", :within "2 green", :start 7, :end 8} {:match "green", :within "2 green", :start 9, :end 14}]
 [{:match "1", :within "1 blue", :start 16, :end 17} {:match "blue", :within "1 blue", :start 18, :end 22}])

Essentially, it's re-seq, but on each match, you get a vector of maps, not strings; the maps identify where in the input string the pattern group matched.

So if there's interest, I'm quite happy to add docs and tests and submit as a PR.

weavejester commented 9 months ago

Thanks for the proposal. So if I understand this correctly, this provides the information you'd get from re-seq, but with the start and end positions of each match?

hlship commented 9 months ago

Yes, that's it. It's like re-seq, but instead of internally calling re-groups (which returns a string or a vector of strings) for each match, it returns a map (or a vector of maps, one per group). Each map identifies the matched string, the full match it sits within, and the start and end offsets of the group in the input string.
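For illustration, here is a minimal sketch of how such a function might be built directly on java.util.regex.Matcher. It is not the code proposed for medley: the body, the helper name group-map, and the choice to return a single map when the pattern has no groups are assumptions made to mirror how re-groups behaves.

```clojure
(defn re-find-groups-seq
  "Hypothetical sketch only. Like re-seq, but each match yields a vector
  of maps (one per capture group) instead of strings. Offsets are
  relative to the whole input string."
  [^java.util.regex.Pattern re ^CharSequence s]
  (let [^java.util.regex.Matcher m (re-matcher re s)
        step (fn step []
               (lazy-seq
                 (when (.find m)
                   (let [within    (.group m)      ; text of the full match
                         n         (.groupCount m) ; number of capture groups
                         ;; hypothetical helper: describe group i of the current match
                         group-map (fn [i]
                                     {:match  (.group m (int i))
                                      :within within
                                      :start  (.start m (int i))
                                      :end    (.end m (int i))})
                         ;; built eagerly, before the mutable matcher advances;
                         ;; assumption: no groups means a single map, as re-groups
                         ;; returns a single string in that case
                         result    (if (zero? n)
                                     (group-map 0)
                                     (mapv group-map (range 1 (inc n))))]
                     (cons result (step))))))]
    (step)))
```

Called with the pattern and input from the first comment, this sketch should produce the same three vectors shown there. A group that does not participate in a match would come back with a nil :match and -1 offsets, since that is what Matcher reports.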

In Advent of Code there was a problem where you were parsing digits from an input line and needed to know where on the line the matched digits were. This utility made that easy, and let the regular expression executor do the ugly work (such as keeping track of the start position of the match).

weavejester commented 9 months ago

Could you give some code demonstrating this use-case, and suggest any other use-cases where this functionality might be useful?