weavejester / medley

A lightweight library of useful Clojure functions
Eclipse Public License 1.0
870 stars 67 forks source link

Suggested addition: index-and-map-by #50

Open vvvvalvalval opened 4 years ago

vvvvalvalval commented 4 years ago

Wondering if you'd be open to a function index-and-map-by generalizing over index-by like the following:

(defn index-and-map-by
  [kf vf coll]
  (into {} 
    (map 
      (fn [x]
        [(kf x) (vf x)]))
    coll))

(Of course, you might want to improve the peformance of this implementation, e.g with reduce and transients).

(index-and-map-by kf vf coll) would be equivalent to (->> coll (index-by kf) (map-vals vf))

I'm asking because I keep defining this index-and-map-by function in my projects.

weavejester commented 4 years ago

Why not just use (->> coll (index-by kf) (map-vals vf))? It's not particularly more verbose, and it makes it clear which function is the indexer and which function maps the values. Or alternatively, (->> coll (map vf) (index-by kf)) would work as well, no? And that's only four characters longer.

vvvvalvalval commented 4 years ago

Why not just use (->> coll (index-by kf) (map-vals vf))? It's not particularly more verbose, and it makes it clear which function is the indexer and which function maps the values.

What can I say, the main reason is frequency :) . This pattern occurs so often in my code that I will eventually add this utility function anyway.

Another reason might be performance of course, doing one scan and indexing instead of two, though I concede that's less problematic in general.

Or alternatively, (->> coll (map vf) (index-by kf)) would work as well, no? And that's only four characters longer.

No, because in vf you typically do a projection that loses the information required for indexing by kf, for instance (def user-id->name (index-and-map-by :id :name users)).

weavejester commented 4 years ago

No, because in vf you typically do a projection that loses the information required for indexing by kf

Right. So map-vals needs to be used, which means that a single function would save 9 characters, and using map and juxt would save 4:

(index-and-map-by :id :name coll)
(into {} (map (juxt :id :name)) coll)
(->> coll (index-by :id) (map-vals :name))

To my eyes the last one is the clearest in terms of intent, though likely also the least performant. I'll need to think about this one for a while.

borkdude commented 4 years ago

Maybe and in the name indications some kind of permutation problem. Maybe I'd also like to have index-and-keep-by, etc. I don't think medley can possibly host every possible combination of functions?

borkdude commented 4 years ago
(defn index-and-hof-by
  [hof kf vf coll]
  (into {} 
    (f 
      (fn [x]
        [(kf x) (vf x)]))
    coll))
vvvvalvalval commented 4 years ago

@borkdude arguable, at this point you're just looking for transduce :)

I'm really only asking for a narrow but common use case.

vvvvalvalval commented 4 years ago

@weavejester I would argue that maybe characters count is not the best proxy for clarity in this case?

Even if (->> coll (index-by :id) (map-vals :name)) is only 9 characters longer, it means that the reader has to work out the overarching indexing intention by interpreting the two function calls in her head.

If we agreed that this pattern is frequent enough that the reader finds it worth memorizing by the index-and-map-by name, then (index-and-map-by :id :name coll) would be the most readable presentation, right?

weavejester commented 4 years ago

I personally don't find enough use for the function to be worth memorizing, so I'd find index-by and map-vals to be easier to understand. The use of the separate functions also makes it clear which key is being used for indexing, and which key is being used to map over the values.

vvvvalvalval commented 4 years ago

@weavejester your call, there's no arguing against personal judgement. I guess I'll keep writing this function on every project :) thanks for considering it!

weavejester commented 4 years ago

Give me some time to think about it. There may also be performance improvements, or another way to write it. I'll leave this issue open.

vvvvalvalval commented 4 years ago

If that helps, the way I would write it for performance is:

(defn index-and-map-by
  [kf vf coll]
  (persistent!
    (reduce
      (fn [tm x]
        (let [k (kf x)
              v (vf x)]
          (assoc! tm k v)))
      (transient {})
      coll)))