Closed GoogleCodeExporter closed 9 years ago
Oops, forgot the interface:
public interface MapBuilderOp<V, Velem> {
/** Left-associative application of this operation */
public V applyLeftAssoc(V lVal, Velem rVal);
/** Create a new identity object */
public V newIdentityUnderOp();
}
Original comment by luke.hutch
on 28 Nov 2007 at 11:06
PS I should say that applyLeftAssoc is really an example of foldl, for those
with
Haskell experience :-) It might make sense to look extensively at Haskell's
prelude,
and cover foldr, the various zip* functions, 'map', etc. -- as inevitably the
need
will come up for these other routines, but they are currently being
hacked-around on
an as-needed basis by Java programmers -- just as Map<K, ? extends Collection>
was
hacked around prior to the Google Collections Library's existence.
Original comment by luke.hutch
on 28 Nov 2007 at 11:21
My first instinct is that if you had a Map whose get() method auto-creates a
"zero
value" (using your supplied creation function) whenever the value is missing,
you'd
have everything you need, don't you think? In your example, you wouldn't use
Integer
as your value type, but some custom class of yours that has the methods you
need like
setIfGreater(). Now, probably I'm wrong about what you're really trying to do,
but
please explain why - thanks.
It's interesting that Java's ThreadLocal bothered to support this
"autovivification"
but no collections do.
Original comment by kevin...@gmail.com
on 28 Nov 2007 at 11:51
No, the point is that you often need to perform an arbitrary operation on the
existing value in a map, if present, or do something else if not present.
MultiMaps
create new lists if no value present for the key, and store the new value in
the new
list, else append to the old list at that key.
The first code example I gave does something very similar, but with the max
operation
-- and yet MultiMaps can't compute the max of values with a given key, they are
too
specialized to work on lists.
The MapBuilder code I showed can do both MultiMaps and compute max, as well as
product, sum, avg, anything else for which you can define an operation, and an
identity for the operation. Therefore MapBuilder is an abstraction of
MultiMaps, and
is more powerful than MultiMaps (it can be used to simply implement MultiMaps
as well
as many other things).
In general this concept is known as foldl in the functional programming world,
and
once you realize that MultiMaps are in some sense a foldl after grouping k/v
pairs by
key, then it follows that there may be some other really powerful utility
classes
created, by looking at other functional programming paradigms, that extend
collections in ways as powerful as the leap from Map->MultiMap.
Just defining a get() that creates a zero-value rather than returning null if
there
is no value for the key is not a solution to the bigger picture.
Original comment by luke.hutch
on 29 Nov 2007 at 12:08
I'm not sure why I should prefer your
mapBuilder.put("a", 2)
over my
map.get("a").performSomeOperation(2)
To me, the second is far more simple and clear, requiring only something so
simple as
an "autovivifying map".
What am I missing?
Original comment by kevin...@gmail.com
on 29 Nov 2007 at 12:21
Also, I don't see Multimap as a special case of what you describe at all. A
Multimap
is just a collection; a bunch of key-value pairs, just like a Map but without
the
constraint that two entries can't have the same key. And it has some useful
view
collections, just as Map does.
What you're talking about is completely different; it only seems similar to a
Multimap if you're thinking in terms of how Multimaps are *implemented*.
Original comment by kevin...@gmail.com
on 29 Nov 2007 at 12:32
Your example works for mutable objects, so if map is an autovivifying map from
string
to list, you can do
map.get("a").add("new val")
However if these are immutable objects, then you need to do something like the
following, e.g. for a map from string to integer:
map.put("a", Math.max(map.get("a"), newVal))
What I am proposing is something like
map = MapBuilder<Integer>.createMaxMap(); // Where this is defined similarly
to map1
above
map.put("a", 10);
map.put("a", 1); // etc.
I think this notation is a lot clearer -- you just create a map whose put
operation
applies a binary op, and uses the identity if only one operand is available --
but
the reason why I suggest it is because it will cut down on an enormous amount of
boilerplate code that I have been writing every time I want to use a map with a
non-trivial valuespace, or where I want to do more than just .add() in the
traditional sense.
Original comment by luke.hutch
on 29 Nov 2007 at 12:33
...and the only win you get from using autovivifying maps for the example
map.put("a", Math.max(map.get("a"), newVal))
is avoiding one test:
val = map.get("a");
map.put("a", Math.max(val == null ? identity : val, newVal))
This is still helpful, and enough of a reason to have an autovivifying get(),
but the
real win is avoiding the sort of boilerplate I was talking about originally:
public static void mapPutMax(HashMap<String, Integer> map, String key, int
newVal) {
int max = newVal;
if (map.containsKey(key))
max = Math.max(map.get(key), max);
map.put(key, max);
}
Here's the ArrayList version for comparison -- this pattern is all over my code
and
other people's, and it's tedious to write all the time:
public static void mapPutMax(HashMap<String, ArrayList<Integer>> map, String key,
int newVal) {
ArrayList<Integer> list = map.get(key);
if (list == null)
list = new ArrayList<Integer>();
list.add(newVal);
map.put(key, list);
}
Avoiding this second code snippet is (if I understand right) one of the very
reasons
MultiMaps exist. However this follows the same sort of pattern as the first
snippet
-- suggesting that there is a useful generalization here. Basic group theory
suggests the answer lies in realizing that both operations have in common an
associative binary operation and an identity under that operation.
Original comment by luke.hutch
on 29 Nov 2007 at 12:40
Just saw comment #6. I understand your abstract definition of MultiMaps, and
that
they cover a bigger picture than this by providing view collections etc. This
is not
however so much about how MultiMaps are implemented as it is about how to get
values
into them in a convenient way... I'm suggesting an aspect of the interface, not
the
implementation. This is completely backwards-compatible with the current
interface,
and just provides a unification between maps and multimaps. It also makes
value/value maps as useful as multimaps for cases where you want to do more
than just
put/overwrite.
What I'm suggesting would probably fit better in the interface of Map itself --
in
fact you could define the standard Map.put(k, v) semantics (putting or
overwriting
depending on whether a value is already present) using this same MapBuilder
model, by
defining the identity as null, and the operation to ignore the old value, and
put the
new value.
Original comment by luke.hutch
on 29 Nov 2007 at 12:52
"Your example works for mutable objects... however if these are immutable
objects..."
Then why use the immutable objects, if by switching to mutable ones that you
craft
you can avoid all of this "MapBuilder" complexity? For example this is what I
do in
HashMultiset.
I'm sorry but I don't think that something like your MapBuilder belongs in this
library.
Original comment by kevin...@gmail.com
on 29 Nov 2007 at 1:10
Original comment by kevin...@gmail.com
on 29 Nov 2007 at 1:13
Original issue reported on code.google.com by
luke.hutch
on 28 Nov 2007 at 11:05