mkodekar / guava-libraries

Automatically exported from code.google.com/p/guava-libraries
Apache License 2.0
0 stars 0 forks source link

Make AbstractMultiset + Multisets.AbstractEntry public #157

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
If you want to implement your own multiset, doing so requires a fair bit of
code. In some cases, a ForwardingMultiset isn't appropriate for the task
(in my use case I was creating a multiset class that represented a cross
product of other multisets).

These classes seem well thought out wrt being implemented by other classes,
and therefore good candidates for being public.

Original issue reported on code.google.com by ben.maurer on 2 May 2009 at 8:54

GoogleCodeExporter commented 9 years ago
What multiset implementation would you want, beyond those that are already 
provided?

We'd need some use cases to justify making any additional classes public.

Original comment by jared.l....@gmail.com on 2 May 2009 at 9:16

GoogleCodeExporter commented 9 years ago
The use case I implemented was a multiset that is the product of other 
multisets. Eg,
given:

[a x 1, b x 2]
[c x 3, d x 4]

return [ac x 3] [ad x 4] [bc x 6] [bd x 8]

Other use cases include implementing the sum of multisets lazily, or using a 
counting
bloom filter to implement the multiset.

Original comment by ben.maurer on 2 May 2009 at 9:59

GoogleCodeExporter commented 9 years ago
Making our skeletal implementation classes like AbstractMultiset public is 
something
I've always figured we may eventually get around to, but it is a lot of work, 
and the
kind of work that isn't fun.  So every additional strongly compelling use case 
helps.

Original comment by kevin...@gmail.com on 5 May 2009 at 3:59

GoogleCodeExporter commented 9 years ago

Original comment by kevin...@gmail.com on 19 Aug 2009 at 1:32

GoogleCodeExporter commented 9 years ago

Original comment by kevin...@gmail.com on 17 Sep 2009 at 5:45

GoogleCodeExporter commented 9 years ago

Original comment by kevin...@gmail.com on 17 Sep 2009 at 5:57

GoogleCodeExporter commented 9 years ago

Original comment by kevinb@google.com on 30 Jul 2010 at 3:53

GoogleCodeExporter commented 9 years ago
Guava 7 augments ForwardingMultiset with a number of "sensible default" 
implementations of methods.  Would this suffice for your needs?

Original comment by wasserman.louis on 14 Feb 2011 at 6:59

GoogleCodeExporter commented 9 years ago

Original comment by kevinb@google.com on 13 Jul 2011 at 6:18

GoogleCodeExporter commented 9 years ago

Original comment by fry@google.com on 10 Dec 2011 at 3:37

GoogleCodeExporter commented 9 years ago

Original comment by fry@google.com on 16 Feb 2012 at 7:17

GoogleCodeExporter commented 9 years ago

Original comment by kevinb@google.com on 30 May 2012 at 7:43

GoogleCodeExporter commented 9 years ago

Original comment by kevinb@google.com on 22 Jun 2012 at 6:16

GoogleCodeExporter commented 9 years ago
I tried to extend AbstractMultiset and failed, too.  Here's my use case:

I want a multiset where it's possible to delete multiple, different entries 
quickly.  For example,

class Words extends AbstractMultiset<String> {
  boolean removeByLetter(char c1) {
    // remove all Strings that contain the letter c1
  }
}

Iterating over the Strings of the multiset, even using the entryIterator(), 
isn't as fast as storing a separate hashmap that maps from letter to 
words_containing_that_letter.

Without being able to extend AbstractMultiset, I have to resign myself to:

1) Rewrite AbstractMultiset: extend AbstractCollection, implement Multiset, and 
write a lot of boilerplate code.  This sucks, I should just be using 
AbstractMultiset.

2) Don't extend nor implement, just encapsulate a Multiset.  But now, I'll need 
to provide a getMultiset() function so that I can still use the Multiset 
features, eg myWords.getMultiset().entrySet().  Ugly.  But that function will 
have to create a deep-copy of my data because if someone modifies the data 
without updating the words_containing_that_letter, my data will break, so 
really it should be myWords.getImmutableMultiset().  The copy will have to be 
deep because the inner members, if they were objects instead of Strings, could 
be modified, and then the "Immutable" would be a half-lie.

This class should be public.  AbstractSet, AbstractCollection, etc, are public.

Original comment by eyals...@google.com on 7 Nov 2013 at 10:49

GoogleCodeExporter commented 9 years ago
Can you explain why ForwardingMultiset doesn't work for you?

Original comment by kevinb@google.com on 7 Nov 2013 at 2:56

GoogleCodeExporter commented 9 years ago
I'm not familiar with ForwardingMultiset...  It seems that it might work

From what I read, it just calls all the methods of the delegate.  I don't see 
how this is better: it's just encapsulation instead of inheritance.  What was 
the purpose?  ForwardingMultiset appears unused...

Original comment by eyals...@google.com on 7 Nov 2013 at 3:22

GoogleCodeExporter commented 9 years ago
https://code.google.com/p/guava-libraries/wiki/CollectionHelpersExplained#Forwar
ding_Decorators

Original comment by SeanPFl...@googlemail.com on 7 Nov 2013 at 3:42

GoogleCodeExporter commented 9 years ago
I read that URL.  Thanks.  It doesn't explain why Forwarding Decorators are 
better than using, for example, AbstractMultiset.  And now I have to override 
add(Object, int) *AND* add(Object), which is just more boilerplate for me!

What is the advantage here?

Original comment by eyals...@google.com on 7 Nov 2013 at 3:56

GoogleCodeExporter commented 9 years ago
The place to start is probably Effective Java's "Item 16: Favor composition 
over inheritance."

Original comment by cpov...@google.com on 7 Nov 2013 at 4:36

GoogleCodeExporter commented 9 years ago
If you extended AbstractMultiset, you'd have to e.g. reimplement entrySet() for 
yourself.  In both cases you have to override something, but I would tend to 
say that on balance, ForwardingMultiset requires less work for this case in any 
event.

Original comment by lowas...@google.com on 7 Nov 2013 at 5:22

GoogleCodeExporter commented 9 years ago
Also note that you can compose with any backing Multiset, so changing to one 
with sorted keys, for example, is trivial.

Our AbstractMultiset class is an *implementation*, which we wish to reserve the 
right to change at will, as doing so benefits all its consumers.  But making it 
public would force us to commit to many aspects of that implementation.  The 
kinds of implementation changes we might need to make would risk breaking users.

I'm actually going to close this issue now. My plea that "every additional 
strongly compelling use case helps" went unanswered for *four and half years* 
(a period of steady growth in Guava's popularity). If there were a reason 
1/10th as strong as we'd need, we would know it by now.

Original comment by kevinb@google.com on 7 Nov 2013 at 5:29

GoogleCodeExporter commented 9 years ago
The word "Favor" implies that I'd have a choice but here my hand is forced.  

On the one hand, I should favor composition so AbstractMultiset is private.  On 
the other hand, it's extended by many classes within the library to make all 
the different types of Multisets, so clearly inheriting from it isn't all 
*THAT* wrong.

Guava has deviated here from the rest of the AbstractWhatevers that are public 
in java.util.*

So I'm stuck with inheriting from ForwardingMultiset and obeying the 
if-you-override-a-you-must-override-b for all the add()/remove(), etc.  Such a 
waste.  :-(

Original comment by eyals...@google.com on 7 Nov 2013 at 6:36

GoogleCodeExporter commented 9 years ago
@eyalsoha: Did you see Louis's comment above about how you'd have to implement 
entrySet() for yourself if you were to extend AbstractMultiset? That's much 
more complex than having to override a few methods I'd say.

> Guava has deviated here from the rest of the AbstractWhatevers that are 
public in java.util.*

I think we're perfectly fine with that... the design of Guava's collection 
libraries was directly informed by experience from the JDK libraries, including 
things that were probably mistakes.

Original comment by cgdecker@google.com on 7 Nov 2013 at 6:47

GoogleCodeExporter commented 9 years ago
I hadn't seen Louis' comment before my post.  But I'd still rather override 
entrySet(), and if I forget, get an exception, than use ForwardingMultiset, 
override add(Object), and if I forget to also override add(Object,int), 
addAll()..., sliently have my data lose consistency.

AbstractIterator was a nice function for me: implement the Iterator interface 
but with less code.  I thought that AbstractMultiset would do that, too.  
Instead there's ForwardingMultiset, which isn't saving me lines of code and 
risky if I miss an override.  I'm safer just implementing the Multiset 
interface and skipping ForwardingMultiset.  I'm having a hard time imagining a 
case where using ForwardingMultiset *would* be advantageous.

Original comment by eyals...@google.com on 7 Nov 2013 at 7:51

GoogleCodeExporter commented 9 years ago
Even with AbstractMultiset, you still have to override both add(Object) and 
add(Object, int), since you don't know which one of the two calls the other in 
the implementation. Or rather, you might be able to check which one of the two 
calls the other currently, but this could switch at any time. What you're 
asking is more than just for us to insert the word "public" to the class; what 
you're asking is that we commit to the implementation details of our Multiset 
implementations permanently. That's something that we could do if there were 
enough need, but it's a big project.

Original comment by cpov...@google.com on 7 Nov 2013 at 7:55

GoogleCodeExporter commented 9 years ago
Without that commitment, the code is useless.  My only alternative is 
ForwardingMultiset, which is too dangerous because it exposes the underlying 
Multiset without maintaining my invariants.

I don't know which is the lesser of two evils:

1) Override all the methods myself (essentially re-implementing 
AbstractMultiset).
2) Copy AbstractMultiset into my package and use it.

In my situation, what would you suggest?

Original comment by eyals...@google.com on 11 Nov 2013 at 4:02

GoogleCodeExporter commented 9 years ago
The delegate() method that exposes the underlying Multiset is |protected|, and 
your ForwardingMultiset subclass can create the Multiset itself, so it's safe 
from external users.

Original comment by cpov...@google.com on 11 Nov 2013 at 4:46

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
Imagine my previous example: I have a Multiset<String>, but I want to add to it 
public functions to let me get a Multiset<String> based on letter.  I want it 
to work quickly so I maintain a Map<char, Set<String>> that maps from letters 
A-Z to the elements in the Multiset and provide an iterator 
iteratorByLetter(char c) that returns letterMap.get(c).  I've potentially 
doubled the size of my data in order to gain speed for my added functions.

To keep the map valid, I need to intercept add and for each add to the 
Multiset, add to the map.  Same for remove.  So I override 
ForwardingMultiset#add(E,int) and ForwardingMultiset#remove(Object,int).  These 
are the only two public functions in ForwardingMultiset, so I think that I'm 
covered.

However, if I forget to also override ForwardingMultiset#add(E), then I get in 
trouble.  In that case, my underlying Multiset will be updated by the call to 
delegate().add() but my map won't be updated!  My class will corrupt itself 
silently.  The only way to ensure that I override ForwardingMultiset correctly 
is to follow the chain of all the super-classes and make sure that I don't miss 
a function.  That is a pain.  Same goes for remove() and whatever else might be 
in there.  The documentation for ForwardingMultiset explains as much.  The 
documentation says that if you override add(E), you must override add(E,int).  
So you write the latter and the former is an exact copy of add(E) from 
AbstractMultiset.  In fact, to use ForwardingMultiset, I basically have to 
copy-paste all the code from AbstractMultiset except for the abstract 
functions, which I'll implement.

Original comment by eyals...@google.com on 11 Nov 2013 at 5:46

GoogleCodeExporter commented 9 years ago
>In fact, to use ForwardingMultiset, I basically have to copy-paste all the 
code from AbstractMultiset except for the abstract functions, which I'll 
implement.

That's why the standardXYZ methods are there: to provide the implementations 
you'd get from AbstractMultiset.

Original comment by lowas...@google.com on 11 Nov 2013 at 5:56

GoogleCodeExporter commented 9 years ago
How can I use standardAdd() to help me here?  I want to write a single function 
that will extend all the different kinds of add().  Also, I want to write a 
single function to override all the different kinds of remove().

What function am I supposed to override?  The documentation tells me that 
overriding just one of the add() functions won't work.  I should override 
standardAdd()?

Original comment by eyals...@google.com on 11 Nov 2013 at 7:02

GoogleCodeExporter commented 9 years ago
@eyals...: See an example class I attached. It's just a ForwardingMultiset 
demo, not a bullet proof, ready to use solution to your problem (you should 
test it well by yourself, maybe using Guava testlib?). If I missed something 
obvious, please forgive me - it's quite late (in my timezone).

> How can I use standardAdd() to help me here? 

standardXyz methods are protected members of each ForwardingAbc class in Guava 
which *can* be used in your class (it's optional because sometimes you *don't* 
want add and addAll to behave in the same manner). These methods are well 
documented, just read them all before (or while) extending ForwardingMultiset - 
typically it's something like:

   "A sensible definition of FOO in terms of BAR. If you override BAR, you may wish to override FOO to forward to this implementation."

In your case just write add(String, int) implementation and use 
standardAdd(String) and standardAddAll(Collection), plus override 
remove(Object, int) and use standardRemove(Object). There are also some 
pitfalls with setCount, clear and entry/elementSet (I've made these return 
unmodifiable Sets), but you'd have same issues extending AbstractMultiset.

Which brings me to this issue's purpose. Everything what can be done with 
AbstractMultiset can also be done with ForwardingMultiset and both would 
require care. On the other hand JDK collections don't have forwarding decorator 
at all, just skeleton abstract implementations. Joshua Bloch, who wrote 
Effective Java referenced earlier in the comments, is also creator of JDK's 
collections framework and AFAIK also Googler, so I'm curious what's his point 
of view on this particular issue (and BTW, why Multiset aka Bag wasn't included 
in JDK). 

Anyway, if you have more questions about your particular case (not about making 
AbstractMultiset public), ask on Guava discussion group or on Stackoverflow.

Original comment by xaerx...@gmail.com on 11 Nov 2013 at 9:46

Attachments:

GoogleCodeExporter commented 9 years ago
Thanks for your hard work!  That's a nice class and would certainly work.  But 
it also serves to demonstrate the problems with ForwardingMultiset.

In your class, you've defined add(String), add(String,int), and 
addAll(Collection<>), the latter two in terms of standardAdd*, the former, not. 
 It would be very easy for me, as an extender of the class, to forget one of 
those or to use standardAdd in the wrong place.  And if I make a mistake, the 
default behavior would not update letterMap; it would instead silently make my 
data inconsistent.  I'd rather throw an exception than silently fail later.

A nicer version of ForwardingMultiset or AbstractMultiset would have 
add(String,int) be abstract so that the user *must* override it and then 
provide defaults of add(String), addAll(Collection<String>), etc, that call 
add(String,int).  The user would then have to redefine just one function.  If 
he forgot, he would get a compile-time error.  (Excellent!)  And if he wanted 
to customize the others for speed, he could.

People talk about preferring encapsulation over inheritance but it's a 
preference, not a strict rule.  This is a perfect example when inheritance is 
the right thing to do: it obeys Liskov's Substitution Principle to a tee.  The 
"encapsulate don't inherit" rule was invented as a backlash to the misuse of 
inheritance, but now the pendulum has swung back the other way and we've got 
dozens of Forwarding____ classes that don't save programmers any boilerplate 
code.

For my uses, I gave up on implementing Multiset<>.  "implements Multiset<>" 
would require writing too much code and "extends ForwardingMultiset<>" doesn't 
save me enough typing for the danger that is engenders.

Original comment by eyals...@google.com on 12 Nov 2013 at 9:44

GoogleCodeExporter commented 9 years ago
For what it's worth, I share your concerns with the forwarding collections, as 
do others on the team. We have a long-standing internal issue filed to 
investigate whether there is a better way.[*] For your case, that better way 
probably is an improved AbstractMultiset, but given the soft demand for it over 
the years, it's unlikely to be a priority for us soon. Sorry.

[*] Come to think of it, I should move that to the external tracker. Done: 
http://code.google.com/p/guava-libraries/issues/detail?id=1575

Original comment by cpov...@google.com on 12 Nov 2013 at 4:59

GoogleCodeExporter commented 9 years ago
This issue has been migrated to GitHub.

It can be found at https://github.com/google/guava/issues/<id>

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:16

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 3 Nov 2014 at 9:10