DaveAKing / guava-libraries

Automatically exported from code.google.com/p/guava-libraries
Apache License 2.0
0 stars 0 forks source link

Iterables.removeIf wrong items removed when ArrayList is used #1543

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Hi,

I have found strange behaviour of Iterables.removeIf method. When I remove 
items from iterable by frequency and having them nearby, every found duplicate 
item is removed even the last one. At a bottom you can find test snippets. 
There is difference between use of ArrayList and LinkedList.

Note: I know that I should use Set for this kind of situations, but imagine 
that you do not have String list but list full of legacy uneditable s*it 
classes. Problem is inside Iterables.removeIfFromRandomAccessList method.

<code language="java">
    @Test
    public void testOk() throws Exception {

        final List<String> list = Lists.newArrayList("a", "b", "a");

        System.out.println(list); // [a, b, a]
        Iterables.removeIf(list, new Predicate<String>() {
            @Override
            public boolean apply(String input) {
                return Iterables.frequency(list, input) >= 2;
            }
        });
        System.out.println(list); // [b, a]

    }

    @Test
    public void testWrong() throws Exception {

        final List<String> list = Lists.newArrayList("a", "a", "a");

        System.out.println(list); // [a, a, a]
        Iterables.removeIf(list, new Predicate<String>() {
            @Override
            public boolean apply(String input) {
                return Iterables.frequency(list, input) >= 2;
            }
        });
        System.out.println(list); // [] should be [a]

    }

    @Test
    public void testOkWithLinkedList() throws Exception {

        final List<String> list = Lists.newLinkedList();
        list.add("a");
        list.add("a");
        list.add("a");

        System.out.println(list); // [a, a, a]
        Iterables.removeIf(list, new Predicate<String>() {
            @Override
            public boolean apply(String input) {
                return Iterables.frequency(list, input) >= 2;
            }
        });
        System.out.println(list); // [a]

    }
</code>

Original issue reported on code.google.com by bedla.cz...@gmail.com on 25 Sep 2013 at 11:58

GoogleCodeExporter commented 9 years ago
Hmm. I wouldn't necessarily say that the behavior of removeIf with an ArrayList 
is wrong in this case. The list has 3 "a" in it. 3 is >= 2, so all "a" should 
be removed. That said, it is a bit weird that we have different behavior 
depending on if the list is random access or not. Though I also think that 
using a Predicate whose behavior will change during the operation is generally 
not a good idea.

Original comment by cgdecker@google.com on 25 Sep 2013 at 1:12

GoogleCodeExporter commented 9 years ago
I agree, but my intention was to work with iterables and inner collection could 
change during iteration. I think that Iterator part of algorithm works as 
intuitively expected, but as you mentioned RandomAccess is problem.

Original comment by bedla.cz...@gmail.com on 25 Sep 2013 at 1:24

GoogleCodeExporter commented 9 years ago
I'd say both outcomes can be justified and I'd for sure avoid such a behavior 
changing Predicate as I could imagine different (stupid) implementations 
throwing ConcurrentModificationException or IndexOutOfBoundException.

Moreover, doing the above is O(n*n), while O(n) is doable. For the "wrong" 
behavior, it's pretty trivial, just copy the Iterable into a Multiset used by 
the Predicate. For the right behavior, you need to remove a given number of 
occurrences... no idea if there's an elegant way.

Original comment by Maaarti...@gmail.com on 25 Sep 2013 at 5:23

GoogleCodeExporter commented 9 years ago
I do not dispute stupidity (wrong complexity) of test case I have provided. I 
am pointing out differences when RandomAccess is used. :-)

Original comment by bedla.cz...@gmail.com on 25 Sep 2013 at 5:33

GoogleCodeExporter commented 9 years ago
With the concurrence of Kevin and Christian, I'm categorizing this as a docs 
issue: the behavior of these methods in the presence of "badly behaved" 
Predicates *should be* left undefined, but we could make that fact more obvious.

Original comment by lowas...@google.com on 25 Sep 2013 at 8:48

GoogleCodeExporter commented 9 years ago
Ok, fair enough. But could you add RandomAccess part of algorithm as optional? 
It would be nice to choose to use "wrong" behaviour when expected (I like it 
now, but I dont know if I will like it tomorrow).

Note: this behaviour is OK in functional immutable world, but when mutable Java 
world takes place, it is little bit confusing (we have to study more :) ). It 
looks like that they are consitent with your implementation in Java8. I mean 
ArrayList.removeIf vs. Collection.removeIf impls.

Original comment by bedla.cz...@gmail.com on 26 Sep 2013 at 7:58

GoogleCodeExporter commented 9 years ago
As a workaround, you could wrap your input in a ForwardingList that doesn't 
implement RandomAccess.

Original comment by cpov...@google.com on 26 Sep 2013 at 2:07

GoogleCodeExporter commented 9 years ago
This issue has been migrated to GitHub.

It can be found at https://github.com/google/guava/issues/<issue id>

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:12

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:17

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 3 Nov 2014 at 9:08