maidh91 / guava-libraries

Automatically exported from code.google.com/p/guava-libraries
Apache License 2.0
0 stars 0 forks source link

Iterables.limit/skip(Iterable<T>, Predicate<? super T>) #477

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Please add
* Iterables.limit(Iterable<T>, Predicate<? super T>)
* Iterables.skip(Iterable<T>, Predicate<? super T>)

(possible other names: limitFrom and skipUntil)

Use case is linked to values generated on the fly and the hasNext() method of 
the base iterator will never return false:
* limit: hasNext() returns false once at least one element doesn't apply for a 
predicate
* skip: elements are discarded until at least one satisfy the predicate.

I can't use the filter method because I use open-ended iterable and therefore 
the filter method might not end.

Basic example:

static Predicate<Integer> lessThan(int x); // Obvious predicate

static Iterable<Integer> primes(); // All prime numbers, open-ended so 
hasNext() will never return false. (for the sake of the test, let's say there 
is no bound for int values.)

for (int i: limit(primes(), lessThan(3000))) {
  assert i < 3000;
  // Do anything.
}

for (int i: skip(primes(), lessThan(3000))) {
  assert i >= 3000;
  // Do anything.
}

Original issue reported on code.google.com by ogregoire on 12 Nov 2010 at 2:26

GoogleCodeExporter commented 9 years ago
Understood.  May need more motivation.  More use cases out there?

Original comment by kevinb@google.com on 12 Jan 2011 at 10:58

GoogleCodeExporter commented 9 years ago
In LINQ (and also Haskell, which I assume is where the names came from) these 
methods are called Enumerable.TakeWhile and Enumerable.DropWhile (the latter 
has the predicate inverted)

Original comment by fin...@gmail.com on 12 Jan 2011 at 11:53

GoogleCodeExporter commented 9 years ago
Having `takeWhile` and `dropWhile` would be great.

However, note that both in Haskell¹ and Python² the first argument is the 
predicate while the iterable comes second. This is contrary to almost all 
methods of `Iterables`. I guess following Guava's style is preferred here, 
though, to keep things consistent here.

¹) http://learnyouahaskell.com/higher-order-functions#maps-and-filters (search 
for "takeWhile")
²) http://docs.python.org/library/itertools.html#itertools.takewhile

P.S.: The resource at ¹) contains the aforementioned motivation, too:
"We use takeWhile here instead of filter because filter doesn't work on 
infinite lists."

Original comment by j...@nwsnet.de on 13 Jan 2011 at 9:17

GoogleCodeExporter commented 9 years ago
Actually the documentation there is factually incorrect, what it was supposed 
to say was "We use takeWhile here instead of filter because we want it to stop 
searching when it finds a non satisfying element". Both filter and takeWhile 
will fail to produce all elements of an infinite list (aka stream) but 
takeWhile will break early if it finds an element that doesn't match anymore. 
For example, both have the same behavior here:

takeWhile (< 2) (repeat 1) === filter (< 2) (repeat 1)

Also both work on infinite lists, you just can't fold it entirely:

take 100 (takeWhile (< 2) (repeat 1)) === take 100 (filter (< 2) (repeat 1))

The usefulness of take/dropWhile is when we use want to combine basic 
generators given some restrictions:

takeWhile(filter(isSquare(), aSequence), lessThan(1000))

It'll give us all square numbers less than 1000 from aSequence.

Original comment by daniel.y...@gmail.com on 13 Jan 2011 at 10:05

GoogleCodeExporter commented 9 years ago
Thanks for the clarification.

This makes me wonder if there is a way (that makes sense) to protect using e.g. 
`filter` or `find` with infinite iterables in Java. I'm still looking forward 
for the `Range` class in Guava, and I assume that it can be open-ended. Is that 
a common problem to tackle or is leaving getting things right to the developer 
sufficient?

Original comment by j...@nwsnet.de on 14 Jan 2011 at 9:32

GoogleCodeExporter commented 9 years ago
Here's another use case for your motivation:
From an iterable of items sorted(!) by their date attribute, I'm looking for 
those before/after a certain date. Hence, e.g.

{{{
final long someTimestamp = 4815162342L;
Iterables.takeWhile(items, new Predicate<Item>() {
  public boolean apply(Item item) {
    return item.getCreatedAt().getTime() == someTimestamp;
  }
}
}}}

Original comment by j...@nwsnet.de on 11 May 2011 at 7:25

GoogleCodeExporter commented 9 years ago
(Sorry, but I don't get that syntax thing here. Might only work for actual wiki 
pages. A link to the docs as well as an on-the-fly preview [not that uncommon 
these days] might help here.)

Original comment by j...@nwsnet.de on 11 May 2011 at 7:28

GoogleCodeExporter commented 9 years ago
Argh, of course it would be `item.getCreatedAt().getTime() > someTimestamp;` or 
similar.

Original comment by j...@nwsnet.de on 11 May 2011 at 7:29

GoogleCodeExporter commented 9 years ago
Issue 619 has been merged into this issue.

Original comment by cgdec...@gmail.com on 11 May 2011 at 8:18

GoogleCodeExporter commented 9 years ago
kevinb: Is there a chance `limitWhile` and `skipWhile` (or whatever suits best) 
will appear in r10? Do you need more use cases for these methods I'd consider 
common?

Original comment by j...@nwsnet.de on 6 Jun 2011 at 2:32

GoogleCodeExporter commented 9 years ago

Original comment by kevinb@google.com on 13 Jul 2011 at 6:18

GoogleCodeExporter commented 9 years ago

Original comment by cpov...@google.com on 13 Jul 2011 at 8:33

GoogleCodeExporter commented 9 years ago

Original comment by fry@google.com on 10 Dec 2011 at 3:54

GoogleCodeExporter commented 9 years ago

Original comment by fry@google.com on 16 Feb 2012 at 7:17

GoogleCodeExporter commented 9 years ago

Original comment by kevinb@google.com on 30 May 2012 at 7:43

GoogleCodeExporter commented 9 years ago

Original comment by kevinb@google.com on 22 Jun 2012 at 6:16

GoogleCodeExporter commented 9 years ago
Issue 1048 has been merged into this issue.

Original comment by wasserman.louis on 2 Jul 2012 at 8:49

GoogleCodeExporter commented 9 years ago
Any news on this?

I can't provide a *new* use case, just the one again I've already shown above.

Original comment by j...@nwsnet.de on 30 Jul 2012 at 7:52

GoogleCodeExporter commented 9 years ago
The given examples here seem to fall under the umbrella of "ordered data," for 
which we can optimize Iterator operations.  ("Optimize" might be an 
understatement in the case of infinite inputs.... :))  If that's typical, then 
perhaps Comparator makes more sense than Predicate:

Iterables.head(Iterable, Comparator<T>, T)
Iterables.tail(Iterable, Comparator<T>, T)

Internally, we once had an OrderedIterator type that exposed operations like 
these (though not these two in particular, IIRC).  It never really caught on, 
though it was part of the motivation for Iterables.mergeSorted.

I see a handful of results for 'hasNext[(][)] && .*compareTo' on a regex search 
over the Google codebase, plus others that look promising for 'hasNext[(][)] && 
.*[.].*<' and 'hasNext[(][)] && .*peek.*<'.

Am I right to focus on the Comparator case?  (Maybe we'll need to revisit the 
idea of OrderedIterator....)

Original comment by cpov...@google.com on 2 Aug 2012 at 9:08

GoogleCodeExporter commented 9 years ago
Ugh, that's going to have a problem.  You'll have to write:

head(records, timeComparator(), newDummyRecordWithTime(time))

But you want to write:

head(records, timeOfRecord(), time)

In full generality, that's:

head(records, timeOfRecord(), Ordering.natural(), time)

But perhaps no one will require a non-natural ordering... until someone needs 
Ordering.natural().reverse()....

A plain Predicate doesn't have this problem -- though it's also less likely 
that someone will have an appropriate Predicate sitting around than an 
appropriate Comparator or even Function.

Original comment by cpov...@google.com on 2 Aug 2012 at 9:12

GoogleCodeExporter commented 9 years ago
The use case of comment #19 can *almost* be handled using the Ordering class, 
except that the #min and #max methods that take an Iterable do not accept a 
default value. I.e., it seems to me one would want to write the following:

Ordering.from(Comparable<T>).min(Iterable<T>, T);
Ordering.from(Comparable<T>).max(Iterable<T>, T);

(But I appreciate this comment is slightly off topic.)

Original comment by stephan...@gmail.com on 2 Aug 2012 at 9:44

GoogleCodeExporter commented 9 years ago
I found myself wanting this today, where it had nothing to do with ordering.  I 
have a case where a Version is backed by a List<Integer> (highest significance 
first).  I need to consider [1, 2, 0] as the same as [1, 2].  This is easy 
enough for comparisons, but I also want those two Versions to be equal(), which 
is still easy as it's a comparison.  But then I need to implement hashCode and 
it gets difficult.  At that point, I need to be able to normalize.

So I want to be able to say "take until the remaining elements are all 0".  In 
Scala, that would be idiomatically

segments.reverse.dropWhile(_ == 0).reverse

Obviously if this were a Stream-like structure, or a cons list, this wouldn't 
be efficient.  But with LinkedList and ArrayList this is actually really 
efficient as reverse can be implemented as a live view.

If limit/skip or takeWhile/dropWhile were added to Iterables, this could be 
done in a similar way.  I think the current closest alternative would be 
something like "reverse, get the first index of the inverse predicate, 
translate it into a last index of the unreversed list, and then do a subList on 
the unreversed list."

Original comment by markpete...@gmail.com on 9 May 2013 at 8:50

GoogleCodeExporter commented 9 years ago
I come across uses for `limit` and `skip` every once in a while (and ordering 
doesn't help there).

Original comment by j...@nwsnet.de on 26 Jul 2013 at 4:39

GoogleCodeExporter commented 9 years ago
> But then I need to implement hashCode and it gets difficult.  At that point, 
I need to be able to normalize.

You don't.

int hash = 0; for (int i : reverse(list)) hash = 43 * hash + i;

> I think the current closest alternative...

IMHO currently anything but a simple loop makes no sense.

while (!list.isEmpty() && list.get(list.size() - 1 == 0) 
list.remove(list.size() - 1);

Original comment by Maaarti...@gmail.com on 27 Jul 2013 at 6:39

GoogleCodeExporter commented 9 years ago
This issue has been migrated to GitHub.

It can be found at https://github.com/google/guava/issues/<id>

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:15

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:18

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 3 Nov 2014 at 9:09