Iterables.limit/skip(Iterable<T>, Predicate<? super T>)

gissuebot commented 9 years ago

Original issue created by ogregoire on 2010-11-12 at 02:26 PM

Please add

Iterables.limit(Iterable<T>, Predicate<? super T>)
Iterables.skip(Iterable<T>, Predicate<? super T>)

(possible other names: limitFrom and skipUntil)

Use case is linked to values generated on the fly and the hasNext() method of the base iterator will never return false:

limit: hasNext() returns false once at least one element doesn't apply for a predicate
skip: elements are discarded until at least one satisfy the predicate.

I can't use the filter method because I use open-ended iterable and therefore the filter method might not end.

Basic example:

static Predicate<Integer> lessThan(int x); // Obvious predicate

static Iterable<Integer> primes(); // All prime numbers, open-ended so hasNext() will never return false. (for the sake of the test, let's say there is no bound for int values.)

for (int i: limit(primes(), lessThan(3000))) { assert i < 3000; // Do anything. }

for (int i: skip(primes(), lessThan(3000))) { assert i >= 3000; // Do anything. }

gissuebot commented 9 years ago

Original comment posted by kevinb@google.com on 2011-01-12 at 10:58 PM

Understood. May need more motivation. More use cases out there?

Status: Accepted Labels: Type-Enhancement

gissuebot commented 9 years ago

Original comment posted by finnw1 on 2011-01-12 at 11:53 PM

In LINQ (and also Haskell, which I assume is where the names came from) these methods are called Enumerable.TakeWhile and Enumerable.DropWhile (the latter has the predicate inverted)

gissuebot commented 9 years ago

Original comment posted by j...@nwsnet.de on 2011-01-13 at 09:17 AM

Having takeWhile and dropWhile would be great.

However, note that both in Haskell¹ and Python² the first argument is the predicate while the iterable comes second. This is contrary to almost all methods of Iterables. I guess following Guava's style is preferred here, though, to keep things consistent here.

¹) http://learnyouahaskell.com/higher-order-functions#maps-and-filters (search for "takeWhile") ²) http://docs.python.org/library/itertools.html#itertools.takewhile

P.S.: The resource at ¹) contains the aforementioned motivation, too: "We use takeWhile here instead of filter because filter doesn't work on infinite lists."

gissuebot commented 9 years ago

Original comment posted by daniel.yokomizo on 2011-01-13 at 10:05 PM

Actually the documentation there is factually incorrect, what it was supposed to say was "We use takeWhile here instead of filter because we want it to stop searching when it finds a non satisfying element". Both filter and takeWhile will fail to produce all elements of an infinite list (aka stream) but takeWhile will break early if it finds an element that doesn't match anymore. For example, both have the same behavior here:

takeWhile (< 2) (repeat 1) === filter (< 2) (repeat 1)

Also both work on infinite lists, you just can't fold it entirely:

take 100 (takeWhile (< 2) (repeat 1)) === take 100 (filter (< 2) (repeat 1))

The usefulness of take/dropWhile is when we use want to combine basic generators given some restrictions:

takeWhile(filter(isSquare(), aSequence), lessThan(1000))

It'll give us all square numbers less than 1000 from aSequence.

gissuebot commented 9 years ago

Original comment posted by j...@nwsnet.de on 2011-01-14 at 09:32 AM

Thanks for the clarification.

This makes me wonder if there is a way (that makes sense) to protect using e.g. filter or find with infinite iterables in Java. I'm still looking forward for the Range class in Guava, and I assume that it can be open-ended. Is that a common problem to tackle or is leaving getting things right to the developer sufficient?

gissuebot commented 9 years ago

Original comment posted by j...@nwsnet.de on 2011-05-11 at 07:25 PM

Here's another use case for your motivation: From an iterable of items sorted(!) by their date attribute, I'm looking for those before/after a certain date. Hence, e.g.

final long someTimestamp = 4815162342L;
Iterables.takeWhile(items, new Predicate<Item>() {
  public boolean apply(Item item) {
    return item.getCreatedAt().getTime() == someTimestamp;
  }
}

gissuebot commented 9 years ago

Original comment posted by j...@nwsnet.de on 2011-05-11 at 07:28 PM

(Sorry, but I don't get that syntax thing here. Might only work for actual wiki pages. A link to the docs as well as an on-the-fly preview [not that uncommon these days] might help here.)

gissuebot commented 9 years ago

Original comment posted by j...@nwsnet.de on 2011-05-11 at 07:29 PM

Argh, of course it would be item.getCreatedAt().getTime() > someTimestamp; or similar.

gissuebot commented 9 years ago

Original comment posted by cgdecker on 2011-05-11 at 08:18 PM

Issue #619 has been merged into this issue.

gissuebot commented 9 years ago

Original comment posted by j...@nwsnet.de on 2011-06-06 at 02:32 PM

kevinb: Is there a chance limitWhile and skipWhile (or whatever suits best) will appear in r10? Do you need more use cases for these methods I'd consider common?

gissuebot commented 9 years ago

Original comment posted by kevinb@google.com on 2011-07-13 at 06:18 PM

(No comment entered for this change.)

Status: Triaged

gissuebot commented 9 years ago

Original comment posted by cpovirk@google.com on 2011-07-13 at 08:33 PM

(No comment entered for this change.)

gissuebot commented 9 years ago

Original comment posted by fry@google.com on 2011-12-10 at 03:54 PM

(No comment entered for this change.)

Labels: Package-Collect

gissuebot commented 9 years ago

Original comment posted by fry@google.com on 2012-02-16 at 07:17 PM

(No comment entered for this change.)

Status: Acknowledged

gissuebot commented 9 years ago

Original comment posted by kevinb@google.com on 2012-05-30 at 07:43 PM

(No comment entered for this change.)

Labels: -Type-Enhancement, Type-Addition

gissuebot commented 9 years ago

Original comment posted by kevinb@google.com on 2012-06-22 at 06:16 PM

(No comment entered for this change.)

Status: Research

gissuebot commented 9 years ago

Original comment posted by wasserman.louis on 2012-07-02 at 08:49 PM

Issue #1048 has been merged into this issue.

gissuebot commented 9 years ago

Original comment posted by j...@nwsnet.de on 2012-07-30 at 07:52 AM

Any news on this?

I can't provide a new use case, just the one again I've already shown above.

gissuebot commented 9 years ago

Original comment posted by cpovirk@google.com on 2012-08-02 at 09:08 PM

The given examples here seem to fall under the umbrella of "ordered data," for which we can optimize Iterator operations. ("Optimize" might be an understatement in the case of infinite inputs.... :)) If that's typical, then perhaps Comparator makes more sense than Predicate:

Iterables.head(Iterable, Comparator<T>, T) Iterables.tail(Iterable, Comparator<T>, T)

Internally, we once had an OrderedIterator type that exposed operations like these (though not these two in particular, IIRC). It never really caught on, though it was part of the motivation for Iterables.mergeSorted.

I see a handful of results for 'hasNext[(][)] && .compareTo' on a regex search over the Google codebase, plus others that look promising for 'hasNext[(][)] && .[.]._<' and 'hasNext[(][)] && ._peek.*<'.

Am I right to focus on the Comparator case? (Maybe we'll need to revisit the idea of OrderedIterator....)

gissuebot commented 9 years ago

Original comment posted by cpovirk@google.com on 2012-08-02 at 09:12 PM

Ugh, that's going to have a problem. You'll have to write:

head(records, timeComparator(), newDummyRecordWithTime(time))

But you want to write:

head(records, timeOfRecord(), time)

In full generality, that's:

head(records, timeOfRecord(), Ordering.natural(), time)

But perhaps no one will require a non-natural ordering... until someone needs Ordering.natural().reverse()....

A plain Predicate doesn't have this problem -- though it's also less likely that someone will have an appropriate Predicate sitting around than an appropriate Comparator or even Function.

gissuebot commented 9 years ago

Original comment posted by stephan202 on 2012-08-02 at 09:44 PM

The use case of comment #19 can almost be handled using the Ordering class, except that the #min and #max methods that take an Iterable do not accept a default value. I.e., it seems to me one would want to write the following:

Ordering.from(Comparable<T>).min(Iterable<T>, T); Ordering.from(Comparable<T>).max(Iterable<T>, T);

(But I appreciate this comment is slightly off topic.)

gissuebot commented 9 years ago

Original comment posted by markpeters.mail on 2013-05-09 at 08:50 PM

I found myself wanting this today, where it had nothing to do with ordering. I have a case where a Version is backed by a List<Integer> (highest significance first). I need to consider [1, 2, 0] as the same as [1, 2]. This is easy enough for comparisons, but I also want those two Versions to be equal(), which is still easy as it's a comparison. But then I need to implement hashCode and it gets difficult. At that point, I need to be able to normalize.

So I want to be able to say "take until the remaining elements are all 0". In Scala, that would be idiomatically

segments.reverse.dropWhile(_ == 0).reverse

Obviously if this were a Stream-like structure, or a cons list, this wouldn't be efficient. But with LinkedList and ArrayList this is actually really efficient as reverse can be implemented as a live view.

If limit/skip or takeWhile/dropWhile were added to Iterables, this could be done in a similar way. I think the current closest alternative would be something like "reverse, get the first index of the inverse predicate, translate it into a last index of the unreversed list, and then do a subList on the unreversed list."

gissuebot commented 9 years ago

Original comment posted by j...@nwsnet.de on 2013-07-26 at 04:39 PM

I come across uses for limit and skip every once in a while (and ordering doesn't help there).

gissuebot commented 9 years ago

Original comment posted by Maaartinus on 2013-07-27 at 06:39 PM

But then I need to implement hashCode and it gets difficult. At that point, I need to be able to normalize.

You don't.

int hash = 0; for (int i : reverse(list)) hash = 43 * hash + i;

I think the current closest alternative...

IMHO currently anything but a simple loop makes no sense.

while (!list.isEmpty() && list.get(list.size() - 1 == 0) list.remove(list.size() - 1);

guai commented 8 years ago

My common sense tells me, that skip should come with inject for symmetry.

injectBefore(anotherIterableToInject, wherePredicate);
injectAfter(anotherIterableToInject, wherePredicate);

jbduncan commented 7 years ago

I think these takeWhile and dropWhile methods should be added to com.google.common.collect.Streams as well.

Oracle are planning to add Stream.takeWhile and Stream.dropWhile methods in Java 9, so these methods would also act a sort of intermediate measure for those who won't be able to use Java 9 immediately upon or shortly after its release.

orionll commented 6 years ago

Any progress on this issue?

google / guava

Iterables.limit/skip(Iterable<T>, Predicate<? super T>) #477