jOOQ / jOOL

jOOλ - The Missing Parts in Java 8 jOOλ improves the JDK libraries in areas where the Expert Group's focus was elsewhere. It adds tuple support, function support, and a lot of additional functionality around sequential Streams. The JDK 8's main efforts (default methods, lambdas, and the Stream API) were focused around maintaining backwards compatibility and implementing a functional API for parallelism.
http://www.jooq.org/products
Apache License 2.0
2.09k stars 168 forks source link

Add Seq.skipLast(int n) and Seq.limitLast(int maxSize) #313

Closed tlinkowski closed 4 years ago

tlinkowski commented 7 years ago

I propose Seq.skipEnd(int n) and Seq.limitEnd(int maxSize) because I just came across the need to skip the last element from the Seq and couldn't do that easily (i.e. without resorting to Seq.reverse().skip(1).reverse() which is extremely inefficient).

They would correspond to skip(long n) and limit(long maxSize) but they would be int-based instead of long-based. I mentioned these methods in #241 but didn't create a separate issue then:

These methods could be implemented with a circular buffer of size maxSize/n (e.g. an ArrayDeque could be used for this purpose). I guess it'd get harder if we wanted to have longs instead of ints as the arguments.

PS. Alternative API approach: allow negative values in current Seq.skip and Seq.limit methods - a negative value would mean "count from the right".

lukaseder commented 7 years ago

Thanks for your suggestion. Hmm, I'm not so sure if I can see the use-case here. Would you mind describing it? Also, do you see an implementation that would be much better (i.e. better complexity) than the one you already proposed?

tlinkowski commented 7 years ago

Well, general use-case is analogous to normal limit and skip :) Of course, perhaps the most common use-case of normal limit and skip is "pagination" (like in SQL). But apart from that, they have normal usages like "I don't want first 5 elements" or "I want only first 5 elements". So why shouldn't you be able to say "I don't want last 5 elements" or "I want only last 5 elements"? BTW. maybe the name should be limitLast and skipLast then.

As to my specific use-case - I got a Seq, and needed to verify all elements except the last. Finally, however, the design changed and it turned out I needed to verify all elements except the first which is straightforward.

As for the implementation - as far as I understand no better implementation is possible (not even slightly). In order to make a decision (whether to begin outputting - in case of limitEnd, or whether to stop outputting - in case of skipEnd) you must know maxSize/n elements beforehand, and to know them means to buffer them.

Actually, I'm a bit surprised about your complexity concerns. As for the memory requirements, an operation like partition(n) or splitAt(n) requires caching entire Seq no matter the n. And skipEnd(n)/limitEnd(n) require caching up to n elements only. But maybe you're concerned about something else than memory - if so, please, specify because I cannot see any other problems here.

lukaseder commented 7 years ago

Well, general use-case is analogous to normal limit and skip

Well, of course the concept is analogous, but that doesn't mean there's an important enough use-case :)

Finally, however, the design changed and it turned out I needed to verify all elements except the first which is straightforward.

That's what I mean. Ultimately, last/first are interchangeable concepts. In an actual list (with random access), the idea of accessing the "last" elements isn't such a bad idea, but in a stream, it seems unreasonable to do all the work of skipping elements...

But OK, perhaps we can leave this decision to the user. Or, we start fiddling with the implementation to figure out if the backing store is a random access store and implement these methods more efficiently in that case. (I think the JDK Stream does things like that)

maybe the name should be limitLast and skipLast then.

Yes, because we already have findLast(), for instance. I'll rename the issue. Also, for consistency reasons, we'll need to alias limit with take and skip with drop.

By the way: I agree that the existing limit/take and skip/drop methods should allow negatives for convenience. I've always found this feature very useful when manipulating strings.

as far as I understand no better implementation is possible (not even slightly).

Yes, your solution of buffering maxSize elements is certainly better than reversing the entire stream twice. E.g. if we call limitLast(5), then that's a relatively cheap thing to do.

But maybe you're concerned about something else than memory

I simply hadn't thought about better algorithms at first. The workaround (double-reversing) is O(N), whereas the better solution is O(maxSize), which is also linear, but the best case/worst case ratio is much better.

OK, let's do this :)

tlinkowski commented 7 years ago

Well, if we could get to the backing store somehow and access it randomly, it'd be very nice. But I can't find anything like that in entire java.util.stream.

By "let's do this" I understand I can provide a PR so I'll summarize:

lukaseder commented 7 years ago

By "let's do this" I understand I can provide a PR

Yes, that would be great! :)

I mostly agree with 1/2.

For consistency reasons, I'd prefer to use long rather than int and throw the exception you suggested. Perhaps, in the future, there will be a solution to support long as well and I wouldn't want to change API for that.

Should I also add support for negative numbers in limit(long) and skip(long)?

I'd like that, indeed. But perhaps, let's do that in a separate issue. It will be easier to track the changes.

lukaseder commented 4 years ago

I'm not sure we need this in the API. Closing as won't fix.