yangxu998 / guava-libraries

Automatically exported from code.google.com/p/guava-libraries
Apache License 2.0

New feature: Iterators.interleave() [or Iterators.zip()] #677

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
First reported in guava-discuss list: 
http://groups.google.com/group/guava-discuss/browse_frm/thread/da433e0b7db8c8ae

It should be possible to combine iterators so that, instead of exhausting each 
iterator before moving on to the next, one element is pulled from each iterator 
in turn until all are exhausted. The iterators should not have to be the same 
size. Sample code + output of how it might work:

Code:

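// Needs java.util.Arrays, java.util.Iterator and java.util.List.
// Iterators.interleave() is the proposed method; it does not exist yet.
List<String> a = Arrays.asList( "one", "two", "three", "four" );
List<String> b = Arrays.asList( "fee", "fi" );
List<String> c = Arrays.asList( "broccoli", "tomato", "potato" );
List<String> d = Arrays.asList( "purple" );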

Iterator<String> interleaved = Iterators.interleave( 
    a.iterator(), b.iterator(),
    c.iterator(), d.iterator() );
int count = 1;
while ( interleaved.hasNext() ) {
   System.out.println( count++ + ": " + interleaved.next() );
}

Output:

1: one
2: fee
3: broccoli
4: purple
5: two
6: fi
7: tomato
8: three
9: potato
10: four
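
For reference, a minimal sketch of the round-robin behaviour described above, written against plain java.util rather than Guava. The class name InterleaveSketch and the varargs signature are assumptions made for illustration; nothing in this issue fixes the final API.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical helper, not part of Guava. */
final class InterleaveSketch {
    private InterleaveSketch() {}

    /** Takes one element from each input in turn, skipping inputs that are exhausted. */
    @SafeVarargs
    static <T> Iterator<T> interleave(Iterator<? extends T>... inputs) {
        final Deque<Iterator<? extends T>> queue = new ArrayDeque<>(Arrays.asList(inputs));
        return new Iterator<T>() {
            @Override
            public boolean hasNext() {
                // Drop exhausted inputs from the front until a live one is found.
                while (!queue.isEmpty() && !queue.peekFirst().hasNext()) {
                    queue.pollFirst();
                }
                return !queue.isEmpty();
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                // Take from the front input, then rotate it to the back of the queue.
                Iterator<? extends T> current = queue.pollFirst();
                T value = current.next();
                queue.addLast(current);
                return value;
            }
        };
    }
}
```

Called with the four iterators from the sample, this sketch yields the elements in the order listed in the output above. Inputs are consumed lazily, so nothing is read from an iterator until its turn comes up.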

Original issue reported on code.google.com by chris.wi...@gmail.com on 5 Aug 2011 at 7:24

GoogleCodeExporter commented 9 years ago
Another way to look at it:

Iterator<List<String>> interleaved = Iterators.interleave( 
    a.iterator(), b.iterator(),
    c.iterator(), d.iterator());

Output:

["one",   "fee",    "broccoli", "purple"]
["two",   "fi",     "tomato"]
["three", "potato"]
["four"]

The shorter rows could perhaps be padded with null as well? This is like 
partition, but with multiple input iterators.
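
A rough sketch of this grouped variant, assuming an eager List<List<T>> result instead of the lazy Iterator<List<String>> shown above, purely to keep the example short. RowsSketch and rows() are made-up names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper, not part of Guava. */
final class RowsSketch {
    private RowsSketch() {}

    /** Each pass over the inputs takes one element from every non-exhausted input and forms one row. */
    static <T> List<List<T>> rows(List<? extends Iterator<? extends T>> inputs) {
        List<List<T>> result = new ArrayList<>();
        while (true) {
            List<T> row = new ArrayList<>();
            for (Iterator<? extends T> input : inputs) {
                if (input.hasNext()) {
                    row.add(input.next());
                }
            }
            // An empty row means every input is exhausted.
            if (row.isEmpty()) {
                return result;
            }
            result.add(row);
        }
    }
}
```

This version produces ragged rows, as in the output above. A null-padded variant would need a separate "all inputs exhausted" check, because a padded row is never empty.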

Original comment by amer...@gmail.com on 5 Aug 2011 at 9:58

GoogleCodeExporter commented 9 years ago
I think that output would be fine since it can be sent to a separate 
Iterators.concat() method to flatten it to a single list.

Original comment by chris.wi...@gmail.com on 6 Aug 2011 at 4:16

GoogleCodeExporter commented 9 years ago
Looks interesting to me. Can you supply a use case or two?

However, the name "zip" seems to be generally understood as something else 
(namely pulling a single element from all iterators at once and returning a 
tuple whose length equals the number of iterators). [I see this has already 
been pointed out on the mailing list.] "Interleave" is fine, though.

Original comment by j...@nwsnet.de on 16 Aug 2011 at 8:04

GoogleCodeExporter commented 9 years ago
One use case is when you retrieve data from a third party as several lists 
whose elements are related to each other by position. To process this data you 
end up iterating over several collections at once, which is neither logical 
nor readable. What you really need is to interleave the data so that related 
elements sit next to each other.

Example :

names = ["robert", "paul", "mike"]
ages = [25, 32, 45]

interleave(names, ages) produces :
[
["robert", 25],
["paul", 32],
["mike", 45]
]

so now you can use a predicate to filter people whose age is under 30.

Original comment by amer...@gmail.com on 16 Aug 2011 at 12:06

GoogleCodeExporter commented 9 years ago
> A use case is when you retrieve some data from a third party
> which come in many list of single element which are related to each other.
An iterable of single-element iterables is easily pre-processed via 
`Itera[tor|ble]s.concat`.

> names = ["robert", "paul", "mike"]
> ages = [25, 32, 45]
> interleave(names, ages) produces :
> [
> ["robert", 25],
> ["paul", 32],
> ["mike", 45]
> ]
This is exactly what `zip` commonly does.

I agree that the output example in comment #1 is a useful intermediate step to 
return before serializing the elements of all result tuples.

Also, what is the preferred result type? Guava still has no implementation of an 
n-tuple or even pair and triple concept. Immutable lists would be ok, but 
they'd work only for immutable elements. Iterators might also work (like 
returned by `itertools.groupby` in Python), but are not that comfortable to 
work with for direct/indexed element access.

BTW, this seems quite similar to `partition`, but somewhat the other way round.

Original comment by j...@nwsnet.de on 16 Aug 2011 at 1:20

GoogleCodeExporter commented 9 years ago
@amer: names and ages have to have compatible types; in your case the common 
type would be just Object. The way interleave is proposed to work won't let you 
filter people easily. For your use case it is more reasonable to create a 
Person class and build an iterable of people by iterating over names and ages 
and instantiating Person objects from them. The other choice is to do the same 
without a new class, by using a Map.
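
A sketch of the Person approach, under the assumption that names and ages line up by position. The class, field and method names here are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical example code, not part of Guava. */
final class PersonSketch {
    private PersonSketch() {}

    /** Hypothetical entity combining one name with one age. */
    static final class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    /** Walks both lists in step and builds one Person per position; stops at the shorter list. */
    static List<Person> combine(List<String> names, List<Integer> ages) {
        List<Person> people = new ArrayList<>();
        Iterator<String> nameIt = names.iterator();
        Iterator<Integer> ageIt = ages.iterator();
        while (nameIt.hasNext() && ageIt.hasNext()) {
            people.add(new Person(nameIt.next(), ageIt.next()));
        }
        return people;
    }

    /** Keeps only the people whose age is strictly below the limit. */
    static List<Person> youngerThan(List<Person> people, int limit) {
        List<Person> matches = new ArrayList<>();
        for (Person person : people) {
            if (person.age < limit) {
                matches.add(person);
            }
        }
        return matches;
    }
}
```

With a typed entity, the "under 30" filter reads a typed int field and needs no casts from Object.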

This is a classic example of Java failing at tuples. Interleave makes sense only 
for compatible types, e.g. the same type, as in the String example above. I am 
not sure a reasonable use case for this exists. If the Iterables contain related 
data, you should combine them into an entity. If the interleaved order is 
important, that should be handled by Ordering, but I cannot imagine when such 
an Ordering would be useful.

Original comment by gscerbak@gmail.com on 16 Aug 2011 at 1:44

GoogleCodeExporter commented 9 years ago
@yo.gi: Look at this:

Pair in JDK 1.6 - java.util.AbstractMap.SimpleEntry<K,V> and 
java.util.AbstractMap.SimpleImmutableEntry<K,V>

Pair in Guava r09 - com.google.common.collect.Maps#immutableEntry

Triple in Guava r09 - com.google.common.collect.Tables#immutableCell
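
To illustrate the JDK option, a small sketch that zips two iterators into java.util.AbstractMap.SimpleImmutableEntry pairs. ZipSketch and zip() are made-up names, and stopping at the shorter input is an assumption.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Guava or the JDK. */
final class ZipSketch {
    private ZipSketch() {}

    /** Pairs up elements by position, using Map.Entry as a stand-in for a pair type. */
    static <A, B> List<Map.Entry<A, B>> zip(Iterator<? extends A> first, Iterator<? extends B> second) {
        List<Map.Entry<A, B>> pairs = new ArrayList<>();
        // Stops as soon as either input runs out (an assumption; padding is another option).
        while (first.hasNext() && second.hasNext()) {
            pairs.add(new SimpleImmutableEntry<A, B>(first.next(), second.next()));
        }
        return pairs;
    }
}
```

The accessors are getKey() and getValue(), which reflects the map-entry origin of the type rather than a general pair.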

Original comment by gscerbak@gmail.com on 16 Aug 2011 at 2:12

GoogleCodeExporter commented 9 years ago
Those are specializations of a tuple (as suggested by literature), but they 
have different semantics than what we need here (no maps, no tables). Reusing 
them would violate the principle of least surprise, IMHO.

Original comment by j...@nwsnet.de on 16 Aug 2011 at 3:32

GoogleCodeExporter commented 9 years ago
@yo.gi: You are right. They can be used pragmatically as tuples and triples, 
but they were not intended to be; their semantics (their meaning) is different. 
The problem is that in Java it is probably not possible to do anything more 
sensible.

Original comment by gscerbak@gmail.com on 16 Aug 2011 at 5:29

GoogleCodeExporter commented 9 years ago
@gscerbak: OK, mixing types is not very pretty, but right now I'm doing 
presentation using JSP with the expression language, which does not really care 
about types.
But I like your idea of handling it with Ordering. I'll think about it.

Original comment by amer...@gmail.com on 16 Aug 2011 at 5:41

GoogleCodeExporter commented 9 years ago

Original comment by kevin...@gmail.com on 1 Sep 2011 at 5:43

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
May I suggest these signatures:

class Iterators {
    public static <T> Iterator<T> interleave( Iterable<Iterator<T>> ) { ... }
    public static <T> Iterator<Iterator<T>> transpose( Iterable<Iterator<T>> ) { ... }
}

I think 'transpose' would be a better name for the functionality originally 
asked for in this issue. An Iterable<Iterator<T>> can be seen as a jagged, 
sparse matrix and the name 'transpose' suggests that the result is another 
matrix, just with rows and columns swapped.

That leaves the name 'interleave' free for the equivalent of Iterators.concat( 
Iterators.transpose( x ) ). Note that I am not sure interleave should be 
implemented that way (I have an implementation and think that it shouldn't); 
it just illustrates the relationship between the two methods.
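
The relationship can be sketched in plain java.util as follows. This is not the implementation mentioned above, only an illustration. TransposeSketch is a made-up class, and building each column eagerly inside next() is an assumption made for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical helper, not part of Guava. */
final class TransposeSketch {
    private TransposeSketch() {}

    /** Each inner iterator holds the next element of every input that still has one. */
    static <T> Iterator<Iterator<T>> transpose(Iterable<Iterator<T>> inputs) {
        final List<Iterator<T>> sources = new ArrayList<>();
        for (Iterator<T> input : inputs) {
            sources.add(input);
        }
        return new Iterator<Iterator<T>>() {
            @Override
            public boolean hasNext() {
                for (Iterator<T> source : sources) {
                    if (source.hasNext()) {
                        return true;
                    }
                }
                return false;
            }

            @Override
            public Iterator<T> next() {
                List<T> column = new ArrayList<>();
                for (Iterator<T> source : sources) {
                    if (source.hasNext()) {
                        column.add(source.next());
                    }
                }
                if (column.isEmpty()) {
                    throw new NoSuchElementException();
                }
                return column.iterator();
            }
        };
    }

    /** Flattens the transposed view, mirroring concat( transpose( x ) ). */
    static <T> Iterator<T> interleave(Iterable<Iterator<T>> inputs) {
        final Iterator<Iterator<T>> columns = transpose(inputs);
        return new Iterator<T>() {
            private Iterator<T> current = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                // Advance to the next non-empty column, if there is one.
                while (!current.hasNext() && columns.hasNext()) {
                    current = columns.next();
                }
                return current.hasNext();
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }
}
```

A direct interleave could avoid the intermediate column lists entirely, which may be one reason to implement it separately.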

Original comment by han...@eyealike.com on 28 Oct 2011 at 7:07

GoogleCodeExporter commented 9 years ago
For the record: "jagged, sparse matrix" makes me totally think of Guava's table 
implementation.

Original comment by j...@nwsnet.de on 29 Oct 2011 at 6:40

GoogleCodeExporter commented 9 years ago

Original comment by fry@google.com on 10 Dec 2011 at 4:14

GoogleCodeExporter commented 9 years ago
Related: http://stackoverflow.com/q/9200080/869736 .

Original comment by wasserman.louis on 12 Feb 2012 at 5:39

GoogleCodeExporter commented 9 years ago

Original comment by fry@google.com on 16 Feb 2012 at 7:17

GoogleCodeExporter commented 9 years ago
Do I understand correctly that this issue is now basically about transpose?  
Issue 203 pretty conclusively eliminated the possibility of Pair.

Original comment by wasserman.louis on 23 Feb 2012 at 7:27

GoogleCodeExporter commented 9 years ago
How do we feel about interleave?  I'm not sure how I feel about transpose, but 
interleave is simple and has broader utility.

Original comment by wasserman.louis on 7 Mar 2012 at 12:17

GoogleCodeExporter commented 9 years ago

Original comment by kevinb@google.com on 30 May 2012 at 7:43

GoogleCodeExporter commented 9 years ago

Original comment by kevinb@google.com on 22 Jun 2012 at 6:16

GoogleCodeExporter commented 9 years ago
We needed this functionality for creating test data.

Original comment by car...@medallia.com on 26 Dec 2012 at 8:55

GoogleCodeExporter commented 9 years ago
I have an implementation of this. I wonder if submitting the code for review is 
worthwhile given the age of this issue...?  Is the feature considered too 
esoteric for inclusion into Guava?

Original comment by kgil...@gmail.com on 26 Jan 2014 at 1:28

GoogleCodeExporter commented 9 years ago
FWIW my implementation is available here under an MIT-style license: 
https://github.com/iheartradio/interleaver

Original comment by kgil...@gmail.com on 6 Mar 2014 at 10:32

GoogleCodeExporter commented 9 years ago
This issue has been migrated to GitHub.

It can be found at https://github.com/google/guava/issues/<id>

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:15

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:18

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 3 Nov 2014 at 9:09