usmanm77 / morelinq

Automatically exported from code.google.com/p/morelinq
Apache License 2.0

Replace ImbalancedZipStrategy with distinct Zip* methods #24

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
The current use of an enumeration (ImbalancedZipStrategy) to express the 
required behavior seems awkward because the strategy is not something you 
want to be able to vary externally, which is usually why you would make 
something a parameter or argument. Plus, it's a little cumbersome to write 
and read back in code and not very LINQ-ish in spirit. The enumeration could 
certainly be used internally for sharing the implementation logic, but the 
public API should use distinct and clear operator names that imply the 
strategy. For example:

- EquiZip:
  Sequences must be equal in length; otherwise an error is thrown

- Zip: 
  Truncates to the shorter of the two sequences

- ZipLongest: 
  Uses the longer of the two sequences, padding the shorter one
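
As a rough illustration of the split proposed above, here is a minimal sketch 
in which three public operators share one private implementation. It is not 
MoreLINQ's actual code: the names ZipSketch, ZipImpl and Strategy are 
hypothetical, and padding with default values is an assumption. The sketch 
sits in its own namespace so that its Zip is found before the 
System.Linq.Enumerable.Zip that later .NET versions added.

```csharp
using System;
using System.Collections.Generic;

namespace ZipSketchDemo
{
    public static class ZipSketch
    {
        // Hypothetical internal enumeration; it never appears in a public signature.
        private enum Strategy { Fail, Truncate, Pad }

        // Sequences must be equal in length, otherwise an exception is thrown.
        public static IEnumerable<TResult> EquiZip<TFirst, TSecond, TResult>(
            this IEnumerable<TFirst> first, IEnumerable<TSecond> second,
            Func<TFirst, TSecond, TResult> resultSelector)
        {
            return ZipImpl(first, second, resultSelector, Strategy.Fail);
        }

        // Stops at the end of the shorter sequence.
        public static IEnumerable<TResult> Zip<TFirst, TSecond, TResult>(
            this IEnumerable<TFirst> first, IEnumerable<TSecond> second,
            Func<TFirst, TSecond, TResult> resultSelector)
        {
            return ZipImpl(first, second, resultSelector, Strategy.Truncate);
        }

        // Runs to the end of the longer sequence, padding with default values (assumed).
        public static IEnumerable<TResult> ZipLongest<TFirst, TSecond, TResult>(
            this IEnumerable<TFirst> first, IEnumerable<TSecond> second,
            Func<TFirst, TSecond, TResult> resultSelector)
        {
            return ZipImpl(first, second, resultSelector, Strategy.Pad);
        }

        private static IEnumerable<TResult> ZipImpl<TFirst, TSecond, TResult>(
            IEnumerable<TFirst> first, IEnumerable<TSecond> second,
            Func<TFirst, TSecond, TResult> resultSelector, Strategy strategy)
        {
            using (var e1 = first.GetEnumerator())
            using (var e2 = second.GetEnumerator())
            {
                while (true)
                {
                    bool has1 = e1.MoveNext();
                    bool has2 = e2.MoveNext();
                    if (!has1 && !has2) yield break;      // both exhausted
                    if (has1 != has2)                     // lengths differ
                    {
                        if (strategy == Strategy.Truncate) yield break;
                        if (strategy == Strategy.Fail)
                            throw new InvalidOperationException("Sequences differ in length.");
                    }
                    yield return resultSelector(
                        has1 ? e1.Current : default(TFirst),
                        has2 ? e2.Current : default(TSecond));
                }
            }
        }
    }

    public static class Program
    {
        public static void Main()
        {
            var letters = new[] { "a", "b", "c" };
            var numbers = new[] { 1, 2 };

            Console.WriteLine(string.Join(" ", letters.Zip(numbers, (l, n) => l + n)));        // a1 b2
            Console.WriteLine(string.Join(" ", letters.ZipLongest(numbers, (l, n) => l + n))); // a1 b2 c0

            try
            {
                Console.WriteLine(string.Join(" ", letters.EquiZip(numbers, (l, n) => l + n)));
            }
            catch (InvalidOperationException)
            {
                Console.WriteLine("EquiZip threw"); // lengths 3 and 2 differ
            }
        }
    }
}
```

The call site reads as the strategy itself (letters.ZipLongest(...)), and no 
caller ever needs to reference the enumeration.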

There is a very subtle yet major idea behind extension methods and LINQ 
that the enumeration also works against. If the enumerable source type 
wants to provide an optimization for an operator, it can do so by 
providing a method with the same name and signature as the operator in 
question. If we use an enumeration that comes from MoreLINQ, then the other 
type has to take on a dependency, which won't be looked upon lightly. If we 
take the approach of embodying the strategy in the name (assuming this 
is a compile-time decision), then we have simple signatures, with Zip, 
EquiZip and ZipLongest being distinct and clear names. Now, if there is a 
type called SuperDuperList<T> that wants to provide an optimization for 
Zip, EquiZip and ZipLongest (or any one of the three), then it can do so 
using only base and generic types. Right now, with the enumeration, 
there is no choice but to support all strategies and take a dependency on 
MoreLINQ! This is how Enumerable.Contains works. When using Contains on 
variables typed as List<T> or IList<T>, the LINQ extension method is not 
used! If one wants to force use of LINQ's Contains implementation, then 
one has to hop over AsEnumerable first.
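
To make the resolution rule concrete, here is a small sketch of a 
SuperDuperList<T> that supplies its own EquiZip. Everything apart from 
List<T> and AsEnumerable is hypothetical: SketchOperators stands in for a 
MoreLINQ-style operator class, Log exists only to show which method ran, and 
the early length check is just one optimization such a type might choose.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace ResolutionSketch
{
    // Records which implementation handled the most recent call (demo only).
    public static class Log { public static string LastCall = ""; }

    // Stand-in for a general-purpose operator defined as an extension method.
    public static class SketchOperators
    {
        public static IEnumerable<TResult> EquiZip<TFirst, TSecond, TResult>(
            this IEnumerable<TFirst> first, IEnumerable<TSecond> second,
            Func<TFirst, TSecond, TResult> resultSelector)
        {
            Log.LastCall = "extension";
            return Iterate(first, second, resultSelector);
        }

        internal static IEnumerable<TResult> Iterate<TFirst, TSecond, TResult>(
            IEnumerable<TFirst> first, IEnumerable<TSecond> second,
            Func<TFirst, TSecond, TResult> resultSelector)
        {
            using (var e1 = first.GetEnumerator())
            using (var e2 = second.GetEnumerator())
            {
                while (true)
                {
                    bool has1 = e1.MoveNext();
                    bool has2 = e2.MoveNext();
                    if (has1 != has2)
                        throw new InvalidOperationException("Sequences differ in length.");
                    if (!has1) yield break;
                    yield return resultSelector(e1.Current, e2.Current);
                }
            }
        }
    }

    // Same operator name and shape, expressed only with base and generic types.
    public class SuperDuperList<T> : List<T>
    {
        public IEnumerable<TResult> EquiZip<TSecond, TResult>(
            IEnumerable<TSecond> second, Func<T, TSecond, TResult> resultSelector)
        {
            Log.LastCall = "instance";
            // Possible optimization: fail before enumerating when both counts are known.
            var collection = second as ICollection<TSecond>;
            if (collection != null && collection.Count != Count)
                throw new InvalidOperationException("Sequences differ in length.");
            return SketchOperators.Iterate(this, second, resultSelector);
        }
    }

    public static class Program
    {
        public static void Main()
        {
            var list = new SuperDuperList<int> { 1, 2, 3 };
            var names = new[] { "x", "y", "z" };

            list.EquiZip(names, (n, s) => s + n);
            Console.WriteLine(Log.LastCall); // instance

            list.AsEnumerable().EquiZip(names, (n, s) => s + n);
            Console.WriteLine(Log.LastCall); // extension
        }
    }
}
```

The compiler considers extension methods only when no applicable instance 
method exists, so the first call binds to SuperDuperList<T>.EquiZip. Had the 
operator taken a strategy enumeration as a parameter, the instance method 
could only match it by referencing that enumeration's assembly.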

This approach keeps MoreLINQ close to how LINQ operators should be 
designed, taking built-in ones as guidance.

P.S. This issue was created out of comments in issue #6.

Original issue reported on code.google.com by azizatif on 7 Apr 2009 at 7:35

GoogleCodeExporter commented 9 years ago
Implemented in r87.

Original comment by azizatif on 7 Apr 2009 at 7:38

GoogleCodeExporter commented 9 years ago
Okay, I'm on board with the idea of three methods. I'm not sure they're the 
right names though. In particular, *not* coming from a Python background, I'd 
expect an error from just "Zip" with unequal lengths.

How about:

Zip - throw error
ZipTruncate - stop at end of shorter sequence
ZipExtend - extend shorter sequence with default values

Original comment by jonathan.skeet on 7 Apr 2009 at 7:56

GoogleCodeExporter commented 9 years ago
I've changed the summary to not have the method names in it. As a result, this 
issue is now just about having ImbalancedZipStrategy live in the shadow of 
distinct method names that embody the strategy. Given that, I'm closing it 
again ;) because that bit is done.

@jonathan.skeet: Maybe open three new issues along the lines of:

- Consider renaming Zip
- Consider renaming EquiZip
- Consider renaming ZipLongest

Original comment by azizatif on 7 Apr 2009 at 8:57