Open eriknw opened 11 years ago
Accepts a default value as per get
, but it will still fail if indices are not non-negative integers:
no_default = '__no__default__'
def iterget(ind, seq, default=no_default):
indices = sorted(zip(ind, itertools.count()))
results = [default] * len(indices)
try:
val = next(seq)
prev_index = 0
for index, count in indices:
if index != prev_index:
n = index - prev_index - 1
val = next(itertools.islice(seq, n, n + 1))
prev_index = index
results[count] = val
except StopIteration:
if default == no_default:
raise IndexError('index out of range')
return tuple(results)
>>> iterget([1, 10, 3, 11, 2], iter(range(10)), default=None)
(1, None, 3, None, 2)
Hi Erik,
Nice code -- it's quite a tough problem to solve without resorting to tee
.
The functions need to be amended slightly to work with iterables, which by definition have no next
or __next__
method.
For example:
>>> iterget([1, 3, 5, 2, 3], iter(range(100, 200)))
(101, 103, 105, 102, 103) # works as expected
>>> iterget([1, 3, 5, 2, 3], range(100, 200))
Traceback (most recent call last):
File "<pyshell#16>", line 1, in <module>
iterget([1, 3, 5, 2, 3], range(100, 200))
File "<pyshell#14>", line 4, in iterget
val = next(seq)
TypeError: 'range' object is not an iterator
If we add one extra line at the start of the function we can fix the problem.
def iterget(ind, seq):
seq = iter(seq) # only change required
indices = sorted(zip(ind, itertools.count()))
results = []
val = next(seq)
prev_index = 0
for index, count in indices:
if index != prev_index:
n = index - prev_index - 1
val = next(itertools.islice(seq, n, n + 1))
prev_index = index
results.append((count, val))
return tuple(item for _, item in sorted(results))
>>> iterget([1, 3, 5, 2, 3], iter(range(100, 200)))
(101, 103, 105, 102, 103)
>>> iterget([1, 3, 5, 2, 3], range(100, 200))
(101, 103, 105, 102, 103)
My view is that get
depends on indexing while take
, drop
, nth
depend on iteration. I'm not too concerned about get
failing on lazy iterators. If we do allow this then the documentations should probably be updated.
Yeah, I was actually thinking about allowing nth
to accept a list of items to return, so it would be analogous to get
but for iterators and iterables in general. Is this appropriate for nth
, or should a new function be defined?
I would prefer extending nth
before making a new function. Do you have a particular use case?
Do you have a particular use case?
Not at present, but I'll try to come up with one, because I can see using this at some point. I thought this was a neat convenience, and intuitive especially when juxtaposed with get
.
If it is decided that being able to pass a list of indices to nth
is a good idea (because it is convenient), I made the following branch:
https://github.com/eriknw/toolz/tree/nth_with_list and the diff https://github.com/eriknw/toolz/compare/nth_with_list
I found a use case for this. In the future I may need to efficiently select certain rows out of a lazily read file. I might be doing this a lot (and in strange ways) so I want it to be simple (hence in toolz).
This might be an interesting case to test how cytoolz
responds to changes in toolz
. I'm not in any rush to do this. I figure I'll wait until after @eriknw releases cytoolz
to PyPI.
It was brought up in #93 that
get
didn't work on iterators, but it could be made to do so. This would act as a generalizednth
that could accept a list of indices to fetch. I thought this was interesting, so I developed the following:and some output:
I don't know if this is the best or cleanest way to do it, and it currently doesn't accept default arguments, but I thought this was neat enough to share.
So, should we allow
get
to work on iterators? Should there instead be a separate function for getting multiple items from an iterator?