Open jtrakk opened 5 years ago
Would val = first(lst) work? This would raise an exception if there are no elements. In your application, you can assert len(lst) == 1.
The implementation I'm thinking of is
def only(iterable: Iterable[T]) -> T:
    """Extract the only value in an iterable containing a single item."""
    [value] = iterable
    return value
Looking at the docs for itertoolz, it looks like there is a very careful effort to make all the functions work on iterables of any length. This seems to be a very specialized function and would seem to me inconsistent with the rest of the API. What would be the use case?
all the functions work on iterables of any length
That's a good point.
I find the [value] = iterable trick comes in handy very often though. For example, I expect there is only one item with id of 99.
Usually people handle this by
value = next(x for x in items if x.id == 99)
but this of course masks a bug if there are accidentally two or more matching items.
So the solution is
[value] = (x for x in items if x.id == 99)
but if you've never seen that before, it's not immediately obvious what's going on. Giving it a name and docstring will help:
value = toolz.only(x for x in items if x.id == 99)
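To make the difference concrete, here is a minimal sketch comparing the two idioms on data where an id is accidentally duplicated. The Item class and the sample list are made up for illustration and are not part of toolz.

```python
from dataclasses import dataclass


@dataclass
class Item:
    # Hypothetical record type, used only for this illustration.
    id: int
    name: str


# Two items accidentally share id 99.
items = [Item(99, "first"), Item(7, "other"), Item(99, "duplicate")]

# next() returns the first match and never looks at the rest,
# so the duplicate goes unnoticed.
first_match = next(x for x in items if x.id == 99)

# Single-target unpacking requires exactly one match and raises
# ValueError when there are zero matches or more than one.
try:
    [value] = (x for x in items if x.id == 99)
    unpack_error = None
except ValueError as exc:
    unpack_error = exc
```

After running this, first_match holds the first of the two matching items, while unpack_error holds the ValueError raised by the unpacking form.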
I guess I've never had a use case like this come up. Perhaps @eriknw could weigh in?
Thanks for the thoughtful suggestion and discussion. I've been giving this some thought. I've used this pattern a couple of times over the last couple of years. I usually do the following:
value, = items # unpack single item
But it's probably clearer to use @jtrakk's suggestion:
[value] = items # unpack single item
Note that I always include a comment with this operation for clarity. To compare,
value = only(items)
isn't that bad at all. I'm warming up to this. I prefer the name only.
I've had this come up a couple more times, and during a code review somebody said that [value] = items is weird and needs a comment.
I like the proposed functionality, and I like both suggested names, only and single. Does anybody have objections to adding this or a preference for the name? I may slightly prefer only simply because it's shorter.
So, I've actually come across a use case for this: np.ndenumerate on a 1d array. It could be argued that I could use enumerate; however, I like to stick to numpy land when dealing with numpy arrays. The indexes are returned as single-value tuples, which I had to unpack before I could use them. Having something like this would have let me write:
idx = ((only(ix), val) for ix, val in np.ndenumerate(arr))
I think we need to clarify proper behavior for failure cases. What should happen in failure cases? If I pass a sequence of more than one item, should an error be thrown? Passing an iterable of three elements, the suggested implementation will eat two values before throwing an exception. Do we want to avoid that?
If I pass a sequence of more than one item, should an error be thrown?
Yes. Otherwise you can just use next(iter(lst)).
Passing an iterable of three elements, the suggested implementation will eat two values before throwing an exception. Do we want to avoid that?
I don't think it's avoidable. You can't know that there are more items in an iterator until you try to pull the next one out.
In [1]: lst = [1,2,3]
In [2]: it = iter(lst)
In [3]: [x] = it
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-3-ba741f2ffd3e> in <module>
----> 1 [x] = it
ValueError: too many values to unpack (expected 1)
In [4]: list(it)
Out[4]: [3]
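The same behavior can be reproduced with explicit next() calls, which also leaves room for a more descriptive error message than the one unpacking produces. This is only a sketch of what such a function might look like, not the implementation adopted by toolz; the _MISSING sentinel and the error messages are my own assumptions.

```python
from typing import Iterable, TypeVar

T = TypeVar("T")

# Hypothetical sentinel marking "the iterator had nothing left".
_MISSING = object()


def only(iterable: Iterable[T]) -> T:
    """Return the single item of iterable, or raise ValueError."""
    it = iter(iterable)
    value = next(it, _MISSING)
    if value is _MISSING:
        raise ValueError("expected exactly one item, got none")
    # The only way to learn whether more items exist is to pull one,
    # so a second item is consumed before the error is raised.
    if next(it, _MISSING) is not _MISSING:
        raise ValueError("expected exactly one item, got more than one")
    return value


it = iter([1, 2, 3])
try:
    only(it)
except ValueError:
    pass
remaining = list(it)  # [3]: two items were consumed, as in the session above
```

As with the unpacking version, two items are gone from the iterator by the time the error surfaces.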
What would a more general unpacking function look like? Essentially, a special case of take that verifies length and returns values suitable for Python's unpacking mechanisms.
# Unpack a 1 element sequence
# value == 1 (this is a special case)
value = unpack(1, [1])
# Unpack 2 elements from sequence.
# a == 1, b == 2
a, b = unpack(2, [1, 2])
# Throw ValueError if there aren't enough values
a, b = unpack(2, [1])
# Also throws ValueError. Too many values. Maybe just drop the extra values?
a, b = unpack(2, [1, 2, 3, 4])
Thoughts?
That's very similar to itertools.islice().
@jtrakk you're correct. take is basically an alias for itertools.islice in pytoolz. The main addition in unpack would be the length checking.
A possible implementation could be:
import itertools

def unpack(n, seq):
    seq = iter(seq)
    rv = tuple(itertools.islice(seq, n))
    # check that we have enough elements
    if len(rv) != n:
        raise ValueError
    # check we don't have more values in seq
    for el in seq:
        raise ValueError
    if n == 1:
        return rv[0]
    return rv
I'm not convinced a generalized version, unpack, is necessary, because
breakfast, lunch = i_haz_two_cheezburgers
is already concise and readable (and using (x, y) = ... or [x, y] = ... is also fine).
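As a quick check of that claim, the sketch below shows that ordinary multi-target unpacking already rejects both too few and too many values, which is the length checking unpack would add. The helper name try_unpack_two is hypothetical and exists only for this demonstration.

```python
def try_unpack_two(seq):
    """Return (a, b) if seq has exactly two items, else the error message."""
    try:
        a, b = seq
    except ValueError as exc:
        # Plain unpacking raises ValueError on any length mismatch.
        return str(exc)
    return (a, b)


exact = try_unpack_two([1, 2])            # (1, 2)
too_few = try_unpack_two([1])             # a ValueError message
too_many = try_unpack_two([1, 2, 3, 4])   # a ValueError message
```

Only the single-item case lacks an equally readable spelling, which is where only would help.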
I'm +1 for only, and am open to other names.
Sometimes I expect an iterable to have only one value, and I want to pull it out, failing if there are 0 values or more than one value.
One way to do that is
but this is rather implicit and hard to read if you've never seen it before.
Better would be
or