petl-developers / petl

Python Extract Transform and Load Tables of Data
MIT License

iterrowslice raises `RuntimeError` instead of `StopIteration` #575

Closed rkulinski closed 2 years ago

rkulinski commented 2 years ago

Minimal, reproducible code sample, a copy-pastable example if possible

https://github.com/petl-developers/petl/blob/master/petl/transform/basics.py#L730

Problem description

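On newer Python versions (3.7 and later), the line linked above results in a RuntimeError instead of raising StopIteration. This is the behaviour introduced by PEP 479: a StopIteration that escapes a generator body is converted to RuntimeError.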

See similar case: https://stackoverflow.com/questions/51700960/runtimeerror-generator-raised-stopiteration-every-time-i-try-to-run-app

I think this should be solved as in the SO post above.
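A minimal sketch of the kind of fix described in that post, assuming a simplified stand-in for a petl row generator. The names `unguarded_rows`, `guarded_rows` and `outcome` are hypothetical and are not petl functions. The idea is to catch the `StopIteration` from `next()` inside the generator and return, so that nothing escapes the generator body:

```python
def unguarded_rows(source):
    # Hypothetical stand-in for a petl row generator, not petl's actual code.
    it = iter(source)
    hdr = next(it)  # StopIteration escapes the generator body if source is empty
    yield hdr
    for row in it:
        yield row


def guarded_rows(source):
    # Same generator, with the next() call guarded.
    it = iter(source)
    try:
        hdr = next(it)
    except StopIteration:
        return  # end the generator cleanly; consumers see an empty iterator
    yield hdr
    for row in it:
        yield row


def outcome(gen_func, source):
    # Helper for illustration: consume the generator and report what happened.
    try:
        return list(gen_func(source))
    except RuntimeError as e:
        return type(e).__name__
```

On Python 3.7 or later, `outcome(unguarded_rows, [])` reports a `RuntimeError`, while `outcome(guarded_rows, [])` gives an empty list. Whether an empty result or a more descriptive exception is the right behaviour for petl is a design decision this sketch does not settle.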

Version and installation information

Relevant for all versions.

For anyone dealing with this bug, we used the following workaround in our project:

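    try:
        next(table)
    except RuntimeError as e:
        # Python 3.7+ converts a StopIteration raised inside a generator to RuntimeError
        if isinstance(e.__cause__, StopIteration):
            raise StopIteration
        # Any other RuntimeError is re-raised unchanged, keeping its traceback
        raise

augustomen commented 12 months ago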

iterrowslice is but one of the many places (100+ by my count) where next(it) is called unguarded and could potentially raise RuntimeError. I wouldn't say this issue is completely fixed. Example:

>>> select([], 'True')
Traceback (most recent call last):
  File "petl/transform/selects.py", line 130, in iterrowselect
    hdr = next(it)
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "petl/util/vis.py", line 135, in _table_repr
    return str(look(table))
  File "petl/util/vis.py", line 104, in __repr__
    table, overflow = _vis_overflow(self.table, self.limit)
  File "petl/util/vis.py", line 528, in _vis_overflow
    table = list(islice(table, 0, limit+2))
RuntimeError: generator raised StopIteration

This error only occurs when an iterable class (that is, one implementing __iter__, like all Views) returns an iterator instance (a generator), which then calls an unguarded next(it). Such is the case for dicts([]), namedtuples([]), records([]) and many others. It does not happen in cases where the function is called directly, e.g.:

>>> rowgroupby([], 'foo')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "petl/util/base.py", line 698, in rowgroupby
    hdr = next(it)
StopIteration

...which raises a StopIteration, which I think is undesirable but understandable.
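The difference between the two cases can be reproduced without petl. The following is a sketch only: `RowsView`, `iter_rows`, `header_of` and `error_name` are hypothetical names chosen for illustration, assuming a view whose `__iter__` returns a generator and a plain function that calls `next(it)` directly:

```python
def iter_rows(source):
    # Generator: the body runs lazily, when the view is iterated.
    it = iter(source)
    hdr = next(it)  # unguarded; StopIteration here becomes RuntimeError
    yield hdr
    yield from it


class RowsView:
    # Hypothetical iterable class, loosely modelled on how a view delegates to a generator.
    def __init__(self, source):
        self.source = source

    def __iter__(self):
        return iter_rows(self.source)


def header_of(source):
    # Plain function (not a generator): StopIteration propagates as-is.
    it = iter(source)
    return next(it)


def error_name(func):
    # Helper for illustration: return the name of the exception raised, if any.
    try:
        func()
    except Exception as e:
        return type(e).__name__
    return None
```

On Python 3.7 or later, `error_name(lambda: list(RowsView([])))` reports `RuntimeError`, while `error_name(lambda: header_of([]))` reports `StopIteration`, matching the two tracebacks above.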