static-frame / frame-fixtures

Use compact expressions to create diverse, deterministic DataFrame fixtures with StaticFrame

StopIteration when creating a large frame #1

Open ForeverWintr opened 3 years ago

ForeverWintr commented 3 years ago

I ran into this while trying to build a large frame for performance testing. I think it's a bug?

f1 = ff.parse('s(200000,4)|i(I,int)|c(I,str)|v(str)')   

Raises builtins.RuntimeError: generator raised StopIteration.

flexatone commented 2 years ago

Many thanks for isolating this issue. I cannot, however, reproduce it; might there be some context missing in how you were using it?

I have added a test repeating this same scenario: https://github.com/InvestmentSystems/frame-fixtures/blob/7b8edc4c20c430d4818e6b9f5d527c8b71d07f41/frame_fixtures/test/test_core.py#L176

ForeverWintr commented 2 years ago

Interesting! I am still able to reproduce it. Here is the full traceback:

import frame_fixtures as ff
ff.__version__
# '0.2.0'

f1 = ff.parse('s(200000,4)|i(I,int)|c(I,str)|v(str)')   

Raises:

Traceback (most recent call last):
  File "/home/rutherford/.env38/lib/python3.8/site-packages/frame_fixtures/core.py", line 622, in gen
    yield SourceValues.dtype_spec_to_array(
  File "/home/rutherford/.env38/lib/python3.8/site-packages/frame_fixtures/core.py", line 382, in dtype_spec_to_array
    return cls.dtype_to_array(np.dtype(dtype_spec),
  File "/home/rutherford/.env38/lib/python3.8/site-packages/frame_fixtures/core.py", line 359, in dtype_to_array
    array = np.array([next(gen) for _ in range(count)])
  File "/home/rutherford/.env38/lib/python3.8/site-packages/frame_fixtures/core.py", line 359, in <listcomp>
    array = np.array([next(gen) for _ in range(count)])
builtins.StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  Python Shell, prompt 2, line 1
    # Used internally for debug sandbox under external interpreter
  File "/home/rutherford/.env38/lib/python3.8/site-packages/frame_fixtures/core.py", line 730, in parse
    return Fixture.parse(dsl=dsl)
  File "/home/rutherford/.env38/lib/python3.8/site-packages/frame_fixtures/core.py", line 708, in parse
    tb, index, columns = cls._to_containers(constructors, str_to_type)
  File "/home/rutherford/.env38/lib/python3.8/site-packages/frame_fixtures/core.py", line 660, in _to_containers
    tb = cls._build_type_blocks(
  File "/home/rutherford/.env38/lib/python3.8/site-packages/frame_fixtures/core.py", line 627, in _build_type_blocks
    return str_to_type['TB'].from_blocks(gen()).consolidate()
  File "/home/rutherford/.env38/lib/python3.8/site-packages/static_frame/core/type_blocks.py", line 133, in from_blocks
    for block in raw_blocks:
builtins.RuntimeError: generator raised StopIteration
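
The chained traceback above is consistent with standard Python behavior since 3.7 (PEP 479): when a bare next() call inside a generator body hits an exhausted iterator, the resulting StopIteration is converted into RuntimeError: generator raised StopIteration. The following is a minimal sketch of that mechanism only. The names two_values and build_block are hypothetical and are not part of frame_fixtures, and the sketch does not explain why the value source would run dry at this particular size.

```python
def two_values():
    # Hypothetical finite source that can only supply two values.
    yield "a"
    yield "b"


def build_block(count):
    # A generator that pulls `count` values with bare next() calls,
    # similar in shape to the list comprehension in the traceback.
    source = two_values()
    yield [next(source) for _ in range(count)]


# Asking for no more values than the source holds works.
ok = list(build_block(2))

# Asking for one more lets StopIteration escape inside the generator,
# which Python 3.7+ re-raises as RuntimeError with the original
# StopIteration attached as __cause__.
try:
    list(build_block(3))
except RuntimeError as exc:
    message = str(exc)
    cause_name = type(exc.__cause__).__name__
```

If this is the mechanism at play, the question to answer is why the underlying value generator yields fewer items than requested for this size.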

flexatone commented 2 years ago

Very strange. Can you try a few different sizes beyond 200k to see if it always fails over that threshold?

ForeverWintr commented 2 years ago

Interestingly, I cloned the repo and ran the test you added, and it passes for me!

I am however able to reproduce the issue by changing the test:

def test_large_a() -> None:
    import frame_fixtures as ff

    f1 = ff.parse('s(200000,4)|i(I,int)|c(I,str)|v(str)')
    assert f1.shape == (200000, 4)

ForeverWintr commented 2 years ago

Actually, I take that back. I reverted my change and now the original test is failing for me too. I am very confused.

ForeverWintr commented 2 years ago

I have tried with 20_000 and 2_000_000. So far only 200_000 fails.
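
A size sweep like this can be scripted so that more row counts are easy to check. The sketch below assumes a hypothetical helper, sweep_sizes, that takes any builder callable. The fake_build stand-in only replays the outcomes reported in this thread so the sketch runs without frame_fixtures installed; it is not a model of the library's behavior.

```python
def sweep_sizes(build, sizes):
    # Call build(n) for each size; record None on success or the
    # RuntimeError message on failure.
    results = {}
    for n in sizes:
        try:
            build(n)
        except RuntimeError as exc:
            results[n] = str(exc)
        else:
            results[n] = None
    return results


def fake_build(n):
    # Hypothetical stand-in that mimics the reported outcomes only.
    if n == 200_000:
        raise RuntimeError("generator raised StopIteration")


report = sweep_sizes(fake_build, [20_000, 200_000, 2_000_000])
```

Against the real library, the builder would be something like lambda n: ff.parse(f's({n},4)|i(I,int)|c(I,str)|v(str)'), with a denser list of sizes around 200_000.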

flexatone commented 2 years ago

Very strange!