Describe the bug
Although maxsize on stages such as map() should limit the queue of items in the stage, it does not work as expected when multiple map() stages are chained: the first stage is drained immediately.
Note: in addition, each stage adds two extra items over maxsize.
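For comparison, the backpressure that maxsize is expected to provide can be sketched with the standard library alone. This is not pypeln's implementation; run_bounded_pipeline and its counters are hypothetical names used only for illustration. A producer hands items to a slow consumer through a queue.Queue(maxsize=...), and the function reports how many items were ever loaded but not yet processed at the same time.

```python
import queue
import threading
import time


def run_bounded_pipeline(n_items=10, maxsize=1):
    """Return the largest number of items loaded but not yet processed."""
    q = queue.Queue(maxsize=maxsize)  # bounded hand-off between the two stages
    lock = threading.Lock()
    counts = {"loaded": 0, "processed": 0, "max_in_flight": 0}
    done = object()  # sentinel that tells the consumer to stop

    def producer():
        for x in range(n_items):
            with lock:
                counts["loaded"] += 1
                in_flight = counts["loaded"] - counts["processed"]
                counts["max_in_flight"] = max(counts["max_in_flight"], in_flight)
            q.put(x)  # blocks while the queue is full: this is the backpressure
        q.put(done)

    def consumer():
        while True:
            x = q.get()
            if x is done:
                break
            time.sleep(0.01)  # some slow computation
            with lock:
                counts["processed"] += 1

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts["max_in_flight"]


if __name__ == "__main__":
    print(run_bounded_pipeline(n_items=10, maxsize=1))
```

In this sketch the producer can never be more than maxsize + 2 items ahead: up to maxsize items sit in the queue, one is held by the blocked producer, and one is held by the consumer. Whether the two extra items observed in pypeln have the same cause is an assumption, not something confirmed here.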
Minimal code to reproduce
import logging
import threading
import time

import pypeln as pl

# since print() among threads results in wrong ordering
logger = logging.getLogger('foo')
logging.basicConfig(level=logging.DEBUG)


def load(x):
    logger.debug(f"{threading.get_ident()} loading {x}")
    return x


def process(x):
    time.sleep(0.1)  # some slow computation
    return f"processed {x}"


def show_results(stage):
    for result in stage:
        logger.debug(f"{threading.get_ident()} result '{result}'")


stage = pl.thread.map(load, range(10), workers=1, maxsize=1)
stage = pl.thread.map(process, stage, workers=1)
show_results(stage)
Output (wrong):
DEBUG:foo:140324684944960 loading 0
DEBUG:foo:140324684944960 loading 1
DEBUG:foo:140324684944960 loading 2
DEBUG:foo:140324684944960 loading 3
DEBUG:foo:140324684944960 loading 4
DEBUG:foo:140324684944960 loading 5
DEBUG:foo:140324684944960 loading 6
DEBUG:foo:140324684944960 loading 7
DEBUG:foo:140324684944960 loading 8
DEBUG:foo:140324684944960 loading 9
DEBUG:foo:140325310869696 result 'processed 0'
DEBUG:foo:140325310869696 result 'processed 1'
DEBUG:foo:140325310869696 result 'processed 2'
DEBUG:foo:140325310869696 result 'processed 3'
DEBUG:foo:140325310869696 result 'processed 4'
DEBUG:foo:140325310869696 result 'processed 5'
DEBUG:foo:140325310869696 result 'processed 6'
DEBUG:foo:140325310869696 result 'processed 7'
DEBUG:foo:140325310869696 result 'processed 8'
DEBUG:foo:140325310869696 result 'processed 9'
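To put a number on "drained immediately", the log can be scanned for how far loading ran ahead of the results. The helper below is a hypothetical sketch (max_loaded_ahead is not part of pypeln); it assumes the log line format shown above and that the logging order approximates the real execution order.

```python
import re


def max_loaded_ahead(log_lines):
    """Return the largest count of items loaded but not yet reported as results."""
    loaded = 0
    finished = 0
    worst = 0
    for line in log_lines:
        if re.search(r"\bloading \d+$", line.strip()):
            loaded += 1
        elif re.search(r"\bresult '", line):
            finished += 1
        worst = max(worst, loaded - finished)
    return worst


if __name__ == "__main__":
    # Same shape as the output above: ten "loading" lines, then ten "result" lines.
    sample = [f"DEBUG:foo:140324684944960 loading {i}" for i in range(10)]
    sample += [f"DEBUG:foo:140325310869696 result 'processed {i}'" for i in range(10)]
    print(max_loaded_ahead(sample))  # prints 10
```

For the output above the value is 10, meaning every item was loaded before the first result appeared, even though maxsize=1 was set on the first stage.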
If we only have one stage it works well (OK):
If we set maxsize in the second stage, it limits the queue in the first stage, not the second (wrong):
Expected behavior
If maxsize=1 is set on the first stage, the result should look like the second output (the queue for the first stage should be limited).
If maxsize=1 is set on the second stage, the result should look like the first output (the queue for the first stage should not be limited).
Library Info
0.4.9