To Reproduce
I have a micro-service that reads data from an external source, buffers it, aggregates it, and sends it to another external service. The dummy snippet below is a simplified version of that micro-service and has the same issue.
The issue only occurs when the timer triggers the on_next call.
import time
from datetime import timedelta, datetime
from random import random

import reactivex
from reactivex import operators
from reactivex.scheduler import ThreadPoolScheduler


def main():
    scheduler = ThreadPoolScheduler(max_workers=4)
    reactivex.from_iterable(iterable()).pipe(
        operators.buffer_with_time_or_count(
            # set the timespan to only 2 seconds so that the timer triggers the on_next
            timespan=timedelta(seconds=2),
            count=10000,
        ),
    ).subscribe(
        on_next=on_next,
        on_error=print,
        on_completed=print,
        scheduler=scheduler,
    )
    time.sleep(1000)


def iterable():
    i = 0
    while True:
        yield i
        time.sleep(1 / 10)  # input network delay
        i += 1


def on_next(data):
    print(datetime.utcnow(), data[0], data[-1])
    time.sleep(random() * 5)  # mock output network delay


if __name__ == '__main__':
    main()
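To make the loss easy to spot without reading the printed ranges by eye, the callback could track the last value it saw and record any jump. The following is only a minimal sketch, assuming the source emits consecutive integers as iterable() does above; GapDetector is a hypothetical helper written for this report, not part of reactivex, and it is not thread-safe.

```python
class GapDetector:
    """Hypothetical helper: records gaps between buffers of consecutive integers."""

    def __init__(self):
        self.last_seen = None  # last value of the previous non-empty buffer
        self.gaps = []         # (first_missing, last_missing) pairs

    def check(self, data):
        if not data:  # guard against an empty buffer
            return
        if self.last_seen is not None and data[0] != self.last_seen + 1:
            self.gaps.append((self.last_seen + 1, data[0] - 1))
        self.last_seen = data[-1]


detector = GapDetector()
# Simulated buffers: values 5 through 8 never arrive.
for buffer in ([0, 1, 2], [], [3, 4], [9, 10]):
    detector.check(buffer)
print(detector.gaps)  # [(5, 8)]
```

Calling detector.check(data) at the top of on_next and printing detector.gaps at the end of a run would list the ranges that never reached the subscriber.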
Describe the bug
buffer_with_time_or_count loses data when the timer is the trigger and on_next releases the GIL.
Related issue: https://github.com/ReactiveX/RxPY/issues/702, but that issue can be solved by using a scheduler, since its on_next method does not release the GIL.
Script output
Running the script as is gives something along the lines of:
However, if I replace
with
the output now looks like:
Expected behavior
buffer_with_time_or_count should not lose any data. It did not matter which scheduler I used (or none at all).
Additional context
reactivex version: 4.0.0
Python version: 3.10.0/3.11.0