i2mint / stream2py

Bring data streams to python, with ease.
https://i2mint.github.io/stream2py/
Apache License 2.0

Review notes #12

Closed thorwhalen closed 2 years ago

thorwhalen commented 3 years ago

None sentinels

Where are we using None as a hardcoded sentinel? What are the distinct cases?

What possibilities are we constraining by using None as the hardcoded sentinel, and what can we offer the user to get out of these constraints (for example, a tool that wraps "real" Nones in some Literal)?

source_reader.py

andeaseme commented 3 years ago

None sentinels

SourceReader items must be sortable by the abstract method SourceReader.key and the items must produce a non-repeating key that is greater than the previous item. So with these constraints, None on its own cannot be a valid item. If you wish to use non-sortable or repeating values, you must additionally timestamp or enumerate the value to create a sortable item. For example, a numbered tuple (0, None). Simply put, sentinels are not a concern within stream2py and should be handled outside after decoupling from order keeping.
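As a minimal sketch of the "enumerate the value" idea above: the wrapper below numbers whatever a read function returns, so that even a bare None becomes a sortable, non-repeating item such as (0, None). The names numbered and key here are hypothetical illustrations, not part of stream2py.

```python
from itertools import count


def numbered(read):
    """Wrap a zero-argument read function so each result becomes (index, value).

    Hypothetical helper for illustration; not a stream2py function.
    """
    index = count()

    def numbered_read():
        value = read()  # read first, so a failed read does not consume an index
        return next(index), value

    return numbered_read


def key(item):
    """Sort key for a numbered item: the index, never the (possibly None) value."""
    return item[0]


values = iter([None, "a", None])
read = numbered(lambda: next(values))
items = [read(), read(), read()]
keys = [key(item) for item in items]
```

Because the key comes only from the index, repeated or non-sortable values never take part in the ordering.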

None sentinels issue is closed

source_reader.py

Instead of minimizing the required abstract methods, I would create a helper function that minimizes the boilerplate of creating a SourceReader and constructing a StreamBuffer in most cases.

def mk_stream_buffer(read, open=None, close=None, info=None, key=None, maxlen=100, sleep_time_on_read_none_s=None, auto_drop=True):

so only read will be required

counter = iter(range(1000))  # next() needs an iterator; a bare range object is not one
with mk_stream_buffer(read=lambda: next(counter)) as count_stream_buffer:
    count_reader = count_stream_buffer.mk_reader()
    for i in count_reader:
        print(i)

Handling exc_type, exc_val, exc_tb in __exit__ can be done by overriding the sensible default.
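To make the helper idea concrete, here is a rough, self-contained sketch of how a reader could be assembled from plain callables with only read required. It deliberately does not use stream2py's own classes; mk_reader_from_callables and CallableReader are hypothetical names, and the defaults chosen (a no-op open/close, an empty info dict, the item as its own key) are assumptions for illustration only.

```python
from itertools import count


def mk_reader_from_callables(read, open=None, close=None, info=None, key=None):
    """Hypothetical helper: build a context-managed reader from plain callables.

    Only `read` is required; every other hook falls back to a simple default.
    """
    # Alias the arguments so the methods below cannot be confused with them.
    _read, _open, _close, _info, _key = read, open, close, info, key

    class CallableReader:
        def open(self):
            if _open is not None:
                _open()

        def close(self):
            if _close is not None:
                _close()

        def read(self):
            return _read()

        def info(self):
            return _info() if _info is not None else {}

        def key(self, item):
            return _key(item) if _key is not None else item

        def __enter__(self):
            self.open()
            return self

        def __exit__(self, exc_type, exc_val, exc_tb):
            # Assumed default: always close, never suppress the exception.
            self.close()
            return False

    return CallableReader()


counter = count()
events = []
reader = mk_reader_from_callables(
    read=lambda: next(counter),
    close=lambda: events.append("closed"),
)
with reader as r:
    first_three = [r.read() for _ in range(3)]
```

A subclass that needs different exception handling could override __exit__ and leave the other defaults alone.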

thorwhalen commented 2 years ago

About the sentinels, your comments put me a bit more at ease. I'm still left with (1) a slight apprehension that future developers might rewrite or extend the code in a way that breaks this assumption, and (2) a feeling that Python doesn't offer any obvious tools to make sentinels, or even to distinguish the various roles that None could have. Would naming the different Nones do more harm than good? (That is, NotSpecified = None; NoNewData = None; etc.)
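One way to weigh the naming question: aliases of None document intent but remain the same object, so code cannot tell the roles apart. The sketch below contrasts that with distinct sentinel objects, built here as Enum members. This is only one possible approach, not something the thread settled on; Sentinel, NOT_SPECIFIED, NO_NEW_DATA and describe are hypothetical names.

```python
import enum

# Named aliases: readable, but all of them are the very same object.
NotSpecified = None
NoNewData = None


class Sentinel(enum.Enum):
    """Distinct marker objects, one per role (hypothetical names)."""

    NOT_SPECIFIED = enum.auto()
    NO_NEW_DATA = enum.auto()


NOT_SPECIFIED = Sentinel.NOT_SPECIFIED
NO_NEW_DATA = Sentinel.NO_NEW_DATA


def describe(value=NOT_SPECIFIED):
    """Tell apart 'argument omitted', 'no new data', and real data (even None)."""
    if value is NOT_SPECIFIED:
        return "not specified"
    if value is NO_NEW_DATA:
        return "no new data"
    return f"data: {value!r}"


aliases_are_identical = NotSpecified is NoNewData
omitted = describe()
empty = describe(NO_NEW_DATA)
real_none = describe(None)
```

With the aliases, a check such as `x is NoNewData` also matches NotSpecified; with distinct objects, a "real" None can pass through as ordinary data.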

thorwhalen commented 2 years ago

And about the source reader: yes, your helper solution smells right. It leaves the classes explicit, pure, and low-level, while allowing helper functions to be more flexible.

Let's do that, and make sure that

It should also be clear what the tradeoffs are. They're always the same: Making things more explicit allows for one to reach higher levels of optimality and safety, but more flexible ("Postel") interfaces allow for easier development (at least in initial stages).

andeaseme commented 2 years ago

mk_stream_buffer has been added.