Closed spenczar closed 4 years ago
This gets used to provide a way to update progress bars and alike.
The logic is as follows:
HEARTBEAT_SENTINEL
so the upstream routines know what was returned isn't a message but a notification that the wait timed out: https://github.com/astronomy-commons/genesis-client/blob/master/genesis/streaming.py#L247_filered_stream
should then return None
if the message is a HEARTBEAT_SENTINEL
, but I now notice it's missing an else
clause to actually do it: https://github.com/astronomy-commons/genesis-client/blob/9a39a46f6ae724d2bd00d1ab5e4d10819e472ddc/genesis/streaming.py#L261
I think this is a bug that got introduced when we were introducing the ability to return message metadata. See what it used to look like here: https://github.com/astronomy-commons/genesis-client/blob/6a8348a5504a281709ca18f40a303038701556c4/genesis/streaming.py#L207None
, and if it is, it's ignored except for allowing the progress bar to update. This also got broken in the rewrite -- see here: https://github.com/astronomy-commons/genesis-client/blob/6a8348a5504a281709ca18f40a303038701556c4/genesis/streaming.py#L268
for how it should work (calling t.update(0)
on tqdm
makes it update the number of seconds it's been running).Maybe a better subject for this Issue would be to restore progress bar functionality?
I'd like to design a better progress bar API than calling tqdm
directly. This feels like the wrong direction for the dependency to flow - we make a lot of assumptions by directly calling tqdm. The caller could provide us with something that we should update with progress information; that could be done with something other than the heartbeat-through-the-iterator design here, maybe?
That's based on my understanding that this library is relatively low-level. The caller will be other code like SCiMMA's hop
- that would be the place where tqdm would be chosen.
I should add - thanks a lot for the walkthrough! I think step 1 is still to remove the heartbeat, and then step 2 is to redesign progress bar stuff, possibly adding heartbeats back in (but hopefully not?).
Sounds good. The progress API does need to be generalized -- the tqdm
stuff was an expedient hack (so was the shameful hardcoding of many constants -- e.g., 1 second for timeout in c.consume(...)
, etc.).
That said, making it easy to show progress makes for a world of difference in usability (even for a relatively low-level library). We should brainstorm how to get a good generalized equivalent of for msg in stream(progress=True): ...
, but there should be one.
This was done in the v1.0.0
release.
The HEARTBEAT_SENTINEL logic doesn't really do anything right now. It all happens in pure-python, and doesn't reflect any health state about the Kafka broker. I think it can be removed entirely.