PanDAWMS / dkb

Data Knowledge Base for HENP experiments
0 stars 2 forks source link

Rework message-handling methods in InputStream and Consumer. #430

Closed Evildoor closed 3 years ago

Evildoor commented 4 years ago

Both InputStream and Consumer perform a similar sequence of actions with messages. However, it is done in a contradictory and confusing manner: in Consumer next() is called from the "upper" levels and calls other methods that perform actual work, while in the InputStream next() is a method that performs work and is called by get_message() which, in turn, is called from "above". This will be resolved by standardizing the methods in the following way (each method calls the previous one):

  1. get_raw_item(): get an item from "lower" level.
  2. get_item(): process the item.
  3. next(): return the item.

Consumer has to be changed in this regard as well - however, without batch processing implemented, it would be simply a matter of changing some names and adding get_item() that does nothing apart from return get_raw_item(). Therefore, I decided to do this later, together with batch processing (out of discussion of which this whole idea arose).

Evildoor commented 3 years ago

The suggested commit is a bit difficult to read, please split it somehow... for example:

  • move "get_raw_item" functionality into a separate method";
  • switch "get_message"/"next" methods;
  • rename "get_message" into "get_item".

Done.

The PR mentions both InputStream and Consumer classes, so it's better to bring the repo to the point where these two are consistent and obey same logic. So, despite the triviality of changes on the Consumer side, please do the second part here as well. Or at least reword the commit(s?) and PR description so that one didn't end up with a feeling that it's just a part of something bigger, and that this "bigger" was left unfinished for whatever reason ;)

Fair enough, updated Consumer.

You decided not to split <message>=<raw_data><EOM> into two raw items (<data> and <EOM>) here, right? Is it planned to be done later, or will be left as-is (which is fine, but, as it was discussed in the batch processing proposal, leads to a mixed logic in the future -- while the PR states that it's fighting against "confusing" things, so...) It can as well be done in a separate PR, but my point here is that "bringing order into message-reading part" includes all these changes, aimed to formalize "items", their usage and handling -- so it all could be done in the same PR. Or, again, the general issue of "contradictory and confusing manner" shouldn't be stated as the (single and standalone) PR's motivation, since it's not fully solved here...

Yes, the logic of markers and messages will be changed elsewhere, be it a separate PR or PR about batch processing. Both terms - "contradictory and confusing manner" and "bringing order" intended to combat it - are rather broad, but the idea of this PR was to deal specifically with two different directions of method hierarchy. Updated PR description to better reflect this.

Changes requested: ...

  • reword method's descriptions (see suggestions below, mb reword them):

    • "message:" -> "raw/stream item",
    • what should be taken as "an item" in each case,
    • "convert" -> "get" or "produce", etc.

Done.

Evildoor commented 3 years ago

Several force-pushes to properly organize commits and add scopes.

Evildoor commented 3 years ago

@mgolosova, suggestions were answered, scopes added, version updated. Therefore, unless I'm forgetting something, the PR is finally ready to be reviewed again.

mgolosova commented 3 years ago

@Evildoor, everything looks alright, so -- approved and merged. Thank you!