astronomy-commons / adc-streaming

Core Python(ic) client libraries and utilities for alert streaming
BSD 3-Clause "New" or "Revised" License
6 stars 6 forks source link

Split reading and writing into two different classes #21

Closed spenczar closed 4 years ago

spenczar commented 4 years ago

Today, genesis.streaming pretty much provides one class: AlertBroker. To use one for writing, you pass a string which contains a"w" character in to the mode argument when constructing it; an "r" sets it up for reading.

Using one class for both reading and writing makes the internal code more complex and confusing, inviting bugs. It doesn't provide value for users, who will virtually always want to either read or write to a stream. And it makes the API more confusing for users.

Examples of the complexity and confusion:

I think that separate Publisher and Subscriber classes would make a lot more sense, here.

myNameIsPatrick commented 4 years ago

I agree, it would make sense to have separate classes for reading and writing. I would also argue that we should separate it further into Kafka-based Producer and Consumer classes which give us the added functionality, and wrap those in the Publisher and Subscriber classes which are more general.

My reasoning for wanting the Producer and Consumer classes as well is that we're extending the current functionality of the stock confluent-kafka classes with a few things:

Consumer:

Producer:

The serializer/deserializer functionality is more debatable whether it's value added at this stage, but the rest could be folded in extending the stock client.

I kind of like the open() functionality though, and like the parallels to the built-in open(), even if in this case the particular mode just routes to the Producer and Subscriber classes. I don't know if it's an appropriate abstraction though.

mjuric commented 4 years ago

+1 on the proposed split, but I'd advocate for keeping the file-like object veneer on top and as the external interface.

My thinking is that the "stream is an infinite file of records" abstraction is easy to get from the library users' PoV, and makes it easy for newcomers to jump into development. I.e., something like:

with open("kafka://ztf.uw.edu", "r") as stream:
    for msg from stream:
        ...do something...

comes naturally to one.

I'm happy with not supporting open(..., mode="rw") -- that was me being overly zealous when I hacked the write portion together and noticed it could be done :).

spenczar commented 4 years ago

@myNameIsPatrick I really like that list of features, thank you! I think that wrapping into Kafka-specific vs general ones is probably more than we need to do right now. We won't know what's general until we have another backend to support. Once we do, then we can provide the generalizing superclass, but if we try to abstract prematurely, we'll probably just get it wrong.

@mjuric I do think that the file-like open is really nice as a convenient constructor, yes. I like that it uses the protocol field of the URL, too, so if we do have different backends, it can just provide the right sort of stream - a amqp:// one, or a vo:// one, or whatever.