ezmsg-org / ezmsg

Pure-Python DAG-based high-performance SHM-backed pub-sub and multi-processing pattern
https://ezmsg.readthedocs.io/en/latest/
MIT License

Enhancement (Maybe bugfix?): Republisher support for OutputStreams in Collections #11

Open griffinmilsap opened 1 year ago

griffinmilsap commented 1 year ago

Now that we have a 1:1 correspondence between Publishers and OutputStreams in Units, and the ability to set Publisher specifics from the OutputStream initializer, it is unintuitive and surprising that OutputStreams in Collections do not result in the creation of a Publisher that another process could subscribe to. In particular, if a user has an OutputStream as part of a Collection that they want to expose on a public port (e.g. by defining the host or port keyword arguments for the Collection OutputStream), no Publisher is currently created, which is confusing behavior at best (and maybe a bug at worst). Publishers and Subscribers are only created when an OutputStream or InputStream is defined as part of a Unit.

Adding a "Republisher" Unit at the Collection level for each Collection OutputStream could alleviate this confusing behavior, but at the expense of occupying more sockets and degrading performance. A Republisher decorator or hint of some sort, letting a user explicitly declare when an OutputStream is republishing a topic, could be a beneficial feature addition, as would a warning when it appears that a user is attempting to interact directly with Collection-level Streams. Currently, this is a bigger issue with OutputStreams because Subscriber objects have no configurable keyword arguments, but that's subject to change (future stream type enforcement/message filtering?), so it may be that republishers are required for all Collection streams (performance regressions be damned).
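To make the tradeoff concrete, here is a rough sketch of what a republisher does. It uses plain `asyncio` queues rather than ezmsg itself, so `republish`, `demo`, and `run_demo` are hypothetical names and the queues only stand in for an internal topic and a publicly exposed publisher. Every message takes one extra hop through the forwarding task, which is where the added socket and latency cost would come from.

```python
import asyncio
from typing import Any, List, Tuple


async def republish(upstream: asyncio.Queue, downstream: asyncio.Queue) -> int:
    """Forward messages from upstream to downstream until a None sentinel.

    Hypothetical stand-in for a Republisher Unit: it subscribes to an
    internal topic and re-emits each message on its own output.
    """
    forwarded = 0
    while True:
        msg = await upstream.get()
        if msg is None:
            # Propagate the sentinel so downstream readers can stop too.
            await downstream.put(None)
            return forwarded
        await downstream.put(msg)  # the extra hop every message pays for
        forwarded += 1


async def demo(messages: List[Any]) -> Tuple[int, List[Any]]:
    internal: asyncio.Queue = asyncio.Queue()  # stands in for the internal topic
    public: asyncio.Queue = asyncio.Queue()    # stands in for the exposed publisher
    task = asyncio.create_task(republish(internal, public))

    for m in messages:
        await internal.put(m)
    await internal.put(None)

    forwarded = await task
    received = []
    while True:
        msg = await public.get()
        if msg is None:
            break
        received.append(msg)
    return forwarded, received


def run_demo(messages: List[Any]) -> Tuple[int, List[Any]]:
    return asyncio.run(demo(messages))
```

Calling `run_demo` with a short list of messages returns the number of forwarded messages along with the messages as seen on the "public" side, in order.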

griffinmilsap commented 1 year ago

@pperanich and @hannahgooden; I'd appreciate your thoughts on this when you get a chance.

griffinmilsap commented 1 year ago

lol oops -- reopening

griffinmilsap commented 1 year ago

Another possibility I'm entertaining is that Collections could have InputTopics and OutputTopics in addition to InputStreams and OutputStreams. If Streams are defined, a republisher is associated with them (with the associated functionality throughout), but the current behavior is maintained if only a Topic is defined. Topics would take no arguments and just act as topics in the graph, with zero performance impact for composition.
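A toy sketch of how that split might be resolved when the graph is built. Nothing here is ezmsg API: `ToyOutputTopic`, `ToyOutputStream`, and `plan_republishers` are hypothetical names, and the port number in the usage note is arbitrary. The assumption is simply that only Stream declarations get a republisher, while Topic declarations stay pure graph aliases.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass(frozen=True)
class ToyOutputTopic:
    """Hypothetical: a pure graph alias that takes no arguments."""


@dataclass(frozen=True)
class ToyOutputStream:
    """Hypothetical: a Collection-level stream carrying publisher settings."""
    host: Optional[str] = None
    port: Optional[int] = None


Declaration = Union[ToyOutputTopic, ToyOutputStream]


def plan_republishers(outputs: Dict[str, Declaration]) -> List[str]:
    """Return the names of Collection outputs that would need a republisher.

    Topics are skipped entirely, so they add no sockets and no extra hop.
    """
    return sorted(
        name for name, decl in outputs.items()
        if isinstance(decl, ToyOutputStream)
    )
```

For example, `plan_republishers({"RAW": ToyOutputTopic(), "PUBLIC": ToyOutputStream(port=25980)})` would select only `"PUBLIC"`, leaving `"RAW"` with today's zero-cost behavior.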