FactomProject / factomd

Factom Daemon
https://www.factomprotocol.org/

Review pubsub architecture #1016

Closed PaulBernier closed 3 years ago

PaulBernier commented 4 years ago

Should we publish the current pubsub module with Wax or address some of the concerns @WhoSoup had?

Main discussion around pubsub module architecture between @WhoSoup @stackdump @Emyrk

Who

Is there a reason the pubsub stuff is so incredibly complex? It feels over-engineered, where a simpler approach with a much easier syntax would work: something like a global registry for paths, where anyone can publish to it, and you create readers like a single reader, or a round-robin/threaded reader with x instances.

The readme mentions there's a need for more complexity, but I'm not sure what a use case would be that doesn't work with a simple approach.

Matt York

I don't have a great answer to that - would love to see an alternative implementation. One use case was to attempt to provide a common mechanism to dispatch n events to m consumers. Also, looking forward, the plan was to be able to scale out - first with goroutine "threads".

Then perhaps multi-process with the next iteration. One thing we were thinking about is how to re-balance shards dynamically.

Who

I wrote a quick project of what I thought it would be like: https://github.com/WhoSoup/factom-pubsub. There are some examples of usage, but basically you have channels, and every channel has exactly one reader and one writer. You create the channel, then register it with a path (though that's not strictly necessary). A local channel is just a Go channel, which would be expanded later with a "remote channel". Writing is always thread safe. In order to get multiple readers, you wrap the channel reader in a round-robin reader/multi-reader when needed. It's not completely fleshed out yet, but it should be enough for insight and commentary.
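For illustration, a minimal sketch of the kind of API being described here - not the actual factom-pubsub code, and all names are made up: single-reader/single-writer channels, a global path registry, and multi-reader behaviour added by wrapping the reading end.

```go
package pubsub

import "sync"

// Channel is a single-reader/single-writer pipe. LocalChannel is just a Go
// channel; a "remote channel" implementation could be added later.
type Channel interface {
	Write(v interface{})
	Read() (interface{}, bool)
}

type LocalChannel struct{ c chan interface{} }

func NewLocalChannel(buffer int) *LocalChannel {
	return &LocalChannel{c: make(chan interface{}, buffer)}
}

func (l *LocalChannel) Write(v interface{})       { l.c <- v }
func (l *LocalChannel) Read() (interface{}, bool) { v, ok := <-l.c; return v, ok }
func (l *LocalChannel) Close()                    { close(l.c) }

// Registry is the global path -> channel lookup.
type Registry struct {
	mu    sync.RWMutex
	paths map[string]Channel
}

func NewRegistry() *Registry { return &Registry{paths: map[string]Channel{}} }

func (r *Registry) Register(path string, ch Channel) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.paths[path] = ch
}

func (r *Registry) Get(path string) (Channel, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	ch, ok := r.paths[path]
	return ch, ok
}

// RoundRobinReader wraps a channel's reading end and fans messages out to n
// worker channels in turn, so the writer never has to know about it.
func RoundRobinReader(src Channel, n int) []*LocalChannel {
	outs := make([]*LocalChannel, n)
	for i := range outs {
		outs[i] = NewLocalChannel(8)
	}
	go func() {
		for i := 0; ; i++ {
			v, ok := src.Read()
			if !ok {
				for _, o := range outs {
					o.Close()
				}
				return
			}
			outs[i%n].Write(v)
		}
	}()
	return outs
}
```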

Matt York

It was built to be overly versatile, IMO. @Emyrk built out the module before we needed to use it - a chicken/egg problem.

The core part of the design that seems to be understood: the intention was to decouple modules while refactoring for clarity.

And to make sure that the pub/sub design aligns with the sharding design. It's newly introduced in Wax and has never been deployed - a low-risk rip and replace, as the whole app needs it to work.

Emyrk

Who: The reason we added more complexity was to create standard ways to handle consuming from the pub/sub.

The thing about creating a simple pubsub where everything is pretty much just a channel is that each subscriber and publisher likely has to implement the same logic as many other pubs and subs. A simple example we came up with was called a value subscriber. This is a subscriber that doesn't really care about updates and only cares about the current value. This is something like the API and the control panel, which just care about the current height when they are queried, and therefore really just want a variable with a Get() call. Some others we came up with: Pubs
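A rough sketch of that value-subscriber idea - the names and types here are illustrative only, not the actual factomd pubsub code: each publish overwrites the stored value, and consumers such as the API or control panel just call Get().

```go
package main

import (
	"fmt"
	"sync"
)

// ValueSubscriber keeps only the most recently published value.
type ValueSubscriber struct {
	mu  sync.RWMutex
	val interface{}
}

// Receive would be hooked up to the publisher; every publish overwrites
// the stored value, history is deliberately discarded.
func (v *ValueSubscriber) Receive(msg interface{}) {
	v.mu.Lock()
	v.val = msg
	v.mu.Unlock()
}

// Get returns whatever was published last, e.g. the current block height.
func (v *ValueSubscriber) Get() interface{} {
	v.mu.RLock()
	defer v.mu.RUnlock()
	return v.val
}

func main() {
	sub := &ValueSubscriber{}
	sub.Receive(uint32(215000)) // publisher side pushes a new height
	fmt.Println("current height:", sub.Get())
}
```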

We noticed in discussions that there are numerous ways to handle both the publishing and subscribing sides of the pubsub, and if we go the simple channel route, we'd be re-implementing the same logic in every package. So I came up with the thought that the publisher only knows its side of the configuration, and the subscriber only knows theirs. Each half can implement whatever logic it wants around the pubsub in the form of a wrapper. They just wrap the existing functionality with whatever they need, and this allowed for combinations of things. A subscriber could wrap the MsgFilter, the Context, and the hooks - or just choose two, or just one. I didn't know how many "types of subs/pubs" we were going to need, so I tried to create a system of building blocks where you can build what you need.
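A hedged sketch of that wrapper/building-block composition, with hypothetical names: a filter wrapper and a hook wrapper each wrap the same Read() interface, so they can be used alone or stacked in any order.

```go
package main

import "fmt"

// Subscriber is the common Read() interface; wrappers embed another
// Subscriber and each add one behaviour.
type Subscriber interface {
	Read() interface{}
}

// ChannelSubscriber is the plain building block.
type ChannelSubscriber struct{ C chan interface{} }

func (s *ChannelSubscriber) Read() interface{} { return <-s.C }

// FilterSubscriber drops messages that fail a predicate (a stand-in for a
// MsgFilter-style wrapper).
type FilterSubscriber struct {
	Subscriber
	Keep func(interface{}) bool
}

func (f *FilterSubscriber) Read() interface{} {
	for {
		if v := f.Subscriber.Read(); f.Keep(v) {
			return v
		}
	}
}

// HookSubscriber runs a callback on every message it passes through,
// e.g. for logging or metrics.
type HookSubscriber struct {
	Subscriber
	Hook func(interface{})
}

func (h *HookSubscriber) Read() interface{} {
	v := h.Subscriber.Read()
	h.Hook(v)
	return v
}

func main() {
	base := &ChannelSubscriber{C: make(chan interface{}, 4)}
	// Stack two wrappers; either could also be used on its own.
	sub := &HookSubscriber{
		Subscriber: &FilterSubscriber{
			Subscriber: base,
			Keep:       func(v interface{}) bool { return v != "noise" },
		},
		Hook: func(v interface{}) { fmt.Println("saw:", v) },
	}
	base.C <- "noise"
	base.C <- "payload"
	_ = sub.Read() // prints "saw: payload"
}
```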

Emyrk

Now this was all built mostly before any modules were built, and there was another large reason why the wrapper approach was chosen. At the time (before we wrote pubsub and most modules), we wanted to have static type checking for our pubsub. It would be ideal if the pubsub types could be checked at compile time to make sure someone doesn't stick the wrong type down the pub or expect the wrong type on the sub. Everything in the pubsub was interface{}. So we had the idea that you would just wrap your endpoints to enforce the correct usage. This had problems and was something we were trying to figure out before development on Wax went stale.
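One way that "wrap your endpoints for type safety" idea can be pictured (illustrative names only; with modern Go generics this would be a type parameter instead): the core only carries interface{}, and a thin typed wrapper is what a module actually holds, so the wrong type cannot be pushed at the call site.

```go
package main

import "fmt"

// Publisher is the generic, untyped core.
type Publisher struct{ C chan interface{} }

func (p *Publisher) Write(v interface{}) { p.C <- v }

// HeightPublisher is the typed endpoint a module would hold.
type HeightPublisher struct{ pub *Publisher }

func (h *HeightPublisher) Write(height uint32) { h.pub.Write(height) }

// HeightSubscriber does the matching type assertion on the way out.
type HeightSubscriber struct{ C chan interface{} }

func (h *HeightSubscriber) Read() uint32 { return (<-h.C).(uint32) }

func main() {
	core := &Publisher{C: make(chan interface{}, 1)}
	pub := &HeightPublisher{pub: core}
	sub := &HeightSubscriber{C: core.C}

	pub.Write(42) // pub.Write("42") would not compile
	fmt.Println(sub.Read())
}
```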

If you rewrite the pubsub to only allow for one type of publisher/subscriber, you are just going to have to implement everything we did, specific to each use case, in the module itself. The round robin is not sufficient, as sometimes you want a round robin based on msg hash. So either the general publisher needs to know how to handle it, or the publisher needs to expose the individual subscribers for the module to do the round robin itself. We wanted a standard interface, "Write()" and "Read()", and to have all the functionality defined elsewhere. That is why the building-block approach was chosen, and the "complexity" is the actual building of the pub/sub you need.
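A small sketch of the "round robin based on msg hash" case, with invented names: the publisher picks the worker from a hash of the message rather than by rotation, so the same message always lands on the same worker.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// MsgSplitPublisher routes each message to a worker chosen by hashing the
// message bytes. Purely an illustration of the idea.
type MsgSplitPublisher struct {
	Subs []chan []byte // one channel per worker/subscriber
}

func (p *MsgSplitPublisher) Write(msg []byte) {
	h := fnv.New32a()
	h.Write(msg)
	idx := h.Sum32() % uint32(len(p.Subs))
	p.Subs[idx] <- msg
}

func main() {
	p := &MsgSplitPublisher{Subs: []chan []byte{
		make(chan []byte, 1),
		make(chan []byte, 1),
	}}
	p.Write([]byte("some message"))
	for i, c := range p.Subs {
		select {
		case m := <-c:
			fmt.Printf("worker %d got %q\n", i, m)
		default:
		}
	}
}
```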

I did anticipate the need for more pub/sub types, but I could have been wrong. I am unsure, as we did not implement a huge number of modules. I would also point out that most modules are not done, and were implemented so we could "get things working" to implement the modules... Ok, rephrased: when we started to make X, we had to stub out so many endpoints that it wasn't very indicative of how things would work. So we started to flesh out the stubs, but they were made at different times, and hence the inconsistencies. The whole thing was very chicken-and-egg.

Who

First of all, thank you very much for expanding on the subject, and I agree, it's not as easy as it first looked. I've had the weekend to think and spent today looking at how deeply integrated the pubsub module already is. I agree on the type-checked endpoints, and if I were to rewrite the implementation, that would still be a part of it. If Go already had generics implemented, this would be so much easier...

Are there any concrete use cases for things like the roundrobin/msgsplit channels? I assume it was something like VMs / database sharding, but I think that's going to run into issues once you try to implement the pubsub over the network. You need an additional management layer in between to manage the shards themselves. For example, if you have one VM per shard and run a msgsplit, you'd need to identify each remote connection and have a 1:1 association with the right index in the msgsplit. If the network dies and all connections are rebuilt, you have significant code overhead to once again correlate the right shard with the right msgsplit index.

It feels like the approach of specifying on the publisher end what type of distribution it is will create more problems than it solves down the road. For something like a msgsplit channel, it would make much more sense to have a "ShardManager" that is subscribed to the "/Msg" channel and then manages the shards by opening a new pub/sub channel for each shard (i.e. /Msg/Shard1, /Msg/Shard2, /Msg/Shard3). Then each individual shard channel is independent and can have mixed local/remote types without needing to implement mixed types in the pubsub module.

As it stands, it works fine for an all-local node, but if it has to be rewritten anyway to manage remote connections, I'd rather do it right the first time. For TCP, you'll only ever have 1:1 connections, so it's a reasonable restriction to build pubsub around being 1:1 as well in terms of carrying data. If you have something like a worker pool, where you are fed data and want to process it in parallel, it doesn't make sense for the publisher to be the one to bear that responsibility; you can tack it onto the reader.

There is also some really unwieldy code, such as https://github.com/FactomProject/factomd/blob/f8842079a4c464755cca5f16aaf7bfa428bef9ae/modules/pubsub/registry.go#L112-L126 which seems to guarantee that you can't have a multiwrapper somewhere in the middle of a wrapper chain, which IMO also makes more sense on the subscriber end.
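A minimal sketch of that ShardManager arrangement, with hypothetical names: the pubsub layer stays a plain path registry, and a separate manager subscribes to "/Msg" and republishes each message onto an ordinary per-shard path.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// Bus is a trivial path -> channel registry standing in for the pubsub layer.
type Bus struct {
	mu    sync.Mutex
	paths map[string]chan []byte
}

func NewBus() *Bus { return &Bus{paths: map[string]chan []byte{}} }

func (b *Bus) Channel(path string) chan []byte {
	b.mu.Lock()
	defer b.mu.Unlock()
	if _, ok := b.paths[path]; !ok {
		b.paths[path] = make(chan []byte, 16)
	}
	return b.paths[path]
}

// ShardManager owns the shard assignment; the pubsub layer itself knows
// nothing about shards, local vs remote, or distribution policy.
type ShardManager struct {
	bus    *Bus
	shards int
}

func (s *ShardManager) Run() {
	in := s.bus.Channel("/Msg")
	for msg := range in {
		h := fnv.New32a()
		h.Write(msg)
		shard := h.Sum32() % uint32(s.shards)
		s.bus.Channel(fmt.Sprintf("/Msg/Shard%d", shard)) <- msg
	}
}

func main() {
	bus := NewBus()
	go (&ShardManager{bus: bus, shards: 3}).Run()

	bus.Channel("/Msg") <- []byte("hello")

	// A shard worker would normally range over its own path; here we just
	// wait for the message to appear on whichever shard it was routed to.
	select {
	case m := <-bus.Channel("/Msg/Shard0"):
		fmt.Printf("shard 0 got %q\n", m)
	case m := <-bus.Channel("/Msg/Shard1"):
		fmt.Printf("shard 1 got %q\n", m)
	case m := <-bus.Channel("/Msg/Shard2"):
		fmt.Printf("shard 2 got %q\n", m)
	}
}
```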

Who

(Also, generally, it would be better to base pubsub on BinaryMarshallable rather than interface{}, though I suppose there might be some pubsubs that will never be networked.)
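A sketch of what constraining the bus to marshallable payloads could look like; the interface below is defined locally so the example is self-contained and is only assumed to mirror the shape of factomd's BinaryMarshallable.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// BinaryMarshallable is a local stand-in: anything published must be able
// to serialize itself, which is what a networked transport would need.
type BinaryMarshallable interface {
	MarshalBinary() ([]byte, error)
	UnmarshalBinary(data []byte) error
}

// Heartbeat is a toy payload implementing the interface.
type Heartbeat struct{ Height uint32 }

func (h *Heartbeat) MarshalBinary() ([]byte, error) {
	buf := make([]byte, 4)
	binary.BigEndian.PutUint32(buf, h.Height)
	return buf, nil
}

func (h *Heartbeat) UnmarshalBinary(data []byte) error {
	if len(data) < 4 {
		return fmt.Errorf("short buffer")
	}
	h.Height = binary.BigEndian.Uint32(data)
	return nil
}

// Publish accepts only marshallable messages; a remote transport could ship
// the marshalled bytes, while a local one passes the value straight through.
func Publish(ch chan BinaryMarshallable, msg BinaryMarshallable) {
	ch <- msg
}

func main() {
	ch := make(chan BinaryMarshallable, 1)
	Publish(ch, &Heartbeat{Height: 100})
	raw, _ := (<-ch).MarshalBinary()
	fmt.Printf("wire bytes: %x\n", raw)
}
```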

Matt York

We intentionally didn't make the leap to networked pub/sub because it was generally believed this design would lead to better efficiency for a single Go process (standalone follower, no sharding).

I've never really seen conclusive profiling that affirms this - and we've had other conversations about what might be holding performance back (on Who's blog post).

Emyrk

Who: The msg split was being used to filter for the dependent holding queue and a validation layer. So the use cases we were addressing didn't actually care which data they received, as dependencies were handled as other pubsub subscriptions. So the MsgSplit was not for sharding, but purely for horizontal scaling of our validation and holding logic.

As for that example line, yeah, the wrapper stuff started to become a bit more challenging to maintain than I had anticipated, and that was a result. The issue with arbitrary wrappers is ensuring they all "play well" together. I was taking inspiration from a message bus system called DDS when I was thinking about this problem. The pubsub system in general was assumed to be easy, but after playing with it, I don't think it is easy - especially as we go over networks and have to deal with those headaches.

Who: Oh, another example of code that threw me off was https://github.com/FactomProject/factomd/blob/6ec9a6218af787b86821f6df3b1f5f85497a504a/modules/pubsub/examples/threadpool/main.go#L37-L39 -- it only shows up once in that example, but that's something that also stems from the front-managed distribution, and that makes sense; in that use case there's no need for management.

PaulBernier commented 3 years ago

PubSub has been discarded with Wax