Closed Wondertan closed 2 years ago
Is this actually necessary? The msgID function has access to the message and can look at the topic. So the desired functionality is already there.
Note that there is no inherent need for ugly switching, one can use a map with a readonable default.
Imagine there are two completely different protocols relying on PubSub. One of them needs custom msg id generation and another one is not. The application which relies on these two protocols and creates a PubSub instance would need to be aware of the protocol internals and import their msg types to do the switch. So the ugliness comes not from the switch itself but from the ugly architecture requiring the application to know the internals. Sorry for the confusion.
The point is to allow msg id generation to be application/protocol/layer specific.
In my case, I need to exclude one non-deterministic field of a structure from id generation with blake2b, but I don't want to have one global MsgIdFunction
importing all msg types from all the protocols to switch over them.
Well ok, but it complicates things.
How?
The right way to do it internally is to have one msgid function, which dsipatches on a map of funcs set potentially per topic and a default that can be overwritten as is now with the option.
this way the internal changes will be minimal, and we dont break existing applications.
The solution I proposed keeps the logic and API the same and still allows customization. If WithCustomMessageIDFunc
is not given for a topic, then the topic uses the global default or opted one, so nothing should break.
In other words,WithCustomMessageIDFunc
optionally overrides global MsgIdFunction
only for a specific topic.
That's not very elegant i think, it makes users reinvent the wheel and messy internals. We can do better.
Lets create a struct type that has a map and a default, with a RWMutex. It also has a method for computing the msgid, which caches and we pass it where we otherwise use the msgid func. Lets also add the topic option, which mutates the map with exclusive lock.
This way nothing changes in the external interface and on the same time we get a maximally flexible solution. The extant msgid option would set the default.
Btw, for future reference, it is preferrable to discuss design in an issue and reach consensus before opening the pr, so that you dont do unnecessary work.
I am in 👍🏻
We can do better.
Thank you for the detailed description. Qs:
map[string]map[*pb.Message]string
? // topic -> msg -> id
ID
field on the Message(wrapper). Is this bad and/or introduces messy internals
? IMHO, adding ID
to the Message(wrapper) is clean, but I might be missing something.processLoop
? Although in practice map mutates would happen only on application start as rarely topics are joined in the runtime, so this is not a concern.msgIDGen
as the name for the struct?Lets also add the topic option, which mutates the map with exclusive lock.
Agree and I wanted to write a proposed solution but it turned out to be easier to implement it instead of describing it in English for this case :)
Btw, for future reference, it is preferrable to discuss design in an issue and reach consensus before opening the pr, so that you dont do unnecessary work.
map[string]func(*pb.Message)string
. Basically a map of topic names to functions that compute their msgid. The entry point in the object will be a func (m *Message) string
method, which does the following:
*pb.Message
, caches and returns the result
MsgIDGenerator
? MsgIdGen
object to SetMsgID
and we are set.Agree and I wanted to write a proposed solution but it turned out to be easier to implement it instead of describing it in English for this case :)
Yeah, sometimes it is easier and that's totally fine, I just don't want you to do work that might be thrown away at design stage. Then again you learn the internals better by doing that, so it's not a total loss.
Note that the cache is new field we add on the Message wrapper struct. It doesn't have to be thread safe, as the first thing we do with the message before any concurrency is involved is compute its message id.
Also note that we should relinquish the rlock before invoking the compute function, so that we don't block other stuff while it is computing.
We still need to change MessageCache
a bit as func (m *Message) string
!= func (m *pb.Message) string
. This includes both Add
(to rcv a wrapper) and SetMsgIdFn
(to rcv a generator with wrapper as param) methods.
We don't ahve to change the code that touches the msgid at all actually, we just pass the method in our MsgIdGen object to SetMsgID and we are set.
This is also why I originally thought of the map with *pb.Message
, as we don't have access to the wrapper in most places.
Ah yes indeed, but thats a very small change.
2022-01-07: there is a linked PR (https://github.com/libp2p/go-libp2p-pubsub/pull/465 ) that we're trying to get across the line.
Pubsub only allows setting global
MsgIdFunction
, which applies to all the topics. Still, It would be useful to have the ability to set custom per topic msg id generation, which, e.g., can be dependant on the msg format of some topic. For this to be implemented using the globalMsgIdFunction
, it would need to be aware of all the msg types of each topic and do some ugly switching to determine whatMsgIdFunction
to use for what msg.Fortunately, an unused
TopicOpt
can set custom values while joining the topic, so some per topic customizations are conceived by design, and the described feature can be implemented.