Closed evanmcc closed 1 year ago
I guess monitoring could be done by connected nodes and gossiped to others when it happens?
I think the simplest thing to do would be to have a monitor function that can work either with a pid or with a {partisan_remote_reference, Node, {partisan_process_reference, PidAsList}}. If it is a regular pid, we could just use erlang:monitor/2. If it is a remote reference, we could cast a message off to the given Node along with the calling process's remote reference. On that Node, there would be a process (or group of processes) responsible for monitoring the local pid. Since we have the remote reference for the calling process, the down message can be forwarded to the original caller when needed.
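Sketched as code, that dispatch might look roughly like this. The tuple shape and the partisan_monitor process name come from this discussion; self_reference/0 and forward/2 are hypothetical stand-ins for however Partisan actually builds remote references and casts messages between nodes:

```erlang
-module(partisan_monitor_sketch).
-export([monitor/1]).

monitor(Pid) when is_pid(Pid) ->
    %% Local pid: plain Erlang monitoring works as usual.
    erlang:monitor(process, Pid);
monitor({partisan_remote_reference, Node,
         {partisan_process_reference, _PidAsList}} = Target) ->
    %% Remote reference: ask the partisan_monitor process on the target
    %% node to monitor on our behalf, passing our own remote reference
    %% so the DOWN message can be routed back to us.
    Caller = self_reference(),
    forward(Node, {monitor, Target, Caller}),
    Target.

%% Hypothetical helper: a remote reference for the calling process.
self_reference() ->
    {partisan_remote_reference, node(),
     {partisan_process_reference, pid_to_list(self())}}.

%% Hypothetical stand-in for Partisan's node-to-node cast.
forward(_Node, _Msg) ->
    ok.
```

The key point is that only the dispatch lives in the caller; everything else happens on the remote node.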
I am not sure how clear that is, so I will write it down step by step.
1. 0.0.100 on Node A has a remote reference for process 0.0.200 on Node B.
2. 0.0.100 on Node A calls monitor on the remote reference.
3. A remote reference for 0.0.100 on Node A gets generated and gets sent to the partisan_monitor process on Node B.
4. The partisan_monitor process starts monitoring 0.0.200 on Node B.
5. When 0.0.200 on Node B goes down, partisan_monitor forwards the down message using the remote reference of 0.0.100 on Node A.
6. 0.0.100 on Node A receives the down message and can act accordingly.

Hopefully that is a little more clear.
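The partisan_monitor side of steps 3–5 could be sketched as a small gen_server. This is only an illustration of the idea, not Partisan's implementation; forward/2 is again a hypothetical stand-in for the node-to-node messaging layer:

```erlang
-module(partisan_monitor).
-behaviour(gen_server).
-export([start_link/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

init([]) ->
    %% Map of erlang monitor ref => {monitored target, caller's remote ref}.
    {ok, #{}}.

%% Step 3/4: a remote caller asks us to monitor a local pid.
handle_cast({monitor,
             {partisan_remote_reference, _Node,
              {partisan_process_reference, PidAsList}} = Target,
             CallerRef}, Refs) ->
    Pid = list_to_pid(PidAsList),
    MRef = erlang:monitor(process, Pid),
    {noreply, Refs#{MRef => {Target, CallerRef}}};
handle_cast(_Msg, State) ->
    {noreply, State}.

%% Step 5: the local pid went down; route the DOWN back to the caller.
handle_info({'DOWN', MRef, process, _Pid, Reason}, Refs) ->
    case maps:take(MRef, Refs) of
        {{Target, CallerRef}, Rest} ->
            forward(CallerRef, {'DOWN', Target, process, Reason}),
            {noreply, Rest};
        error ->
            {noreply, Refs}
    end;
handle_info(_Info, State) ->
    {noreply, State}.

handle_call(_Req, _From, State) ->
    {reply, ok, State}.

%% Hypothetical stand-in for Partisan's node-to-node messaging.
forward(_CallerRef, _Msg) ->
    ok.
```

One open design question a real version would need to handle is cleaning up monitors when the remote caller itself dies or its node disconnects.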
If this makes some sense, I would gladly take some time to get something working.
In v5.0.0-beta I have re-implemented monitoring, leveraging the new connection handling, which offers fast checks, along with a new implementation of the on_up and on_down peer service callbacks.
It only works with the pluggable peer service manager, i.e. full mesh. I would love to come up with a design that works for HyParView soon.
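For context, registering those peer service callbacks might look roughly like the following; the exact module name, arities, and the shape of the callback fun are assumptions here and may differ in the released API:

```erlang
%% Hedged sketch (signatures assumed, not verified against v5.0.0-beta):
%% fire a callback when the connection to a given peer comes up or goes down.
ok = partisan_peer_service:on_up(Node,
        fun(N) -> logger:info("peer ~p is up", [N]) end),
ok = partisan_peer_service:on_down(Node,
        fun(N) -> logger:info("peer ~p is down", [N]) end).
```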
So I will close this issue.
In order for the call emulation in #44 to work, and more generally for partisan to act as a full-featured disterl replacement (see #42), we'll need to add remote monitoring. A good design for this doesn't really spring to mind, so I am looking for feedback here.
My initial thought was just to add some monitoring metadata on top of the existing node-to-node data handling (it would work like hello, I guess?). But that can combine with remote node failures in a complicated way, so I need to read more code before I have any better fleshed-out ideas.