cowsql / raft

Asynchronous C implementation of the Raft consensus protocol
https://raft.readthedocs.io
Other
45 stars 6 forks source link

Allow uv_tcp_transport to be used for custom point-to-point communication #191

Open NorbertHeusser opened 2 weeks ago

NorbertHeusser commented 2 weeks ago

In our next use case it would be very helpful to be able to send data from one node to a dedicated other cluster member.

We could implement a redundant communication between the cluster members. But to my experience over the last 25 years of development in network and cluster applications it's always a nightmare to have redundant communication between the same component. As this would require additional listen-sockets and tcp transport this will get a nightmare ion regards of troubleshooting. E.g. the normal cluster communication is working, but the forwarding is blocked by a firewall. Or the other way round.

So our design proposal would be to extend the uv_tcp_transport by an additional custom request with a buffer data. And the application would need to register a callback function to receive the incoming messages. And we would need an additional function to send a custom request. So the data would be moved through the transport layer, but never reach the raft code itself. This would allow our application to send data to other nodes without the need to add a redundant transport layer (and get the potential problems described above).

We have already started implementing a feasibility study on our fork.

Any comments, idea or guidance welcome.

freeekanayaka commented 2 weeks ago

If you don't want to open an additional, I believe there's actually another option that you might want to consider (a technique which is used as well by cowsql).

Basically you could implement a custom raft_uv_transport which wraps/proxies the default TCP-based raft_uv_transport implementation that comes with the library (raft_uv_tcp_init()).

What you would do in your custom transport would be that your custom connect function sends an initial "handshake" (for example 8 bytes) that encodes which type of connection should be established: for example a certain code would mean to establish a regular Raft protocol connection, and another code would mean to establish an application-specific protocol connection.

Likewise, your custom listen function would first read the handshake bytes and the decide weather to hand off the connection to Raft or handle it as application connection.

This of course means that you'd have two separate connections (one for Raft one for the application), but they'd be on the same port.

Would that be acceptable for your use case?

NorbertHeusser commented 1 week ago

Sorry for the late response, was on a vacation last week.

Had time this week to look into the way you solved the problem in the cowsql and pla around a little bit more with the our approach to solve the problem. In general "hook" into the connect and listen function will mitigate some of the problems I initially mentioned (e.g. firewall blocking one of the ports) as it will use the same port. But we still would work with with two different tcp connections. And to my experience there are still lot of error scenarios in which one of the connections somehow stopped working and the other one still does. Additionally some of the general code to read/write to the socket would have to be reimplemented in a very similar way to the implementation in the raft library. In in general I don't like multiple implementations of similar code.

But considering in your own cowsql implementation you already had to solve the same issue like we have I would assume other users of the raft library will run into similar requirements. And therefore would ask you to reconsider, if a more basic approach inside the raft library would make sense. This would allow any user of the raft library to tunnel custom point-to-point messages through the existing transport.

I was able to extend our approach into full working example by extending the example server in the lib. If you would like to take a look. Hook into the start function works, but from a design perspective does not look great. I would prefer to extend the raft_uv_tcp_init or the raft_uv_init to provide an additional custom callback function, which would be invoked in case a custom message comes in. But this would only make sense, if there is a chance to merge the changes into the main fork. Otherwise I would prefer to stay with the current hook as this has minimal conflicts with your repo and will be easier to maintain in long term.

Will be offline tomorrow so no hurry to response before the weekend.

freeekanayaka commented 1 week ago

Hey,

yes, I agree that it's an issue that other consumers might face, and it would be good to have a common approach.

From a very high-level point of view, my feeling would be that it might be better if there was a convenient way to tunnel raft messages through the application's transport rather than the other way round. But I really didn't put much thought on it, and anyway it's probably something to be left for the long-term v1.0 API redesign.

I'll look into what you have put together so far, in principle I'm not opposed to merging something that makes the current v0.x API meet this use case, even if not entirely optimal.