build-trust / ockam

Orchestrate end-to-end encryption, cryptographic identities, mutual authentication, and authorization policies between distributed applications – at massive scale.
https://ockam.io
Apache License 2.0
4.47k stars 562 forks source link

Support TCP keepalive for TCP clients #3450

Open hairyhum opened 2 years ago

hairyhum commented 2 years ago

Current implementations of TCP clients need to rely on ad-hoc heartbeats in order to keep the TCP connection open.

Those heartbeats are implemented as empty Ockam Routing messages, which is not optimal both for encoding/decoding and for router monitoring.

We need to look into supporting more standard TCP keepalive mechanism.

Currently we have heartbeats in:

Instead we should configure TCP connection to use keepalives


We love helping new contributors! ❤️ If you have questions or need help as you explore, please join us on Discord. If you're looking for other issues to contribute to, please checkout our good first issues.

phillyphil91 commented 2 years ago

Hi. I would like to take a look at the TCP keepalive mechanism for the Rust implementation.

phillyphil91 commented 2 years ago

It looks like tokio no longer has the set_keepalive method: https://github.com/rust-lang/rust/issues/69774#issuecomment-1159294804 But apparently the socket2 crate has, which could be worth taking a look at.

phillyphil91 commented 2 years ago

I'm trying to wrap my head around this. So instead of using Ockam's heartbeats (in this case sending a TcpSendWorkerMsg::Heartbeat , you want to use the standard TCP keepalive mechanism which was provided by tokio::net::TcpStream::set_keepalive and is now provided by socket2::Socket::set_keepalive or possibly socket2::Socket::set_tcp_keepalive? So possibly use the aforementioned methods to enable keepalive on tokio::net::TcpStream which is used in the TcpSendWorker struct?

In general, what would be a good place to get some help/guidance on the code? The discussions section? I don't want to spam this issue here.

mrinalwadhwa commented 2 years ago

I'm trying to wrap my head around this. So instead of using Ockam's heartbeats (in this case sending a TcpSendWorkerMsg::Heartbeat , you want to use the standard TCP keepalive mechanism which was provided by tokio::net::TcpStream::set_keepalive and is now provided by socket2::Socket::set_keepalive or possibly socket2::Socket::set_tcp_keepalive? So possibly use the aforementioned methods to enable keepalive on tokio::net::TcpStream which is used in the TcpSendWorker struct?

Yes exactly!

In general, what would be a good place to get some help/guidance on the code? The discussions section? I don't want to spam this issue here.

We're not too particular. Both discussions and this issue are great. We'll try to help where possible.

phillyphil91 commented 2 years ago

I am trying to find out how TcpSendWorker is used, so here are some questions and (probably wrong) assumptions around the general flow of things in ockam_transport_tcp:

The entry point is the TcpTransport struct which calls TcpRouter::register which creates a new TcpRouter which itself in its function definition creates a TcpRouteHandle and then that gets assigned to route_handle attribute of TcpTransport.

When calling connect on TcpTransport this calls connect on the TcpRouteHandle but then I can't find how this relates to the handle_connect method of TcpRouter. There the TcpSendWorker is being used by I can't find any place that actually calls handle_connect, only that it will "Handle any [TcpRouterRequest::Connect] messages received by this nodes worker". I found the TcpRouterRequest enum but again not really how handle_connect and thus the TcpSendWorker is used.

I guess I am wondering if workers/senders.rs is the only place where a changes needs to happen? Also are there tests related to the TcpSendWorkerMsg::Heartbeat variant?

davidag commented 2 years ago

I've been doing some research on using keepalive in Elixir/Erlang and I'd like to share with you my findings and ask some questions:

What do you think?