vacp2p / research


feat: auto-generate SSL certs to enable operators to run WSS out of the box #139

Open fryorcraken opened 1 year ago

fryorcraken commented 1 year ago

Problem

Currently, js-waku only supports wss to connect to other nodes.

Work is in progress for:

However, they both come with limitations:

For a node operator to accept incoming WSS connections, they need:

  1. a domain that points to their node
  2. an SSL certificate for said domain

While (2) can be automated with letsencrypt, (1) is trickier as it means the operator needs to: a. acquire the domain b. set up the domain to point to their IP

This costs money, time and effort.

Finally, note that SSL certs are used to:

Libp2p, and hence Waku, has its own mechanism for this purpose:

Hence, in this use case, the SSL cert for WSS is only a technical barrier.

Suggested solution

Solution inspired by @Menduist https://discord.com/channels/864066763682218004/1019621534769352904/1022411714492375081

The idea would be for a DP (domain provider) to do (1) and (2) for the node operator:

  1. DP acquires a domain name, e.g., wakudomains.abc
  2. Alice is a node operator; she requests a certificate from the DP (new protocol).
  3. DP confirms Alice's IP and peer id.
  4. DP sets the DNS entry <peer-id>.wakudomains.abc to point to its own node, <peer-id> being a short encoding of Alice's peer id.
  5. DP runs letsencrypt on the DP node to get an SSL cert for <peer-id>.wakudomains.abc
  6. DP sets the DNS entry <peer-id>.wakudomains.abc to point to Alice's IP, e.g., 1.2.3.4
  7. DP sends the SSL cert and key to Alice.
  8. Alice uses the SSL cert with domain <peer-id>.wakudomains.abc to secure wss connections.
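The proposal does not say how the "short encoding" of the peer id in step 4 is built. One possible approach, shown here only as a sketch, is to hash the peer id and base32-encode the digest, since DNS names are case-insensitive and a label is capped at 63 characters. The function names, the label length and the example peer id are assumptions made for illustration, not part of the proposed protocol.

```python
import base64
import hashlib

DP_DOMAIN = "wakudomains.abc"  # example domain used in the proposal
LABEL_LEN = 32                 # hypothetical choice; DNS labels allow at most 63 chars


def short_label(peer_id: str, length: int = LABEL_LEN) -> str:
    """Derive a lowercase, DNS-safe label from a peer id (hypothetical scheme)."""
    digest = hashlib.sha256(peer_id.encode("ascii")).digest()
    # Base32 only uses A-Z and 2-7, so the lowercased form is a valid DNS label.
    encoded = base64.b32encode(digest).decode("ascii").rstrip("=").lower()
    return encoded[:length]


def fqdn_for(peer_id: str) -> str:
    """Build the per-node domain name the DP would register and certify."""
    return f"{short_label(peer_id)}.{DP_DOMAIN}"


if __name__ == "__main__":
    # Placeholder string, not a real peer id.
    print(fqdn_for("16Uiu2HAmExamplePeerIdForIllustrationOnly"))
```

Hashing keeps the label length fixed regardless of the peer id format, at the cost of the label no longer being decodable back into the peer id.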

Security considerations

Domain name censorship

State actors can censor domain names. A DP could also decide to take down a DNS entry.

mitigation:

Anyone can be a DP. Hence, we could recommend that node operators get certs from several DPs (and hence domains) to mitigate the risk of censorship.

DP could be incentivized when providing a domain to encourage a multitude of domain names to be used.

Domain redirected to another IP

DP could decide to redirect some records to their own nodes, to decrease Waku decentralization or coordinate a sybil attack.

mitigation: a node should connect using a multiaddr with both peer id and fqdn (i.e., ENR), hence this risk is moot as long as peer ids are kept in the peer discovery protocols.
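To illustrate why the peer id in the multiaddr makes a DNS redirect ineffective, here is a minimal sketch. It only shows the comparison between the expected and the remote peer id; in libp2p this check happens as part of the secure channel handshake, not in application code like this. The helper names and peer id strings are hypothetical.

```python
from typing import Optional


def wss_multiaddr(fqdn: str, peer_id: str, port: int = 443) -> str:
    """Build a multiaddr that carries both the domain name and the peer id."""
    return f"/dns4/{fqdn}/tcp/{port}/wss/p2p/{peer_id}"


def expected_peer_id(multiaddr: str) -> Optional[str]:
    """Return the peer id from a trailing /p2p/<peer-id> component, if present."""
    parts = multiaddr.strip("/").split("/")
    if len(parts) >= 2 and parts[-2] == "p2p":
        return parts[-1]
    return None


def accept_connection(multiaddr: str, remote_peer_id: str) -> bool:
    """Accept only if the remote proved the peer id named in the multiaddr."""
    expected = expected_peer_id(multiaddr)
    return expected is not None and expected == remote_peer_id


if __name__ == "__main__":
    addr = wss_multiaddr("abc.wakudomains.abc", "peerAlice")
    print(accept_connection(addr, "peerAlice"))    # Alice's own node
    print(accept_connection(addr, "peerMallory"))  # node behind a redirected DNS record
```

A DP that repoints the record to its own node cannot present Alice's peer id, so the dial fails rather than silently reaching the wrong node.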

DP banned by LetsEncrypt

LetsEncrypt rate limits need to be taken into account, as a DP should avoid being banned because it tries to get too many certs for LetsEncrypt's liking.

mitigation: Encourage several DPs; DPs set up mechanisms to avoid hitting rate limits.

Other comments

I believe the proposed protocol would work with current technology, without needing coordination with LetsEncrypt contrary to https://github.com/libp2p/go-libp2p/issues/1360

Menduist commented 1 year ago

Cramming thoughts here:

https://zerossl.com/ apparently lets users create a certificate for an IP, which would solve all of our troubles. Unfortunately, it seems to be a manual process to create an account, so not really "plug-and-play"

https://zerossl.com/documentation/api/create-certificate/

certificate_domains | [Required] Use this parameter to specify one or multiple comma-separated domains (or IP addresses) to include in your certificate.


DP could decide to redirect some records to their own nodes, to decrease Waku decentralization or coordinate a sybil attack

The connection would fail no matter what (since the PeerId in the multiaddress will be incorrect), so we don't really need bit


Otherwise, relevant for Let's Encrypt, the rate limits: https://letsencrypt.org/docs/rate-limits/

The main limit is Certificates per Registered Domain (50 per week)


If I go back to my first idea a bit, couldn't we do:

The protocol would be a lot simpler, though we would still hit Let's Encrypt rate limiting since everything is under the same domain. Might be worth looking at how "free dynamic dns" handle the Let's Encrypt rate limiting, I'm sure they have the same issue.


If we go back to your plan: since the limit is "number of certificates requested per domain", and you can cram multiple hostnames per certificate:

You can combine multiple hostnames into a single certificate, up to a limit of 100 Names per Certificate

If you batch request in advance, you could get 100 * 50 = 5000 domains / week

It means that multiple people will share the same certificate, but shouldn't be an issue
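The arithmetic above can be sketched as follows. The two limits are the figures quoted in this thread (100 names per certificate, 50 certificates per registered domain per week); they are assumptions that should be checked against the current Let's Encrypt documentation. The function names are made up for the example.

```python
from typing import List

NAMES_PER_CERT = 100  # limit quoted in the thread
CERTS_PER_WEEK = 50   # limit quoted in the thread


def weekly_name_capacity(names_per_cert: int = NAMES_PER_CERT,
                         certs_per_week: int = CERTS_PER_WEEK) -> int:
    """Upper bound on hostnames one registered domain can certify per week."""
    return names_per_cert * certs_per_week


def batch_hostnames(hostnames: List[str], size: int = NAMES_PER_CERT) -> List[List[str]]:
    """Split pending hostnames into groups that each fit into one certificate."""
    return [hostnames[i:i + size] for i in range(0, len(hostnames), size)]


if __name__ == "__main__":
    pending = [f"node{i}.wakudomains.abc" for i in range(250)]
    batches = batch_hostnames(pending)
    print(weekly_name_capacity(), [len(b) for b in batches])
```

Each batch maps to one certificate request, so 250 pending hostnames consume 3 of the 50 weekly certificates.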

Here, the real challenge will be, how to avoid "DDoS" on DP. We could limit to 1 certificate per IP, but still, getting IPs is pretty cheap nowadays (thank you NordVPN for sponsoring this message), so one could exhaust 5k domains quite easily in a week.

Menduist commented 1 year ago

Tried zerossl for my node: The process to create an account is straightforward. I was hoping to build a "mini certbot", that would discover your public IP with UPnP, create a certificate for it using the ZeroSSL api, and keep it up to date (renew every 90 days, create a new one if IP changes, etc)

Unfortunately, to verify your IP, you need to put a file at http://[yourip]/.well-known/pki-validation/[the file provided by ZeroSSL]. And on my router at least, I cannot create a UPnP binding for port 80; I have to create it manually.
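For reference, serving the validation file itself needs very little code. The sketch below only covers the HTTP side described above and does not call the ZeroSSL API; the file name and content are placeholders for whatever ZeroSSL provides. Binding to port 80 usually requires elevated privileges, and the port still has to be forwarded on the router.

```python
import http.server
import threading

VALIDATION_PREFIX = "/.well-known/pki-validation/"


def make_handler(file_name: str, file_body: bytes):
    """Create a handler that serves exactly one validation file."""

    class ValidationHandler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == VALIDATION_PREFIX + file_name:
                self.send_response(200)
                self.send_header("Content-Type", "text/plain")
                self.send_header("Content-Length", str(len(file_body)))
                self.end_headers()
                self.wfile.write(file_body)
            else:
                self.send_error(404)

        def log_message(self, format, *args):
            pass  # keep the sketch quiet

    return ValidationHandler


def serve_validation_file(file_name: str, file_body: bytes, port: int = 80):
    """Start a background HTTP server; call .shutdown() once validation is done."""
    handler = make_handler(file_name, file_body)
    server = http.server.HTTPServer(("0.0.0.0", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The server only needs to run until the IP has been verified, after which it can be shut down and port 80 closed again.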

So if we go the zerossl route, here is what the process would look like:

Why do we need a "small certbot"?

So all of this wouldn't be zero-conf, but still a lot easier than setting DNS, imo

fryorcraken commented 1 year ago

Alice can then request a certificate for [her_ip].somedomain.io or similar domain

Ah I see, yes makes sense but it means Alice needs to run certbot somehow (could be a script, etc). I felt it'd be easier to have most moving parts on the DP side.

Might be worth looking at how "free dynamic dns" handle the let's encrypt rate limiting, I'm sure they have the same issue

With a Dynamic DNS, you don't need a new certificate. You just need to update the DNS. Am I missing something?

If you batch request in advance, you could get 100 * 50 = 5000 domains / week It means that multiple people will share the same certificate, but shouldn't be an issue

That's good. A DP could just accumulate requests and batch them every ~4 hours (to stay under the limit of 50 certificates per week).

It does mean waiting 4 hours for Alice to have websocket enabled. She can still use her node in the meantime via tcp.

It means that multiple people will share the same certificate, but shouldn't be an issue

I agree, in our case wss brings no security, we could do with plain websocket if browsers would allow it (but they don't in a secure environment).

Here, the real challenge will be, how to avoid "DDoS" on DP.

Few thoughts:

  1. DP accumulates requests, up to 100, and submits them every 4 hours, so that rate limits are not hit. This means the risk is more about flooding the queue with spam requests than hitting rate limits.
  2. One request per IP sounds good (also one request per peer id).
  3. DP verifies that Alice is reachable, i.e., when Alice sends a request for a certificate, she needs to include her ip+ws multiaddr. The DP can then dial it. This helps:
     a. confirm that Alice is hosted on the IP (thanks to the peer id)
     b. confirm that the host is reachable (i.e., the port is open); this is not as easy when using a VPN (or is it?)
     c. Note: the idea would be for Alice to listen with ws until she gets the certs, so that it confirms her setup (port open) is correct
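The three points above could be combined into a small request queue on the DP side. This is only a sketch under stated assumptions: the class and method names are invented, and the `reachable` flag stands in for the result of the DP dialing Alice's ip+ws multiaddr, which is not implemented here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

MAX_NAMES_PER_CERT = 100  # limit quoted in the thread
BATCH_INTERVAL_HOURS = 4  # interval suggested in the thread
CERTS_PER_WEEK = (7 * 24) // BATCH_INTERVAL_HOURS  # 42, below the quoted limit of 50


@dataclass
class CertRequestQueue:
    """Hypothetical DP-side queue: one request per IP and per peer id."""
    seen_ips: set = field(default_factory=set)
    seen_peer_ids: set = field(default_factory=set)
    pending: List[Tuple[str, str]] = field(default_factory=list)

    def submit(self, ip: str, peer_id: str, reachable: bool) -> bool:
        """Queue a request; `reachable` is the outcome of the DP's dial."""
        if not reachable:
            return False  # Alice must be dialable on her ws multiaddr first
        if ip in self.seen_ips or peer_id in self.seen_peer_ids:
            return False  # one request per IP and one per peer id
        if len(self.pending) >= MAX_NAMES_PER_CERT:
            return False  # batch is full; retry after the next flush
        self.seen_ips.add(ip)
        self.seen_peer_ids.add(peer_id)
        self.pending.append((ip, peer_id))
        return True

    def flush(self) -> List[Tuple[str, str]]:
        """Called every BATCH_INTERVAL_HOURS; returns one certificate's worth of requests."""
        batch, self.pending = self.pending, []
        return batch
```

With a flush every 4 hours the DP requests at most 42 certificates per week, so spam can fill a batch but cannot push the DP over the rate limit.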

Note: NordVPN has 5567 servers so yes a DP could get spammed on the first week but it's still limited.

If with these mitigations, there are still DDOS issues then a DP may choose to only provide certs to RLN members.

Tried zerossl for my node:

Thank you for that.

Yes, I agree it looks great. It also reinforces my initial thoughts that most of the heavy lifting should be on the DP side so that the user does not have to open port 80, install certbot, etc. In the protocol I propose, the user only has to:

Then the rest could be handled entirely by the DP and nwaku (including writing ssl file locally and loading at next restart)