matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0

Anonymous Homeservers (Tor/I2P) #7088

Open jakehemmerle opened 4 years ago

jakehemmerle commented 4 years ago

Intro

Several people in the Matrix community, including myself, would love to see anonymous homeservers. It didn't seem appropriate to use either of the other I2P or Tor threads, since both had some awesome points by awesome people (looking at you @richvdh @cyphar @vsatmydynipnet @ekleog @ara4n). Big thanks to @ara4n for giving this a proofread before I posted!

Why this Tor/I2P thread

It seemed appropriate to start a new thread (and lock) for the following reasons and with the following hopes:

  1. Create a central thread for the discussion of Tor/I2P homeservers and DNS-addressed homeservers that federate to Tor/I2P homeservers, merging the I2P homeserver thread #5455 with the two Tor homeserver threads #5152 and #2111 (as each thread contained at least some useful and non-duplicate information that I have tried to consolidate in this post). Both protocols could theoretically be implemented, but for the sake of anonymous homeservers, we should just pick one.
  2. Come to a community decision about which anonymity network to use for homeservers and for what reasons
  3. Create a roadmap and/or tickets that I and other community members can start working on

Since this is a large post on a new thread that discusses several steps that are dependent on the previous, for the sake of organization it will probably make sense to break this into smaller threads over time, rename threads, and/or move stuff to a wiki or something.

Tor or I2P?

Here is an excellent and pretty unbiased post comparing Tor and I2P from I2P’s website: https://geti2p.net/en/comparison/tor

(Summarized from above link):

Benefits of Tor over I2P

Benefits of I2P over Tor

I am no expert in network protocols and I don't want to offer an ill-informed opinion, but it seems that Tor HSs would be easier to implement, while I2P HSs appear more 'proper'.

UX and Federation Behavior

I thought it would be useful to include expected behavior in this discussion. This and everything below it will be split into a separate thread with more details once a decision on Tor or I2P has been made.

UX (Client)

In a perfect world, I would think we would want the following behavior to apply (I will use terms 'Tor/I2P HSs' and 'DNS-addressed HSs' to describe homeservers that end in .onion/.i2p and in .com/etc for lack of better terminology):

  1. Tor/I2P HSs and DNS-addressed HSs should be able to participate in the same rooms and federate to each other (I can ping @matthew:matrix.org and @somefella:sdfasdfdfd.i2p/.onion in the same message without having to do anything special).

Behind the scenes (Server)

  1. Having a HS that does not support federating to Tor/I2P HSs should not break anything.
    • How would we handle legacy HSs that don't support specifying a Tor/I2P client? Would adding a bridge to the room solve catching Tor/I2P servers up, or would it be cleaner to do a breaking server update during a big release? (See the interesting comment related to this by @OlegGirko on #2528.)
  2. In a perfect world, we would have every HS running a Tor/I2P client, providing native federation to all HS types. I don't think this should be a requirement, as it will probably add non-trivial overhead, but we should include a server config entry to specify an external relay (i.e. the IP/port of a Tor SOCKS5 proxy). Maybe include a native Tor/I2P client in the stable Dendrite release? Food for thought, I'm just dreaming here.
  3. Tor/I2P HSs would have to route all their requests through Tor/I2P while DNS-addressed HSs would split where they route outgoing packets.
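
A minimal sketch of the split-routing behavior described in the list above, assuming a hypothetical config object with one optional proxy per network. None of these names (`FederationRouting`, `network_for`, `pick_proxy`) exist in Synapse or Dendrite; they only illustrate the decision each HS type would make for an outgoing request.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# A proxy is just (host, port). Every name in this sketch is hypothetical.
Proxy = Tuple[str, int]


class UnroutableDestination(Exception):
    """Raised when this homeserver has no way to reach the destination."""


@dataclass(frozen=True)
class FederationRouting:
    tor_proxy: Optional[Proxy] = None       # local or external Tor SOCKS5 proxy
    i2p_proxy: Optional[Proxy] = None       # I2P proxy/tunnel, if any
    clearnet_proxy: Optional[Proxy] = None  # None means "connect directly"


def network_for(server_name: str) -> str:
    """Classify a server name as 'tor', 'i2p' or 'dns' by its suffix."""
    host = server_name.lower()
    # Drop an explicit port such as "abc.onion:8448"; IPv6 literals are left alone.
    if not host.startswith("[") and ":" in host:
        host = host.rsplit(":", 1)[0]
    host = host.rstrip(".")
    if host.endswith(".onion"):
        return "tor"
    if host.endswith(".i2p"):
        return "i2p"
    return "dns"


def pick_proxy(server_name: str, routing: FederationRouting) -> Optional[Proxy]:
    """Return the proxy to use for server_name, or None for a direct connection."""
    network = network_for(server_name)
    if network == "dns":
        return routing.clearnet_proxy
    proxy = routing.tor_proxy if network == "tor" else routing.i2p_proxy
    if proxy is None:
        raise UnroutableDestination(f"no {network} proxy configured for {server_name}")
    return proxy


# A DNS-addressed HS splits its traffic; an .onion HS proxies everything.
dns_hs = FederationRouting(tor_proxy=("127.0.0.1", 9050))
onion_hs = FederationRouting(
    tor_proxy=("127.0.0.1", 9050),
    clearnet_proxy=("127.0.0.1", 9050),
)
```

With this shape, a legacy HS with no proxy configured raises `UnroutableDestination` for `.onion`/`.i2p` destinations instead of attempting a DNS lookup, which is one way to read "should not break anything".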

@richvdh in #2111 recognized that for both types of HSs to federate to each other, it may be easiest to propose a change to the Matrix specification (specifically 'raising an MSC in the matrix-doc'). See post for context.

Depending on the expected future support of Synapse and with the building of Dendrite, would it make sense to skip building this for Synapse and just implement this for Dendrite?

Next steps

I hope this post finds the community well and brings some organization to the awesome discussions started around Tor/I2P homeservers. What should we start with? Tor or I2P?

jakehemmerle commented 4 years ago

It seems like the best way to not lock specifically into Tor or I2P would be to have some sort of low level Network API that specifies config, transport and authentication for federation. This interface would allow support for Tor, I2P, DNS or any other homeserver type and prevent the need to rewrite the spec to add support for a new homeserver type.
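
To make the idea concrete, here is a rough sketch of what such an interface could look like, assuming one pluggable transport object per homeserver type and a registry keyed by address suffix. The class names and the `dial_plan` labels are hypothetical and not part of any Matrix spec proposal; server names are assumed to carry no explicit port.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class FederationTransport(ABC):
    """Hypothetical plug-in interface: one subclass per homeserver type."""

    name: str = ""
    suffixes: Tuple[str, ...] = ()

    @abstractmethod
    def dial_plan(self, server_name: str) -> Dict[str, str]:
        """Describe how to reach and authenticate server_name (no network I/O)."""


class DnsTransport(FederationTransport):
    name = "dns"

    def dial_plan(self, server_name: str) -> Dict[str, str]:
        return {"connect": "direct", "authenticate": "x509-certificate"}


class OnionTransport(FederationTransport):
    name = "tor"
    suffixes = (".onion",)

    def dial_plan(self, server_name: str) -> Dict[str, str]:
        # Illustrative labels only: the onion address itself identifies the server.
        return {"connect": "socks5", "authenticate": "onion-address"}


class TransportRegistry:
    """Maps a server name to the first registered transport whose suffix matches."""

    def __init__(self, fallback: FederationTransport) -> None:
        self._fallback = fallback
        self._transports: List[FederationTransport] = []

    def register(self, transport: FederationTransport) -> None:
        self._transports.append(transport)

    def resolve(self, server_name: str) -> FederationTransport:
        host = server_name.lower().rstrip(".")
        for transport in self._transports:
            # str.endswith accepts a tuple; an empty tuple never matches.
            if host.endswith(transport.suffixes):
                return transport
        return self._fallback
```

Under this design, supporting a new homeserver type means registering one more `FederationTransport` subclass (for example one with `suffixes = (".i2p",)`), without touching the code that calls `resolve`.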

I will try to dive deeper into this and start a spec proposal mid May (Working part time and taking a full load of classes has my hands full until then).

dr-bonez commented 4 years ago

@jakehemmerle thanks for being on top of this. I'm personally invested in adding Tor support, since hidden services solve an addressability problem for people who want to run servers behind a NAT. I agree that having a more modular transport layer for the application would be the best way to do this. This way the main code base shouldn't have to care about how services are being hosted, it should be able to deal exclusively in URLs.

That said, I would recommend prioritizing Tor over I2P when it comes to implementation as Tor has better ecosystem support. Based on my own experience, I have found that integrating Tor is easier than I2P and tends to be more reliable. It may also be worth noting that, as far as I'm aware, there is no I2P browser available for iOS. Additionally, there is a plugin being developed for Ionic Capacitor that will allow adding Tor support to both Android and iOS applications very easily, which will likely be finished in the coming weeks.

Those are my thoughts on the matter. Happy to lend help wherever needed.

vsatmydynipnet commented 4 years ago

This would finally open Matrix/Synapse to everybody and remove the NAT/dynamic IP problem:

https://github.com/matrix-org/synapse/issues/5152

cyphar commented 4 years ago

My one concern is that we need to make sure that .onion homeservers are absolutely forbidden from connecting to anything over the clear-net. This is the biggest issue with hosting an .onion service -- everything from nginx and Apache to PHP and Django seem giddy about disclosing your public IP address.

Another thing I'm a little worried about is while it is necessary to route .onion-bound packets through other servers if you don't have a Tor client locally, it does mean that every homeserver along the chain knows who you're talking to. Then again, this might not be a problem if the routing for messages in a given room only happens within the set of servers that are involved in that room.

dr-bonez commented 4 years ago

I mean it should be configurable. Connecting to another server over clearnet only harms your own anonymity, not anyone else's. So if you're using Tor for NAT punching but not anonymity, you could use the proxy for hidden services only. But I agree that by default it should use the proxy for everything.

As for the second point, long term we could bundle a Tor client, and only run it if we don't detect anything listening on 127.0.0.1:9050. Remote Tor proxies should be an advanced configuration option, you need to know what you're doing if you use one.
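
A minimal sketch of that detection step, using only the Python standard library. It sends the SOCKS5 greeting from RFC 1928 and checks for a "no authentication" reply, so it tells you that something speaking SOCKS5 is listening on the port, not that it is actually Tor. The function name is hypothetical.

```python
import socket


def looks_like_socks5(host: str = "127.0.0.1", port: int = 9050,
                      timeout: float = 1.0) -> bool:
    """Return True if host:port answers a SOCKS5 greeting with 'no auth'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # VER=5, NMETHODS=1, METHODS=[0x00 "no authentication required"]
            sock.sendall(b"\x05\x01\x00")
            reply = b""
            while len(reply) < 2:
                chunk = sock.recv(2 - len(reply))
                if not chunk:  # peer closed the connection early
                    break
                reply += chunk
    except OSError:
        # Connection refused, timeout, unreachable host, and so on.
        return False
    # The server answers VER=5 followed by the method it selected.
    return reply == b"\x05\x00"
```

A bundled Tor client would then only be started when `looks_like_socks5()` returns `False`.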

jakehemmerle commented 4 years ago

while it is necessary to route .onion-bound packets through other servers if you don't have a Tor client locally, it does mean that every homeserver along the chain knows who you're talking to

@cyphar If I understand your statement correctly, why would it matter if another server (who/whatever is hosting the Tor SOCKS5 proxy) can see the data going to the .onion homeserver, if all that's in there is the .onion address? I don't see any sort of malicious stuff that could happen other than the host of the SOCKS5 proxy intentionally dropping packets.

My one concern is that we need to make sure that .onion homeservers are absolutely forbidden from connecting to anything over the clear-net. This is the biggest issue with hosting an .onion service -- everything from nginx and Apache to PHP and Django seem giddy about disclosing your public IP address.

It might be worth 'highly recommending' that people running .onion homeservers host their server via Docker by simply spinning up a hardened 'official' docker-compose.yaml file instead of configuring nginx and Tor themselves. That would make it even easier to pass in fun things like custom .onion addresses via Shallot or something

dr-bonez commented 4 years ago

Shallot only generates v2 onion addresses, so we should probably use https://github.com/cathugger/mkp224o instead

cyphar commented 4 years ago

@jakehemmerle

If I understand your statement correctly, why would it matter if another server (who/whatever is hosting the Tor SOCKS5 proxy) can see the data going to the .onion homeserver, if all that's in there is the .onion address? I don't see any sort of malicious stuff that could happen other than the host of the SOCKS5 proxy intentionally dropping packets.

It makes it so the relaying server is aware of the social graph of people talking. For instance, knowing that user @a:abc.onion and @b:b.com are speaking together (or @b:b.com is speaking in !foobar:abc.onion) may betray some sensitive information. Even the knowledge that @b:b.com is speaking to someone on abc.onion might be a problem if abc.onion is (for instance) a server for HIV-positive users or whistleblowers or pick-your-own-vulnerable-group. This is information that would not be known by anyone not in the room if b.com routes through Tor.

Leaking such information does defeat some of the anonymity properties of .onion addresses. If you compare this to other chat systems built on top of .onion addresses (such as the sadly dead Ricochet), they specifically protect against social graph building through the use of .onion addresses. Obviously Matrix will probably never be as privacy-preserving as tools like Ricochet but we should still make an effort.

My point is that if we're going to do clear-net packet routing to people who are on .onion servers or for conversations in .onion rooms, it must be done by routing through homeservers that are currently involved in the room the message was sent in. At the very least this reduces the scope of the information leak.
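
As a sketch of that constraint, assuming we already know which servers are in the room and which of them can reach Tor (both inputs are hypothetical here), the set of acceptable relays is just an intersection:

```python
from typing import Iterable, List


def eligible_relays(
    room_servers: Iterable[str],
    tor_capable: Iterable[str],
    sender: str,
    destination: str,
) -> List[str]:
    """Servers allowed to relay: already in the room, able to reach Tor,
    and neither the sending nor the receiving homeserver."""
    candidates = set(room_servers) & set(tor_capable)
    candidates -= {sender, destination}
    return sorted(candidates)
```

A Tor-capable server outside the room is never returned, so the relay learns nothing about the social graph that it could not already see as a room participant. An empty result means the message cannot be relayed under this rule.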

It might be worth 'highly recommending' that people running .onion homeservers host their server via Docker by simply spinning up a hardened 'official' docker-compose.yaml file instead of configuring nginx and Tor themselves.

Sure, though I imagine a lot of the work would also involve making sure that Twisted doesn't leak such information. Docker doesn't solve all of your problems (how do you make sure you don't leak information about the host system over the internet, and so on). Don't get me wrong, these are all stretch goals but are things that should be kept in mind if the intention is to get people to host .onion homeservers for privacy-preserving reasons. If the intention is just to get around NAT, then it's a different story.

jakehemmerle commented 4 years ago

Even the knowledge that @b:b.com is speaking to someone on abc.onion might be a problem if abc.onion is (for instance) a server for HIV-positive users or whistleblowers or pick-your-own-vulnerable-group.

I don’t think it matters (or rather, it matters much less) if it is known that someone on a clear-net homeserver is speaking to an onion homeserver (even though you can make associations like you described above). If a user ranks anonymity as a high priority, they should use an onion homeserver anyway (or at the very least, a clear-net homeserver through Tor, though that still opens up the clearnet server to censorship or targeted attacks).

An onion homeserver should simply provide anonymity to its users, not necessarily to clearnet servers/users in a room it’s hosting. If we want more privacy from clear to onion (or even clear to clear), that should be a separate proposal (although related).

I imagine a lot of the work would also involve making sure that Twisted doesn't leak such information. Docker doesn't solve all of your problems (how do you make sure you don't leak information about the host system over the internet, and so on).

A crafted docker-compose file would eliminate/reduce the issue of a misconfigured or information-exposing service like nginx as well as Twisted reducing privacy. Docker itself doesn’t inherently remove these issues, and Twisted under the hood may want to expose things, but forcing a container (Synapse) to route all traffic through another container (a router configured to route all traffic through Tor) does.

jakehemmerle commented 4 years ago

Docker doesn't solve all of your problems (how do you make sure you don't leak information about the host system over the internet, and so on).

Rereading this a few months later, I realize I misunderstood what you were saying! I'm not sure if there's a way to completely eliminate this issue at the Docker level. Someone who knows Docker and security pretty well would have to answer that.

jakehemmerle commented 4 years ago

https://www.ndss-symposium.org/wp-content/uploads/2020/02/24199-paper.pdf

Stuff like 0-RTT TLS, TCP fast open, and (if going the I2P route) QUIC should speed up some of the loading time. I expect this to be pretty slow trying to federate everything, but I could be wrong. Anyway, this paper might come in handy at some point.

hstock commented 4 years ago

Support for ENS domains (https://ens.domains/) and/or unstoppable domains (https://unstoppabledomains.com/) could be interesting. This would allow decoupling of anonymous domains and the type of anonymizing technology used. (if for example Tor and I2P are implemented in the future)

jakehemmerle commented 4 years ago

@hstock that's a great idea! I'd like to add Handshake to the list too (https://handshake.org).

trymeouteh commented 3 years ago

It would be good to see this, along with Tor/onion sites and IPFS/Unstoppable Domains support, without clients like Element needing to work with I2P, Tor or IPFS. If Synapse can support I2P, Tor and IPFS, then Matrix can be really decentralised and unstoppable.

dr-bonez commented 3 years ago

element web already supports tor if you run it in a tor browser ;)

Legogris commented 3 years ago

@dr-bonez The discussion here is regarding the federation protocol, that is homeserver-to-homeserver.

dr-bonez commented 3 years ago

I know, I was responding to @trymeouteh's comment about not waiting for tor support from clients. Just saying it's not a concern anyway, since web clients can easily be supported.

ghost commented 3 years ago

A lot of people need this feature. Waiting...

jakehemmerle commented 3 years ago

A lot of people need this feature. Waiting...

Feel free to start on the protocol proposal, that's step 1.

irelativism commented 3 years ago

Good luck with this! Would be great to have it :).

But to be honest and blunt, I don't see the Element team implementing this anytime soon, and some of its members would never implement it if the decision were theirs. Element the company grew too fast, and unfortunately a significant part of it shows an aversion to privacy and a bias towards KYC. The hiring process was rushed, so it didn't attract the ideal people to make the Matrix project truly universal. That is also why you don't see greater adoption by other projects, such as people from the XMPP camp, other decentralized independent projects/companies (Zulip, Mattermost, etc.), or even distributed projects like Jami or Tox.

This doesn't apply to all team members, but a big part of the Element team will probably harshly and silently criticize this proposal and try to brush it under the rug. The path forward I see is not in the matrix.org issue tracker but in other server implementations such as Conduit (https://gitlab.com/famedly/conduit). They will probably implement this straight away if a PR is provided, given that it aligns with their ideals. If you can't do that, it is probably on their roadmap already, so if you wait long enough it will probably become a reality.

dr-bonez commented 3 years ago

Timo has already agreed to accept a pr that adds tor support to conduit. It would be very simple, too, given that they are using reqwest, which has socks5 support. I just haven't gotten to it yet.

dr-bonez commented 3 years ago

It's possible that only #9306 is needed to get this working...

Asara commented 3 years ago

It's possible that only #9306 is needed to get this working...

Seems like that has been superseded by https://github.com/matrix-org/synapse/pull/10475 which was merged into 1.41.0.rc1.

m00nwtchr commented 2 years ago

Question: Couldn't there be something like a "Tor/pull mode" in which federation is done over an outbound connection initiated by the Tor homeserver (thus removing the need for a Tor client on the other homeserver)?
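
One way to picture such a pull mode is an outbox on the clearnet homeserver that holds events until the Tor homeserver connects and collects them. This is purely a hypothetical sketch: nothing like `PullOutbox` exists in Synapse, and authenticating the polling server is out of scope here.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List


class PullOutbox:
    """Holds events for destinations that cannot be dialed directly."""

    def __init__(self) -> None:
        self._pending: Dict[str, Deque[dict]] = defaultdict(deque)

    def enqueue(self, destination: str, event: dict) -> None:
        """Queue an event instead of pushing it to the destination."""
        self._pending[destination].append(event)

    def drain(self, destination: str, limit: int = 50) -> List[dict]:
        """Called when `destination` polls: hand over up to `limit` events, oldest first."""
        queue = self._pending[destination]
        return [queue.popleft() for _ in range(min(limit, len(queue)))]
```

The trade-off is latency: events wait in the outbox until the next poll, so the Tor homeserver would need to poll often or keep a long-lived connection open.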

Kreyren commented 1 year ago

Can anyone update on what is the current state of this issue? There seems to have been a lot of work done so it's hard to track what needs to be done to get tor homeservers, thanks

CC @Legogris as the only project member reacting to this issue

Pheromon commented 1 year ago

Just a note of encouragement: I miss this feature very much :-)

Legogris commented 1 year ago

Clarification @Kreyren : I am not a project member more than anyone else having contributed to the project (:

From a quick look I can't find much ongoing or recent work?

I have not tried this myself, but as of #10475, Synapse should already be able to federate over Tor by setting a SOCKS proxy for federation requests. For inbound federation, it would involve setting up a Hidden Service pointing to the federation port. This will only work with .well-known, not SRV records, since DNS works differently for Tor.

The missing piece here would be verifying the onion certificates for federation. I guess this means that Synapse can federate fine over Tor today, but only if the homeserver disables TLS verification for .onion domains (which should be fine, assuming Tor already validates this properly).
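
The rules in this comment can be summarized as a per-destination policy. The sketch below is illustrative only: `FederationPolicy` and `policy_for` are hypothetical names, not Synapse settings, and server names are assumed to carry no explicit port.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FederationPolicy:
    use_well_known: bool  # resolve via /.well-known/matrix/server
    use_srv: bool         # resolve via DNS SRV records
    verify_tls: bool      # validate the remote TLS certificate
    via_proxy: bool       # send the request through the Tor proxy


def policy_for(server_name: str) -> FederationPolicy:
    """Return how a destination would be discovered and contacted."""
    host = server_name.lower().rstrip(".")
    if host.endswith(".onion"):
        # No SRV lookups over Tor. Certificate checks are skipped on the
        # assumption (from the comment above) that Tor itself already
        # authenticates the onion service.
        return FederationPolicy(
            use_well_known=True, use_srv=False, verify_tls=False, via_proxy=True
        )
    return FederationPolicy(
        use_well_known=True, use_srv=True, verify_tls=True, via_proxy=False
    )
```

Scoping the relaxed TLS check to `.onion` names keeps certificate verification intact for every DNS-addressed homeserver.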

1277 seems to be linking all relevant open issues.

ghost commented 1 year ago

Can anyone update on what is the current state of this issue? There seems to have been a lot of work done so it's hard to track what needs to be done to get tor homeservers, thanks

@Kreyren: I think I just did that here: https://github.com/matrix-org/synapse/issues/5152#issuecomment-1532280232

@Legogris: Indeed, it works as you described, although the lack of SRV support in the Tor DNS resolver caused federation to fail; I have already made a patch for that. Cert validation should probably be disabled for .onion targets.

Buttars commented 1 year ago

I really appreciate the work @Legogris and @ghost are doing, but it seems like the only effort so far is a Tor-specific strategy using SOCKS proxies. I think there are valid use cases for supporting anonymization networks other than Tor.

From my limited research, it sounds like I2P supports SOCKS to a limited degree but does not support outproxies (the I2P equivalent of exit nodes). I'm not sure whether that means users of a homeserver behind I2P would only be able to communicate with other homeservers running behind I2P. Does anyone have experience using I2P, especially with SOCKS proxies?

den0621 commented 1 year ago

@Buttars: Unlike Tor, I2P does not have clearnet access built into the protocol. It's primarily designed for hidden services and is intended as a segregated network. It does, however, support outproxies, which you configure by I2P URL(s), but those really are just proxies to clearnet. They are few and volunteer-run. I would absolutely not rely on them for core functionality.

For such internal connections, SOCKS should work just fine. If you want "proper" integration with I2P, you're supposed to use the SAM interface. With SAM, your app can manage the creation and settings for tunnels. SOCKS should be fine for "connect to this .b32.i2p address" though. The user just has to make sure it has a tunnel with SOCKS running which is easy enough.

Personally I (and I believe most I2P users) would be happy with a darknet only matrix server that does not connect to clearnet at all. On top of this, there could be servers that can connect to clearnet, Tor, and I2P at the same time. Those themselves wouldn't be anonymous of course but still allow for anonymous users, and act as bridges between networks. Let's say for example, Tor project might be interested to have a channel that can be accessed both from clearnet and Tor. (IDK if they do, just an example). Proxying to a clearnet server through Tor might be interesting, but I would not bother with this for I2P at all. Specific clearnet+I2P servers acting as bridges is enough.

Buttars commented 1 year ago

@den0621 I agree the I2P implementation would have to be anonymous network only or only a gateway from the client to the homeserver and clearnet after that.

I've more clearly defined the different use cases and desired features in this comment on this other thread here. https://github.com/matrix-org/synapse/issues/5152#issuecomment-1764060110