option to make Sandstorm accessible through Tor with a .onion address

iflowfor8hours / sandcastle

An opinionated configuration for running sandstorm with a focus on security and paranoid assumptions

MIT License


option to make Sandstorm accessible through Tor with a .onion address #14

Open jacksingleton opened 8 years ago

jacksingleton commented 8 years ago

According to @paulproteus we can work on a patch for Sandstorm to prevent Sandstorm from leaking its IP address. We should look at that.

paulproteus commented 8 years ago

https://github.com/sandstorm-io/sandstorm/pull/447 is the ticket in question.

jacksingleton commented 8 years ago

Issue referenced in the above PR that describes a few current problems with a Sandstorm Tor Hidden service: https://github.com/sandstorm-io/sandstorm/issues/434

Note that there is still some benefit to setting up a hidden service endpoint even if we're not hiding the IP address and location of our server. It would give people a (better) alternative to TLS for transport security (or even a layer on top of it), as well as a .onion address that cannot be censored.

So it might be worth implementing a hidden service endpoint even with the current leaking of IP addresses.

paulproteus commented 8 years ago

If you're interested in hidden services, there are two different Sandstorm services that you might want to use that relate to hidden services:

Static publishing, and
HTTP APIs.

My suggestion would be that static publishing be what you start with. Given that goal, I see three main options on how to do it:

Ask users to install a separate daemon that proxies from a particular hidden service to a Sandstorm static publishing subdomain on the same host, or
Add a special-case to Sandstorm's "pre-meteor.js" file that special-cases .onion and routes things to the right place, which requires that the user separately "apt-get install tor" (or otherwise install Tor), or
Create a Sandstorm app that can listen on the Tor network, and handles all Tor traffic for this hidden service. The app could then offer a "capability" to other apps so that other apps can send static publishing data to the Tor app, which creates a hidden service for the app.

The last of these seems IMHO the coolest. You can get working on it right now, if you want; you could start by using vagrant-spk to package Tor. Note that you'll have to request the IPNetworking and IPInterface capabilities (I think), so this Tor app would be a "driver".

I can say more about that if you're interested! For what it's worth, the basics on how to make this happen are:

Attempt to create a vagrant-spk app that represents a Tor install that knows how to serve a basic index.html as a hidden service, and
Use http://localhost:6080/admin/capabilities to grant that grain the above capabilities, and
Then we'll have to figure out together how to have your app "offer" the ability for other apps to provide the actual static site content.

I imagine that Kenton and/or other core team members will have opinions about whatever v1 that y'all build, but if you're open to trying it once and then getting feedback and then revising, I think the above strategy would be supremely reasonable.

Some further info can be found in the docs: https://docs.sandstorm.io/en/latest/developing/

But maybe that explains what next steps y'all can do without waiting on us!

jacksingleton commented 8 years ago

Hey thanks for thinking about this!

A Tor driver for Sandstorm sounds really cool. This could definitely be something we look into.

Another thing to think about is using a Tor hidden service as a gateway to the Sandstorm shell itself (and apps running within it). This is useful because it gives us an alternative to TLS and PKI. This would allow users to create, modify, share (and maybe publish?) grains belonging to any application.

Presumably these two models could coexist.

jacksingleton commented 8 years ago

So I'm thinking the best place to start is a hidden service gateway to sandstorm -- it is something we know we will want eventually regardless, and it doesn't look like too much work is needed to support it.

Seems like the multiple BASE_URL/WILDCARD_HOST support is the only thing we need.

How would you feel about us putting together a PR for this? Outstanding things that I can think of are:

How should we handle OAuth providers? In our case, we're using email auth and have no plans for OAuth, but I agree that it's a weird edge case to have for someone who wants, for example, "example.com" and "example.net". If they have to configure entries with the OAuth provider for two different domains, would we be able to link their account?
How we decide what BASE_URL to use for any given request... I assume we'd just check the host header and revert to old behavior ("your dns is configured incorrectly" page) if we get a host header that doesn't match any of the configured options
Syntax for the config file

paulproteus commented 8 years ago

Howdy @jacksingleton !

re: two different BASE_URL/WILDCARD_HOST values

I thought this would be super-tough, since Meteor probably wants just one BASE_URL. But then I saw this: https://github.com/iron-meteor/iron-router/issues/1110 which is about Telescope fighting Meteor's BASE_URL support and winning.

For you, I think that you should do something in pre-meteor.js or similar to annotate the req object with a BASE_URL and WILDCARD_HOST. Basically, when the request comes in, as soon as possible, before Meteor really looks at the request, annotate it with this info.

Then I figure if a request comes in via the Tor hidden service, it'll use the hidden service BASE_URL & WILDCARD_HOST.

Related thoughts here:

Make BASE_URL take a comma-separated list of BASE_URL parameters. Same with WILDCARD_HOST.
Make WILDCARD_HOST accept a new value called auto which auto-calculates a value of * + BASE_URL, so that people can opt into this without extra configuration. (I suggest that in part because I've wanted to do that for a while but I haven't made time yet!) Nearly everyone wants this.
The first value in the comma-separated BASE_URL selection should be the default, and the others should be acceptable ones. It becomes the responsibility of some new part of the code to look at the inbound request and make sure it gets annotated with a valid alternative BASE_URL.
If possible, you should not trust the Host: header in the Tor case, but rather, only accept the *.onion Host header if it authentically comes from Tor. I believe this is important because otherwise someone could deanonymize the hidden service by scanning HTTP port 80 and using the hidden service's name as a Host header, and see who responds. (I accept that this may not be part of your threat model yet, so if it's not, OK, but do leave a comment indicating that we'd need this to be fully safe on Tor.)
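The list above could be prototyped roughly as follows. This is a minimal sketch, not Sandstorm code: `parseList`, `resolveWildcardHosts`, `selectBaseUrl` and the `arrivedViaTor` flag are hypothetical names, and reading `auto` as "`*.` plus the host of each BASE_URL" is one interpretation of the suggestion. How the server establishes that a request really came through Tor is left to the caller.

```javascript
// Hypothetical helpers for illustration; none of these names exist in Sandstorm.

// Split a comma-separated config value into trimmed, non-empty entries.
function parseList(value) {
  return (value || "")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Expand WILDCARD_HOST=auto into "*." + host for each configured BASE_URL
// (assumed meaning of "auto"); otherwise parse the value as a list.
function resolveWildcardHosts(wildcardHostValue, baseUrls) {
  if (wildcardHostValue === "auto") {
    return baseUrls.map((u) => "*." + new URL(u).host);
  }
  return parseList(wildcardHostValue);
}

// Pick the BASE_URL matching the Host header. A .onion BASE_URL is accepted
// only when the caller has independently established that the request
// arrived through Tor. Returns null when nothing matches, so the caller can
// fall back to the existing "DNS is configured incorrectly" behavior.
function selectBaseUrl(hostHeader, baseUrls, arrivedViaTor) {
  const host = (hostHeader || "").toLowerCase();
  for (const baseUrl of baseUrls) {
    const candidate = new URL(baseUrl).host.toLowerCase();
    if (candidate !== host) continue;
    if (candidate.endsWith(".onion") && !arrivedViaTor) return null;
    return baseUrl;
  }
  return null;
}
```

The first entry returned by `parseList` would serve as the default BASE_URL; `selectBaseUrl` only decides whether an alternative applies to a given request.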

re: decide what BASE_URL to use for any given request:

I seem to have discussed this above!

re: syntax for config file:

@kentonv are you OK with BASE_URL becoming a set of comma-delimited values?

BTW @jacksingleton the BASE_URL parsing occurs in C++-land (look for run-bundle.c++). If you want to meet up and pair on a v1 of that patch, that could be good, but you also don't necessarily need to do that yet. If you use shell/run-dev.sh you'll see it reads the config file itself. I recommend prototyping this with shell/run-dev.sh and ignoring C++-land complexity at first.

NOTE: ROOT_URL is what Meteor calls it, and BASE_URL is what Sandstorm calls it. So probably the ROOT_URL environment variable should continue to be set to default BASE_URL (first in the comma-delimited list). In that case, you'll need to use a different environment variable to store the sequence of alternative BASE_URL options.

FWIW, the above design implies that there should still be one default BASE_URL. I think that's sensible, but curious what y'all think.

Other notes

You'll probably need to sanity-check the static publishing code to make sure that, if you make a request via the hidden service, the auto-generated static publishing subdomain that you get is on the hidden service too.

jacksingleton commented 8 years ago

If possible, you should not trust the Host: header in the Tor case, but rather, only accept the *.onion Host header if it authentically comes from Tor. I believe this is important because otherwise someone could deanonymize the hidden service by scanning HTTP port 80 and using the hidden service's name as a Host header, and see who responds. (I accept that this may not be part of your threat model yet, so if it's not, OK, but do leave a comment indicating that we'd need this to be fully safe on Tor.)

Oh interesting. I think it’s generally ill-advised to run a hidden service that you want to remain anonymous while running the same server on the open web, because it’s fairly easy for an attacker to correlate changes, downtime, performance issues, etc. (that can even be triggered by the attacker). But that doesn’t mean nobody will want to do it :) And checking for a response from a host header is definitely a lot easier than the correlation attacks.

Another strategy that crossed my mind was to have the shell look for a X-Sandstorm-Base-URL header that would override whatever BASE_URL was configured. It would be trivial (although admittedly another layer of complexity) for us to have nginx add this header to any request that comes from Tor. Since it’s a dedicated header, it could be a complete override eliminating the risk of information disclosure even if an attacker starts sending the header themselves.
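A minimal sketch of that header override, assuming a Node-style `headers` object (Node lower-cases incoming header names). `baseUrlForRequest` is a hypothetical name. The override is deliberately not checked against the configured BASE_URL list: under this idea any server would honor any syntactically valid value, which is what keeps the header from revealing which hidden service a given IP address hosts.

```javascript
// Hypothetical sketch of the X-Sandstorm-Base-URL idea; not Sandstorm code.
// Returns the base URL to use for this request.
function baseUrlForRequest(headers, configuredBaseUrl) {
  const override = headers["x-sandstorm-base-url"];
  if (typeof override !== "string") return configuredBaseUrl;
  try {
    const parsed = new URL(override);
    // Only syntactic validation: accept http(s) URLs, ignore anything else.
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
      return configuredBaseUrl;
    }
    return parsed.origin + "/";
  } catch (err) {
    // Not a parseable URL: fall back to the configured value.
    return configuredBaseUrl;
  }
}
```

In this setup the front proxy (nginx, in the comment above) would be the component that adds the header to requests arriving from Tor.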

I recommend prototyping this with shell/run-dev.sh and ignoring C++-land complexity at first.

Sounds good!

(Sent by email in reply to Asheesh Laroia's comment of Nov 9, 2015, 8:43 AM, which appears in full above.)

kentonv commented 8 years ago

@kentonv are you OK with BASE_URL becoming a set of comma-delimited values?

Sure.

I'm also OK with the other things you suggested. :P

paulproteus commented 8 years ago

:D

The others seemed less invasive to me!

ckxng commented 8 years ago

There are some things that can be done independently of the sandstorm code.

1. Since this specific information leak comes from DNS, and DNS is a common source of information leakage, we can allow a toggle to resolve DNS via. Tor instead of the default resolvers. This might make sense to be split into a separate issue. This can be done through /etc/resolv.conf (would still allow direct nameserver queries) or by capturing the outbound packets at the firewall (captures all DNS traffic).
2. There may be threats not yet identified, either in sandstorm or grains. It may be beneficial to implement a toggle that redirects all outbound traffic on standard web ports to be sent through a transparent proxy that will scrub out common leaks. (privoxy comes to mind) While related, this would also probably be better as a separate issue.
3. The same basic procedure could be applied to ALL outbound traffic to be sent directly through tor's transparent proxy.

I would propose that all three of these features default to true if sandstorm_onion: true. If these preventative measures are combined with the sandstorm enhancements listed above, we can make it much more difficult for information to leak (accidental or otherwise).

paulproteus commented 8 years ago

Cameron, +1 to your thoughtfulness on this.

paulproteus commented 8 years ago

In 1 you write:

we can allow a toggle to resolve DNS via. Tor instead of the default resolvers

I may be missing a word. Via what?

ckxng commented 8 years ago

Tor can accept UDP DNS queries on localhost using the DNSPort option. Queries received on this port will be resolved internally using the same mechanism as the SOCKS proxy. The client never has to know the difference.

But there's a caveat (which is why it's important to provide a toggle): this feature can only resolve A records. Other record types (MX, TXT, AAAA) will be returned as either "NOERROR, Answer: 0" or "NXDOMAIN".
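From application code, pointing lookups at such a local port might look like the sketch below. It assumes Tor is running with `DNSPort 5353` in its torrc; the port number and the helper names `makeTorResolver` and `resolveViaTor` are illustrative. It only asks for A records, in line with the caveat above.

```javascript
const dns = require("dns");

// Build a resolver that sends queries to a local Tor DNSPort instead of the
// system resolvers. Port 5353 is an assumed torrc setting, not a Tor default.
function makeTorResolver(port = 5353) {
  const resolver = new dns.Resolver();
  resolver.setServers([`127.0.0.1:${port}`]);
  return resolver;
}

// Resolve A records only; returns a promise for an array of IPv4 strings.
function resolveViaTor(resolver, hostname) {
  return new Promise((resolve, reject) => {
    resolver.resolve4(hostname, (err, addresses) => {
      if (err) reject(err);
      else resolve(addresses);
    });
  });
}
```

This only covers lookups made through this resolver object; the /etc/resolv.conf and firewall approaches described above are what would catch lookups made elsewhere in the process.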

jacksingleton commented 8 years ago

I think the what is Tor. Tor can expose a DNS resolver locally which will route all queries through the Tor network.

It would be awesome to do transparent proxying with iptables + privoxy + tor. We can also look at Whonix: https://www.whonix.org/wiki/Hidden_Services

I do want to separate two motivations for running a hidden service:

1) Allowing people to access Sandstorm through the .onion address
2) Allowing a Sandstorm server to operate anonymously

The benefits of 1 are that we provide an alternative to (and possibly a layer on top of) TLS/PKI. It makes our service more censorship-resilient and harder to man-in-the-middle.

Benefits of 2 include all the benefits of 1, and it also becomes really difficult to identify where the Sandstorm server is being hosted. The downside is that if this is what you want, you really shouldn't provide any access to that Sandstorm NOT over Tor; otherwise you open yourself up to correlation attacks.

paulproteus commented 8 years ago

Semi-sorry about how long this is getting.

@jacksingleton and I talked more today. Here are the minutes from that conversation:

You'll end up with a sandstorm.conf on your sandcastle host that looks like:

BIND_IP=127.0.0.1
PORT=6080
BASE_URL=https://sandcastle.io/,http://something.onion/
WILDCARD_HOST=*.sandcastle.io,*.something.onion

This will flow into Sandstorm as the following environment variables, thanks to run-bundle.c++:

BASE_URL=https://sandcastle.io/,http://something.onion/
WILDCARD_HOST=*.sandcastle.io,*.something.onion

NOTE that this will result in Meteor crashing -- Meteor will hate BASE_URL having a , in it. So:

[ ] ACTION: (owned by @paulproteus) adjust run-bundle.c++ to split the BASE_URL on , and store just the first one in the BASE_URL environment variable.
[ ] ACTION: (owned by @paulproteus) adjust run-bundle.c++ to provide the full list of BASE_URL values as Meteor.settings.baseUrls

So then @jacksingleton can expect that Meteor will receive Meteor.settings.baseUrls as a list of BASE_URL values.
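The split described in the two actions above happens in run-bundle.c++, but the intended result can be illustrated in JavaScript. This is only a sketch: `splitBaseUrlSetting` is a hypothetical name, and passing the list through the `METEOR_SETTINGS` environment variable is an assumption about how `Meteor.settings.baseUrls` could be populated, not a description of what run-bundle.c++ does.

```javascript
// Hypothetical illustration of the BASE_URL split; not Sandstorm code.
function splitBaseUrlSetting(rawBaseUrl) {
  const baseUrls = rawBaseUrl
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  if (baseUrls.length === 0) {
    throw new Error("BASE_URL must contain at least one URL");
  }
  return {
    env: {
      // Only the first (default) value is exposed under the single-URL
      // names, so Meteor never sees a comma.
      BASE_URL: baseUrls[0],
      ROOT_URL: baseUrls[0],
      // Assumed transport for the full list: Meteor parses this JSON into
      // Meteor.settings.
      METEOR_SETTINGS: JSON.stringify({ baseUrls: baseUrls }),
    },
    baseUrls: baseUrls,
  };
}
```

With the sandstorm.conf shown above, `BASE_URL` and `ROOT_URL` would both be `https://sandcastle.io/`, and the settings list would hold both URLs in their configured order.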

[ ] ACTION: (owned by @jacksingleton): Create a Meteor package called somethingorother
[ ] ACTION: (owned by @jacksingleton): Create a function called matchWildcardHost which takes a hostname and checks it against the list of valid WILDCARD_HOST options, and returns the subdomain component -- this is nearly copy-pasta from shell/packages/sandstorm-db/db.js -- the matchWildcardHost function
[ ] ACTION: (owned by @jacksingleton): Create a calculateValidWildcardOrigins in your package that is very similar to getWildcardOrigin() except it returns a JavaScript array, and then update the one caller to getWildcardOrigin to call your function instead (steal code from getWildcardOrigin for easy success; I suggest moving this into your package)
[ ] ACTION: (owned by @jacksingleton): Create calculateEffectiveBaseUrlAndWildcardHost function in your Meteor package that takes a Host: header and verifies that it corresponds to one of the BASE_URL options in Meteor.settings.baseUrls and if so, returns:
- the BASE_URL value, and
- the WILDCARD_HOST value that should be used for this request, and
- an acceptable Origin value, like 'https://foo.rose.sandcats.io' -- see shell/server/proxy.js, search for PROTOCOL + "//" to see why this is needed
- (if not literally the above, something equivalent to the above)
[ ] ACTION: (owned by @jacksingleton): Adjust makeWildcardHost in shell/packages/sandstorm-db/db.js so that it takes a second parameter of the current WILDCARD_HOST for this request, with null as a fallback to mean that we use the old behavior (with the eventual goal of stamping those out)
[ ] ACTION: (owned by @jacksingleton): Adjust all callers of makeWildcardHost to actually send a second parameter
[ ] ACTION: (owned by @jacksingleton): Adjust all readers of process.env.ROOT_URL to instead read from the result of calculateEffectiveBaseUrlAndWildcardHost
[ ] ACTION: (owned by @jacksingleton) Create makeWildcardHostWithProtocol(foo) which returns (effectively) ROOT_URL.protocol + "//" + makeWildcardHost(foo) (to replace the current very common idiom of reading the protocol out of the global ROOT_URL)
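A rough sketch of how the functions named in these actions might fit together. The function names come from the list above, but the bodies are illustrative rather than copied from shell/packages/sandstorm-db/db.js. The sketch assumes that the nth WILDCARD_HOST belongs to the nth BASE_URL and that each pattern contains a single `*`; `DEFAULT_WILDCARD_HOST` stands in for the existing global.

```javascript
// Illustrative only; values and bodies are assumptions, not Sandstorm code.
const DEFAULT_WILDCARD_HOST = "*.sandcastle.io";

// Return the part of hostname matched by "*" in one of the patterns,
// or null if no pattern matches.
function matchWildcardHost(hostname, wildcardHosts) {
  for (const pattern of wildcardHosts) {
    if (typeof pattern !== "string") continue;
    const star = pattern.indexOf("*");
    if (star < 0) continue;
    const prefix = pattern.slice(0, star);
    const suffix = pattern.slice(star + 1);
    if (hostname.length > prefix.length + suffix.length &&
        hostname.startsWith(prefix) && hostname.endsWith(suffix)) {
      return hostname.slice(prefix.length, hostname.length - suffix.length);
    }
  }
  return null;
}

// Second parameter is the WILDCARD_HOST for the current request;
// null falls back to the old single-host behavior.
function makeWildcardHost(id, wildcardHost) {
  return (wildcardHost || DEFAULT_WILDCARD_HOST).replace("*", () => id);
}

// Map a Host header to the BASE_URL, WILDCARD_HOST and Origin to use for
// this request, or null if the host matches nothing configured.
function calculateEffectiveBaseUrlAndWildcardHost(hostHeader, baseUrls, wildcardHosts) {
  const host = (hostHeader || "").toLowerCase();
  for (let i = 0; i < baseUrls.length; i++) {
    const url = new URL(baseUrls[i]);
    const isShellHost = url.host === host;
    const isWildcardHost = matchWildcardHost(host, [wildcardHosts[i]]) !== null;
    if (isShellHost || isWildcardHost) {
      return {
        baseUrl: baseUrls[i],
        wildcardHost: wildcardHosts[i],
        origin: url.protocol + "//" + host,
      };
    }
  }
  return null;
}

// Replacement for reading the protocol out of the global ROOT_URL.
function makeWildcardHostWithProtocol(id, effective) {
  const protocol = new URL(effective.baseUrl).protocol;
  return protocol + "//" + makeWildcardHost(id, effective.wildcardHost);
}
```

A request whose Host is `foo.something.onion` would then resolve to the `.onion` BASE_URL, and any wildcard host generated while handling it would stay on the hidden service.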

Other:

[ ] ACTION: (owned by @paulproteus) Look into the methods in hack-session.js and see if they can stop reading ROOT_URL.protocol. Maybe get help from @jparyani or @kentonv.
[ ] ACTION: (owned by @jacksingleton) Use scheme-relative URLs for userIdentities, see shell/packages/sandstorm-db/profile.js -- this can be a standalone pull request
[x] ACTION: Think hard about OAuth.

The goal here is to check which BASE_URL is being used for this request, and consistently use that (and its corresponding WILDCARD_HOST) in responding to this request.

Going to save this now, then edit to make further changes.

elimisteve commented 8 years ago

Hi all,

I agree that having multiple BASE_URLs would be nice, but if the primary goal is to have an "option to make Sandstorm accessible through Tor with a .onion address" rather than simultaneously hosting the same site at a Tor URL and a non-Tor URL, can't this happen much more simply? As far as I can tell, 100% of the items on @paulproteus's handy checklist have to do with making multiple BASE_URLs possible.

What else needs to happen to make it possible to just host a Tor hidden service using Sandstorm? Packaging Tor as an SPK, and https://github.com/sandstorm-io/sandstorm/pull/447 ? Thanks!

paulproteus commented 8 years ago

If you want to have a Sandstorm install that is 100% behind Tor, here is what I would do:

sudo apt install tor or however you like to do that
Tell Tor you want to create a hidden service, and wait for it to tell you the hidden service hostname (e.g. example.onion)
Install Sandstorm, and tell it you want a "full server", hosted at your own domain, example.onion, with wildcard zone being *.example.onion

Then there are two ways forward to limit DNS leakage.

Way 1: Fix Sandstorm issues where the code does DNS lookups

Specifically https://github.com/sandstorm-io/sandstorm/pull/447

Way 2: Constrain Sandstorm so it can't reach the Internet except via a tor-ification proxy

Alternatively, you can limit the Linux VM so that Sandstorm can't communicate with the world, e.g.

iptables -A OUTPUT -o eth0 -m owner --uid-owner sandstorm -j DROP

sudo su - sandstorm -c 'cd /tmp ; wget http://www.google.com/'  # make sure this fails

Configure an HTTP proxy (that routes via Tor) so that Sandstorm can at least auto-update.
See also: https://github.com/sandstorm-io/sandstorm/issues/693

Let me know if that's responsive to your question, @elimisteve .

jacksingleton commented 7 years ago

look into: https://github.com/alecmuffett/eotk