Choice of hosting guidelines in the documentation

globaleaks / globaleaks-whistleblowing-software

GlobaLeaks is a free and open-source whistleblowing software enabling anyone to easily set up and maintain a secure reporting platform.

https://www.globaleaks.org

Choice of hosting guidelines in the documentation #1941

Open ralienpp opened 7 years ago

ralienpp commented 7 years ago

After consulting the existing docs, I found no tips to help someone decide where to host an instance of GlobaLeaks. Suppose I am a journalist in charge of a newspaper and I have no technical expertise; I would look for a techie to delegate this to.

The tech-savvy person will then get back to me with a set of hosting options like these:

headquarters of your newspaper, on a separate computer
headquarters of your newspaper on an existing computer
a data-center abroad
a data-center in your own country
the home of one of the employees
the home of a trusted person abroad, etc

Each of these options has its pros and cons, depending on the type of country you live in and where it lies on the spectrum between a democracy and a totalitarian regime.

Since every technical person ever asked to help in setting up Globaleaks will most likely go through the same mental exercise, I think it is worth adding this to the official doc.

I could give this a shot and I have several questions about it:

where to best keep this doc? (see issue #1026)
what is the preferred format? (a matrix, a flowchart, etc)

I believe it should be something like this, accompanied by some prose where necessary: images duckduckgo com

Or maybe like this choice of torrent trackers: http://imageupload.torrentinvites.org/images/999X54mWAF.png

Basically, it has to be easy to follow and the decision-making process has to be clear to a non-technical person, because they must comprehend the implications of each choice.

NSkelsey commented 7 years ago

+10. I think this is a great idea. Giving individuals the ability to make a sane choice about their hosting options in a clear, concise format sounds great.

The most frequently updated technical documentation lives in the wiki on GitHub. There is an existing threat model document with a matrix that helps answer questions about location anonymity and confidentiality, but it doesn't tangibly answer the question of where to place GlobaLeaks.

Using the right line of questioning for a flowchart is crucial. From a crude first try, it seems that looking at who has physical access to the box gets us close to a suggested setup.

Can you trust: your janitor [y/n]
Can you trust: a local NGO [y/n]
Can you trust: a virtual hosting provider [y/n]
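As a rough illustration of that line of questioning, here is a minimal Python sketch. The mapping from each trusted party to a hosting option is my own assumption for the sake of the example (it is not taken from the threat model document), and the names `TRUST_QUESTIONS` and `candidate_setups` are hypothetical.

```python
# Hypothetical mapping: who has physical access -> the hosting option that
# depends on trusting them. Keys and labels are illustrative assumptions.
TRUST_QUESTIONS = {
    "janitor": "a computer at your own headquarters",
    "local_ngo": "a machine housed by a local NGO",
    "hosting_provider": "a virtual server at a hosting provider",
}


def candidate_setups(answers):
    """Return the hosting options whose physical custodian is trusted.

    `answers` maps a question key to True (y) or False (n).
    A missing answer is treated as "no".
    """
    return [
        option
        for who, option in TRUST_QUESTIONS.items()
        if answers.get(who, False)
    ]


if __name__ == "__main__":
    answers = {"janitor": False, "local_ngo": True, "hosting_provider": True}
    for option in candidate_setups(answers):
        print("Consider:", option)
```

A real flowchart would need more than three yes/no answers, but the shape stays the same: each answer rules hosting options in or out.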

Further, I would review the guidelines for setting up a whistleblowing initiative and see if there is a good place to fit it in there as well.

ralienpp commented 7 years ago

Here is my first take on this, I hope it is sufficient to start a conversation and attract the attention of others. From my perspective, there are 2 main questions:

where to host? (geographically)
how to host? (separate machine, an existing system; a virtualized infrastructure)

I should point out that I wrote this as someone who comes from a country whose government frequently bends the law and uses it as a tool to shut opponents up and promote its own agenda. This is why my questions (especially on the local-vs-abroad front) boil down to "how evil is your government?". For other scenarios (as discussed in #1942), the reasoning will probably be different.

The source is in GraphML format, you can edit it with yEd (free as in beer, cross-platform), just save this thing as a .graphml file: https://pastebin.com/Ueak5TjZ

[Image: globaleaks-hosting-choice]

NSkelsey commented 7 years ago

Great, I will need to get to a machine to work with GraphML, which will take a day or two.

Note that this chart is prefaced with the choice that the operator wants to use a Tor onion service. Hosting a www site on a home internet connection is not an option, and most NGOs will typically place their infra at hosting providers rather than physically on site anyway (for QoS of websites, mail, and so on).

Additionally, access out of an office's network usually involves an extra step if the IT team has a strict network security policy, which can boil down to them deciding whether they will accept an outgoing Tor connection - IIRC the port is 9050.

ralienpp commented 7 years ago

Feel free to print it out and make corrections with pen and pencil, or write pseudocode; I will then incorporate that into the chart. In other words, don't go out on a limb to adapt to my choice of GraphML.

It has now occurred to me that perhaps it would be better to use Graphviz with PlantText, as this would enable us to keep the flowchart in git along with the source code and track its history. Paste the code below into https://www.planttext.com/ to see what I mean:

@startuml
digraph G {
    label="Choice of a hosting method";

    node[shape="box", style="rounded"]
       start; end;
    node[shape="parallelogram", style=""]
       message; input;
    node[shape="diamond", style=""]
       if_valid;

    start -> input;
    input -> if_valid;
    if_valid -> message[label="no"];
    if_valid -> end[label="yes"];
    message -> input;     

    if_valid[label="Is input\nvalid?"]
    message[label="Show\nmessage"]
    input[label="Prompt\nfor input"]
}
@enduml

If you agree this approach is more reasonable, I will rewrite it in this format. With PlantText (a front-end for PlantUML) we can do rapid prototyping and get instant results in graphical format. The downside is that the output is not that pretty.
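To show how the text-based approach could work in practice, here is a hedged Python sketch that generates DOT text from a plain list of yes/no questions, so only the question list would need to be edited and reviewed in git. The function name `to_dot`, the sample questions, and the outcomes are all hypothetical, and the sketch assumes that labels contain no double quotes.

```python
def to_dot(title, questions, fallback):
    """Render a chain of yes/no questions as Graphviz DOT text.

    `questions` is a list of (question, outcome_if_yes) pairs. Answering
    "no" moves on to the next question; a "no" on the last question leads
    to the `fallback` outcome. Labels must not contain double quotes.
    """
    lines = [
        "digraph G {",
        f'    label="{title}";',
        '    start[shape="box", style="rounded"];',
    ]
    prev, edge_label = "start", ""
    for i, (question, outcome) in enumerate(questions):
        q, o = f"q{i}", f"o{i}"
        lines.append(f'    {q}[shape="diamond", label="{question}"];')
        lines.append(f'    {o}[shape="box", label="{outcome}"];')
        lines.append(f"    {prev} -> {q}{edge_label};")
        lines.append(f'    {q} -> {o}[label="yes"];')
        # Every later edge out of a question node is the "no" branch.
        prev, edge_label = q, '[label="no"]'
    lines.append(f'    fallback[shape="box", label="{fallback}"];')
    lines.append(f"    {prev} -> fallback{edge_label};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Sample content is purely illustrative, not a hosting recommendation.
    sample = [
        ("Can you trust a local NGO?", "Host at the NGO"),
        ("Can you trust a hosting provider?", "Rent a virtual server"),
    ]
    print(to_dot("Choice of a hosting method", sample, "Ask for help"))
```

The printed text can be wrapped in `@startuml` / `@enduml`, like the snippet above, before pasting it into PlantText.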

flipchan commented 7 years ago

You don't need to run your stuff on an offshore VPS account. You could buy one VPS/shell account, put the front end behind Cloudflare so that your site's IP doesn't show, and tell no one where it's hosted. You could always pay in Bitcoin :)

ralienpp commented 7 years ago

I don't understand how Cloudflare would be relevant here, because the site is meant to be accessed via Tor - so people will be connecting to you directly (through Tor, of course, but not through Cloudflare). Besides that, Cloudflare knows who you are, and that's too much information for them.

Basically, whistleblowers now depend on Cloudflare's good will.

The need to avoid domestic hosting services comes from the fact that, to my knowledge, when you host an onion server, the machine it runs on makes an outgoing connection to a Tor node (the list is public) at a port number like 9001 (others can be used too, they're in the list of Tor nodes).

If you live in a very rough dictatorship, the government can leverage their power to force ISPs or data center companies to hand them a list of all machines that make outgoing connections to those addresses or port numbers. If there are millions of people using Tor - that won't be very helpful; but if there are few - then this narrows down the pool of candidates. Alternatively, they could leverage their authority to temporarily disconnect some data centers, which will cause downtime for the onion site and enable them to use correlations to narrow down the circle of suspects.
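To make the narrowing-down step concrete, here is a minimal Python sketch of the reasoning, based on my understanding as described above. Everything in it is made up for illustration: the addresses come from the reserved documentation ranges, and the names `PUBLIC_RELAYS`, `FLOWS`, and `machines_talking_to_tor` are hypothetical.

```python
# Illustrative data only: all addresses are from documentation ranges.
# A (made-up) excerpt of the public relay list, as (IP, port) pairs.
PUBLIC_RELAYS = {("198.51.100.7", 9001), ("198.51.100.8", 443)}

# Hypothetical connection records an ISP or data center could be forced
# to hand over: (source machine, destination IP, destination port).
FLOWS = [
    ("192.0.2.10", "198.51.100.7", 9001),  # talks to a listed relay
    ("192.0.2.11", "203.0.113.5", 443),    # ordinary HTTPS traffic
    ("192.0.2.12", "198.51.100.8", 443),   # talks to a listed relay
    ("192.0.2.10", "203.0.113.9", 80),     # ordinary HTTP traffic
]


def machines_talking_to_tor(flows, relays):
    """Return the source machines that connected to any listed relay."""
    return {src for src, dst, port in flows if (dst, port) in relays}


if __name__ == "__main__":
    suspects = machines_talking_to_tor(FLOWS, PUBLIC_RELAYS)
    print(sorted(suspects))  # ['192.0.2.10', '192.0.2.12']

    # Correlation step: if the onion site went down while one data center
    # was disconnected, only the suspects housed there remain candidates.
    disconnected_data_center = {"192.0.2.12", "192.0.2.99"}
    print(sorted(suspects & disconnected_data_center))  # ['192.0.2.12']
```

With millions of Tor users the first set would be too large to act on; with only a handful, two cheap set operations are enough to single out one machine.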

So, depending on the degree of despotism you have in your country, a domestically hosted server can be a bad idea.

If my understanding of how this works is not right, please correct me.

NSkelsey commented 7 years ago

@ralienpp this is the idea. There are many internet freedom projects whose infrastructure is hosted in Northern Europe for exactly this reason.

Tor by default is not steganographic, meaning it does not attempt to hide a user's connection to the network. It only works to disguise the traffic within the network from discernment. Nowadays they have built many bridges and obfuscating proxies to help clients route around network blocks, but as far as I know that traffic can still be reliably identified as Tor traffic with deep packet inspection.