EFForg / starttls-backend

STARTTLS Everywhere web backend and checker
https://starttls-everywhere.org/

Note: The STARTTLS Everywhere project is not currently being maintained. The information and resources in this repository may be outdated. See this post for more information. If you are impacted by this news, or rely on the STARTTLS Policy List, you can read this post for a deeper dive.

STARTTLS Everywhere Backend API

starttls-backend is the JSON backend for starttls-everywhere.org. It provides endpoints to run security checks against email domains and to manage the status of those domains on EFF's STARTTLS Everywhere policy list.

Setup

  1. Install go and postgres.
  2. Download the project and copy the configuration file:
    go get github.com/EFForg/starttls-backend
    cd $GOPATH/src/github.com/EFForg/starttls-backend
    cp .env.example .env
    cp .env.test.example .env.test
  3. Edit .env and .env.test with your postgres credentials and any other changes.
  4. Ensure postgres is running, then run db/scripts/init_tables.sql in the appropriate postgres DBs in order to initialize your development and test databases.
  5. Build the scanner and start serving requests:
    go build
    ./starttls-backend

Via Docker

cp .env.example .env
cp .env.test.example .env.test
docker-compose build
docker-compose up

To automatically run database migrations on container start, set DB_MIGRATE=true in the .env file.

Testing

Test all packages in this repo with

go test -v ./...

The main and db packages contain integration tests that require a successful connection to the Postgres database. The remaining packages do not require the database to pass tests.

Configuration

No-scan domains

In case of complaints or abuse, we may not want to continually scan some domains. You can set the environment variable DOMAIN_BLACKLIST to point to a file with a list of newline-separated domains. Attempting to scan those domains from the public-facing website will result in error codes.

Scan API

Our API objects can look a bit complicated! There's lots of information contained in a TLS scan. To request a scan:

POST /api/scan
  { "domain": "example.com" }

Let's break down exactly what each part of this giant nested response means. All API responses, not just scans, are wrapped in a JSON object, like:

{
    status_code: 200,
    message: "",
    response: <response data>
}

Or even:

{
    status_code: 400,
    message: "query parameter domain not specified",
    response: {}
}

The status_code field always matches the HTTP status of the response. message provides more context on why a request failed.
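
One way to handle this wrapper on the client side is to decode it first and defer parsing of the inner payload. The sketch below assumes the field names shown in the examples above; the type apiResponse and the helper decodeEnvelope are illustrative names, and treating every non-2xx status_code as an error is a choice made for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apiResponse mirrors the wrapper object shown above. The type name is
// hypothetical and is not taken from the backend's source.
type apiResponse struct {
	StatusCode int             `json:"status_code"`
	Message    string          `json:"message"`
	Response   json.RawMessage `json:"response"`
}

// decodeEnvelope returns the raw inner payload, or an error built from
// status_code and message when the status is outside the 2xx range.
func decodeEnvelope(data []byte) (json.RawMessage, error) {
	var env apiResponse
	if err := json.Unmarshal(data, &env); err != nil {
		return nil, fmt.Errorf("decoding envelope: %w", err)
	}
	if env.StatusCode < 200 || env.StatusCode >= 300 {
		return nil, fmt.Errorf("API error %d: %s", env.StatusCode, env.Message)
	}
	return env.Response, nil
}

func main() {
	failed := []byte(`{"status_code": 400, "message": "query parameter domain not specified", "response": {}}`)
	if _, err := decodeEnvelope(failed); err != nil {
		fmt.Println(err)
	}
}
```

Keeping response as json.RawMessage lets the caller pick the right payload type per endpoint after the status has been checked.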

Scan responses

Here's an abbreviated scan response. The full objects carry extra information that helps describe the errors we encountered.

{
    domain: "example.com",
    scandata: {
        status: 0,
        results: {  // Individual hostname check results
            "mx.example.com": {
                "status": 0,
                "checks": {
                    "connectivity": { "status": 0 },
                    "certificate": { "status": 0 },
                    "starttls": { "status": 0 },
                    "version": { "status": 0 },
                }
            },
            "dummy.example.com": {
                "status": 3,
                "checks": {
                    "connectivity": {
                        "status": 3,
                        "messages": [ "Error: Could not establish connection" ]
                    },
                }
            },
        },
        preferred_hostnames: ["mx.example.com"], // Hostnames we were able to connect to
        extra_results: {"policylist": { "status": 0 }},
    },
    timestamp: 0,
    version: 1,
}

The meat of the response is in scandata, which is a JSON-ification of the DomainResult structure returned from the checker package.
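
For client code, the abbreviated response above could be decoded with structs along these lines. This is a sketch modeled only on the fields visible in the sample; the type names are hypothetical and are not the checker package's actual definitions. timestamp is kept as raw JSON because the sample does not pin down its format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// These types mirror the abbreviated sample above; they are not the
// checker package's real types.
type checkResult struct {
	Status   int      `json:"status"`
	Messages []string `json:"messages,omitempty"`
}

type hostnameResult struct {
	Status int                    `json:"status"`
	Checks map[string]checkResult `json:"checks"`
}

type domainResult struct {
	Status             int                       `json:"status"`
	Results            map[string]hostnameResult `json:"results"`
	PreferredHostnames []string                  `json:"preferred_hostnames"`
	ExtraResults       map[string]checkResult    `json:"extra_results"`
}

type scan struct {
	Domain    string          `json:"domain"`
	Data      domainResult    `json:"scandata"`
	Timestamp json.RawMessage `json:"timestamp"` // format not assumed here
	Version   int             `json:"version"`
}

// sampleScan is a strict-JSON rendering of part of the sample above.
const sampleScan = `{
  "domain": "example.com",
  "scandata": {
    "status": 0,
    "results": {
      "mx.example.com": {"status": 0, "checks": {"connectivity": {"status": 0}}},
      "dummy.example.com": {"status": 3, "checks": {"connectivity": {
        "status": 3, "messages": ["Error: Could not establish connection"]}}}
    },
    "preferred_hostnames": ["mx.example.com"],
    "extra_results": {"policylist": {"status": 0}}
  },
  "timestamp": 0,
  "version": 1
}`

// parseScan decodes a scan payload into the sketch types above.
func parseScan(data []byte) (scan, error) {
	var s scan
	err := json.Unmarshal(data, &s)
	return s, err
}

func main() {
	s, err := parseScan([]byte(sampleScan))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(s.Domain, s.Data.PreferredHostnames)
}
```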

Domain results

Here's a quick synopsis of the fields you see in a domain response:

Hostname results

Here's a sample:

{
    "status": 0,
    "checks": {
        "connectivity": { "status": 0 },
        "certificate": {
            "status": 2,
            "messages": ["Hostname doesn't match any name in certificate",
                         "Certificate root is not trusted"]
        },
        "starttls": { "status": 0 },
        "version": { "status": 0 },
    }
}

What do we scan for?

Right now, these are the checks we perform.

Hostname-level scans

These scans are performed for every hostname; that is, we run these checks against every MX we find for the given domain.

Domain-level scans

These scans are performed for the domain itself.

Rate-limiting, caching, and no-scan lists

We rate-limit several endpoints to prevent abuse and reduce load on our servers. By default, scan requests are cached. If you're actively updating your servers and want to check whether they're passing, we recommend waiting a few minutes and re-scanning.

In case of complaints or abuse, we may not want to continually scan some domains. Domain owners can also elect to opt out of automated scans from this service.