This is proof-of-concept code and is not intended for production use. The protocol details are not yet finalized.
To better understand the context of this research and the previous steps that led to it, read the following blog posts:
Another PoC server implementation in Lua is available in the securedrop-protocol-server-resty repository.
What is implemented here is a small-scale, self-contained, anonymous message box, where anonymous parties (sources) can contact and receive replies from trusted parties (journalists). The whole protocol does not require server authentication, and every API call is independent and self-contained. Message submission and retrieval are completely symmetric for both sources and journalists, making the individual HTTP requests potentially indistinguishable. The server does not have information about message senders, receivers, the number of sources or login times, because there are no accounts, and therefore, no logins.
Nonetheless, the server must not reveal information about its internal state to external parties (such as generic internet users or sources), and must not allow those parties to enumerate or discern any information about messages stored on the server. To satisfy this constraint, a special message-fetching mechanism is implemented, where only the intended recipients are able to discover if they have pending messages.
A preliminary cryptographic audit was performed by mmaker in December 2023. See https://github.com/freedomofpress/securedrop-protocol/issues/36.
In `commons.py` there are the following configuration values, which are global for all components, even though not all parties need all of them.
Variable | Value | Components | Description |
---|---|---|---|
SERVER | 127.0.0.1:5000 | source, journalist | The URL the Flask server listens on; used by both the journalist and the source clients. |
DIR | keys/ | server, source, journalist | The folder where everybody will load the keys from. There is no separation for demo simplicity but in an actual implementation everybody will only have their keys and the required public one to ensure the trust chain. |
UPLOADS | files/ | server | The folder where the Flask server will store uploaded files |
JOURNALISTS | 10 | server, source | How many journalists do we create and enroll. In general, this is realistic; in current SecureDrop usage it is typically a smaller number. For demo purposes everybody knows this, in a real scenario it would not be needed. |
ONETIMEKEYS | 30 | journalist | How many ephemeral keys each journalist creates, signs and uploads when required. |
MAX_MESSAGES | 500 | server | How many potential messages the server sends to each party when they try to fetch messages. This basically must be more than the messages in the database, otherwise we need to develop a mechanism to group messages adding some bits of metadata. |
CHUNK | 512 * 1024 | source | The base size of every part which attachments are split into or padded to. This is not the actual size on disk; that will be a bit larger depending on the nacl SecretBox implementation. |
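For orientation, the table above would correspond to module-level constants along the following lines. This is a sketch reconstructed from the table only; the layout of the actual `commons.py` is an assumption, and the real module also contains shared code that is not shown here.

```python
# Sketch of the shared configuration, reconstructed from the table above.
# The real commons.py may organize these values differently.
SERVER = "127.0.0.1:5000"   # where the Flask server listens
DIR = "keys/"               # folder all parties load keys from (demo only)
UPLOADS = "files/"          # folder where the server stores uploaded chunks
JOURNALISTS = 10            # number of journalists created and enrolled
ONETIMEKEYS = 30            # ephemeral keys each journalist uploads per batch
MAX_MESSAGES = 500          # fixed number of entries returned by every fetch
CHUNK = 512 * 1024          # base size, in bytes, of every attachment part
```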
Install dependencies and create the virtual environment.
sudo dnf install redis
sudo systemctl start redis
python3 -m virtualenv .venv
source .venv/bin/activate
pip3 install -r requirements.txt
Generate the FPF root key, the intermediate key, and the journalists' long term keys, and sign them all hierarchically.
python3 pki.py
Run the server:
FLASK_DEBUG=1 flask --app server run
Impersonate the journalists and generate ephemeral keys for each of them. Upload all the public keys and their signature to the server.
for i in $(seq 0 9); do python3 journalist.py -j $i -a upload_keys; done;
Call/caller charts can be generated with `make docs`.
bash demo.sh
The demo script will clean past keys and files, flush Redis, generate a new PKI, start the server, generate and upload journalists and simulate submissions and replies from different sources/journalists.
The code in this repository implements three components:
In this proof-of-concept implementation, the components are not fully separated; for example, `commons.py` includes code and configuration shared between all components.
Data is persisted in the following ways:
`pki.py` is stored on-disk under `keys/`. It is accessed there by the journalist client, which uploads public keys to the server
`files/`
`files/` directory
`downloads/`
# python3 source.py -h
usage: source.py [-h] [-p PASSPHRASE] -a {fetch,read,reply,submit,delete} [-i ID] [-m MESSAGE] [-f FILES [FILES ...]]
options:
-h, --help show this help message and exit
-p PASSPHRASE, --passphrase PASSPHRASE
Source passphrase if returning
-a {fetch,read,reply,submit,delete}, --action {fetch,read,reply,submit,delete}
Action to perform
-i ID, --id ID Message id
-m MESSAGE, --message MESSAGE
Plaintext message content for submissions or replies
-f FILES [FILES ...], --files FILES [FILES ...]
List of local files to submit
# python3 source.py -a submit -m "My first contact message with a newsroom :)"
[+] New submission passphrase: 23a90f6499c5f3bc630e7103a4e63c131a8248c1ae5223541660b7bcbda8b2a9
# python3 source.py -a submit -m "My first contact message with a newsroom, plus evidence and a supporting video :)" -f /tmp/secret_files/file1.mkv /tmp/secret_files/file2.zip
[+] New submission passphrase: c2cf422563cd2dc2813150faf2f40cf6c2032e3be6d57d1cd4737c70925743f6
# python3 source.py -p 23a90f6499c5f3bc630e7103a4e63c131a8248c1ae5223541660b7bcbda8b2a9 -a fetch
[+] Found 1 message(s)
de55e92ca3d89de37855cea52e77c182111ca3fd00cf623a11c1f41ceb2a19ca
# python3 source.py -p 23a90f6499c5f3bc630e7103a4e63c131a8248c1ae5223541660b7bcbda8b2a9 -a read -i de55e92ca3d89de37855cea52e77c182111ca3fd00cf623a11c1f41ceb2a19ca
[+] Successfully decrypted message de55e92ca3d89de37855cea52e77c182111ca3fd00cf623a11c1f41ceb2a19ca
ID: de55e92ca3d89de37855cea52e77c182111ca3fd00cf623a11c1f41ceb2a19ca
From: a1eb055608e169d04392607a79a3bf8ac4ccfc9e0d3f5056941f31be78a12be1
Date: 2023-01-23 23:42:14
Text: This is a reply to the message without attachments, it is identified only by the id
# python3 source.py -p 23a90f6499c5f3bc630e7103a4e63c131a8248c1ae5223541660b7bcbda8b2a9 -a reply -i de55e92ca3d89de37855cea52e77c182111ca3fd00cf623a11c1f41ceb2a19ca -m "This is a second source to journalist reply"
# python3 source.py -p 23a90f6499c5f3bc630e7103a4e63c131a8248c1ae5223541660b7bcbda8b2a9 -a delete -i de55e92ca3d89de37855cea52e77c182111ca3fd00cf623a11c1f41ceb2a19ca
[+] Message de55e92ca3d89de37855cea52e77c182111ca3fd00cf623a11c1f41ceb2a19ca deleted
# python3 journalist.py -h
usage: journalist.py [-h] -j [0, 9] [-a {upload_keys,fetch,read,reply,delete}] [-i ID] [-m MESSAGE]
options:
-h, --help show this help message and exit
-j [0, 9], --journalist [0, 9]
Journalist number
-a {upload_keys,fetch,read,reply,delete}, --action {upload_keys,fetch,read,reply,delete}
Action to perform
-i ID, --id ID Message id
-m MESSAGE, --message MESSAGE
Plaintext message content for replies
# python3 journalist.py -j 7 -a fetch
[+] Found 2 message(s)
0358306e106d1d9e0449e8e35a59c37c41b28a5e6630b88360738f5989da501c
1216789eab54869259e168b02825151b665f04b0b9f01f654c913e3bbea1f627
# python3 journalist.py -j 7 -a read -i 1216789eab54869259e168b02825151b665f04b0b9f01f654c913e3bbea1f627
[+] Successfully decrypted message 1216789eab54869259e168b02825151b665f04b0b9f01f654c913e3bbea1f627
ID: 1216789eab54869259e168b02825151b665f04b0b9f01f654c913e3bbea1f627
Date: 2023-01-23 23:37:15
Text: My first contact message with a newsroom :)
# python3 journalist.py -j 7 -a read -i 0358306e106d1d9e0449e8e35a59c37c41b28a5e6630b88360738f5989da501c
[+] Successfully decrypted message 0358306e106d1d9e0449e8e35a59c37c41b28a5e6630b88360738f5989da501c
ID: 0358306e106d1d9e0449e8e35a59c37c41b28a5e6630b88360738f5989da501c
Date: 2023-01-23 23:38:27
Attachment: name=file1.mkv;size=1562624;parts_count=3
Attachment: name=file2.zip;size=93849;parts_count=1
Text: My first contact message with a newsroom with collected evidences and a supporting video :)
# python3 journalist.py -j 7 -a reply -i 1216789eab54869259e168b02825151b665f04b0b9f01f654c913e3bbea1f627 -m "This is a reply to the message without attachments, it is identified only by the id"
# python3 journalist.py -j 7 -a delete -i 1216789eab54869259e168b02825151b665f04b0b9f01f654c913e3bbea1f627
[+] Message 1216789eab54869259e168b02825151b665f04b0b9f01f654c913e3bbea1f627 deleted
FPF
Newsroom
Server
Journalist
Source:
Submission:
Formula | Description |
---|---|
c = Enc(k, m) | Authenticated encryption of message m to ciphertext c using symmetric key k |
m = Dec(k, c) | Authenticated decryption of ciphertext c to message m using symmetric key k |
h = Hash(m) | Hash message m to hash h |
k = KDF(m) | Derive a key k from message m |
SK = Gen(s) | Generate a private key SK using seed s; if the seed is empty, generation is securely random |
PK = GetPub(SK) | Get public key PK from secret key SK |
sigsigner(targetPK) = Sign(signerSK, targetPK) | Create signature sig using signerSK as the signer key and targetPK as the signed public key |
true/false = Verify(signerPK,sigsigner(targetPK)) | Verify signature sig of public key targetPK using signerPK |
k = DH(ASK, BPK) == DH(APK, BSK) | Generate shared key k using a key agreement primitive |
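The key agreement row is the one most of the protocol relies on, so a small illustration may help. The sketch below uses textbook finite-field Diffie-Hellman with Python's built-in `pow`, purely to show that both sides arrive at the same `k`. The modulus `P`, the base `G` and the function names are illustrative assumptions; they are toy choices, not secure parameters, and not the primitives used by this repository.

```python
import secrets

# Toy parameters for illustration only (NOT secure, NOT what the repo uses):
# a Mersenne prime modulus and a small base.
P = 2**127 - 1
G = 3

def gen() -> int:
    """SK = Gen(): pick a random private exponent in [2, P-2]."""
    return secrets.randbelow(P - 3) + 2

def get_pub(sk: int) -> int:
    """PK = GetPub(SK): derive the public value from the private exponent."""
    return pow(G, sk, P)

def dh(sk: int, pk: int) -> int:
    """k = DH(SK, PK): combine our private key with the peer's public key."""
    return pow(pk, sk, P)

a_sk, b_sk = gen(), gen()
a_pk, b_pk = get_pub(a_sk), get_pub(b_sk)

# Both parties derive the same shared value without exchanging private keys.
assert dh(a_sk, b_pk) == dh(b_sk, a_pk)
```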
FPF:
Operation | Description |
---|---|
FPFSK = Gen() | FPF generates a random private key (we might add HSM requirements, or certificate style PKI, i.e.: self signing some attributes) |
FPFPK = GetPub(FPFSK) | Derive the corresponding public key |
FPF pins FPFPK in the Journalist client, in the Source client and in the Server code.
Newsroom:
Operation | Description |
---|---|
NRSK = Gen() | Newsroom generates a random private key with similar security to the FPF one |
NRPK = GetPub(NRSK) | Derive the corresponding public key |
sigFPF(NRPK) = Sign(FPFSK, NRPK) | Newsroom sends a CSR or the public key to FPF; FPF validates manually/physically before signing |
Newsroom pins NRPK and sigFPF(NRPK) in the Server during initial server setup.
Journalist [0-i]:
Operation | Description |
---|---|
JSK = Gen() | Journalist generates the long-term signing key randomly |
JPK = GetPub(JSK) | Derive the corresponding public key |
sigNR(JPK) = Sign(NRSK, JPK) | Journalist sends a CSR or the public key to the Newsroom admin/managers for signing |
JCSK = Gen() | Journalist generates the long-term message-fetching key randomly (TODO: this key could be rotated often) |
JCPK = GetPub(JCSK) | Derive the corresponding public key |
sigJ(JCPK) = Sign(JSK, JCPK) | Journalist signs the long-term message-fetching key with the long-term signing key |
[0-n]JESK = Gen() | Journalist generates a number n of ephemeral key agreement keys randomly |
[0-n]JEPK = GetPub([0-n]JESK) | Derive the corresponding public keys |
[0-n]sigJ([0-n]JEPK) = Sign(JSK, [0-n]JEPK) | Journalist individually signs the ephemeral key agreement keys (TODO: add ephemeral key expiration) |
Journalist sends JPK, sigNR(JPK), JCPK, sigJ(JCPK), [0-n]JEPK and [0-n]sigJ([0-n]JEPK) to Server which verifies and publishes them.
Source [0-j]:
Operation | Description |
---|---|
PW = Gen() | Source generates a secure passphrase which is the only state available to clients |
SSK = Gen(KDF(encryption_salt || PW)) | Source deterministically generates the long-term key agreement key-pair using a specific hard-coded salt |
SPK = GetPub(SSK) | Derive the corresponding public key |
SCSK = Gen(KDF(fetching_salt || PW)) | Source deterministically generates the long-term fetching key-pair using a specific hard-coded salt |
SCPK = GetPub(SCSK) | Derive the corresponding public key |
Source does not need to publish anything until the first submission is sent.
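Because the passphrase is the only state a source keeps, both source key pairs have to be re-derivable from it. A minimal sketch of that derivation step follows, using only the standard library. The salt values, the PBKDF2 parameters and the function names are assumptions made for illustration; the document only states that the salts are specific and hard-coded, and the repository's actual KDF may differ. The resulting 32-byte seeds would then be fed to `Gen(s)`.

```python
import hashlib

# Hypothetical salts: the protocol only says they are fixed, hard-coded values.
ENCRYPTION_SALT = b"example-encryption-salt"
FETCHING_SALT = b"example-fetching-salt"

def kdf(material: bytes) -> bytes:
    """Stand-in KDF (assumption): PBKDF2-HMAC-SHA256 producing a 32-byte seed."""
    return hashlib.pbkdf2_hmac("sha256", material, b"kdf-sketch", 10_000, dklen=32)

def source_seeds(passphrase: bytes) -> tuple:
    """Return (seed for SSK, seed for SCSK), both derived from the passphrase."""
    ssk_seed = kdf(ENCRYPTION_SALT + passphrase)   # KDF(encryption_salt || PW)
    scsk_seed = kdf(FETCHING_SALT + passphrase)    # KDF(fetching_salt || PW)
    return ssk_seed, scsk_seed
```

Calling `source_seeds` twice with the same passphrase yields the same pair of seeds, which is what lets a returning source recover its keys, while the two different salts keep the key agreement key and the fetching key independent of each other.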
Only a source can initiate a conversation; there are no other choices as sources are effectively unknown until they initiate contact first.
See the "Flow Chart" section for a summary of the asymmetry in this protocol.
`commons.CHUNK`. Any chunk smaller is padded to `commons.CHUNK` size.
(`file_id`)
(`file_id` -> `file`)
(`file_id`) to message m
(`message_public_key`)
(`message_ciphertext`)
(`message_gdh`)
(`message_id`) and stores imid -> (ic,iMEPK,imgdh) (`message_id` -> (`message_ciphertext`, `message_public_key`, `message_gdh`))
(`message_id` -> (`message_gdh`, `message_public_key`)):
[`commons.MAX_MESSAGES - i`] random decoys [0-j]decoy_pmgdh and [0-j]decoy_enc_mid
`commons.MAX_MESSAGES` (i+j) tuples of ([0-i]pmgdh,[0-i]enc_mid) U ([0-j]decoy_pmgdh,[0-j]decoy_enc_mid)
(`n=commons.MAX_MESSAGES`)
(`n=commons.MAX_MESSAGES`)
(`message_id` -> (`message_ciphertext`, `message_public_key`))
(`file_id` -> `file`)
(`message_gdh`)
(`message_id`) and stores mid -> (c,MEPK,mgdh) (`message_id` -> (`message_ciphertext`, `message_public_key`, `message_gdh`))
(`message_id` -> (`message_ciphertext`, `message_public_key`))

Source replies work the exact same way as a first submission, except that the source is already known to the Journalist. As an additional difference, a Journalist might choose to attach their (and possibly others') keys in the reply, so that Source does not have to fetch those from the server as in a first submission.
For simplicity, in this chart, messages are sent to a single Journalist rather than to all journalists enrolled with a given newsroom, and the attachment submission and retrieval procedure is omitted.
Observe the asymmetry in the client-side operations:
Routine | Journalist fetch and decrypt | Source fetch and decrypt |
---|---|---|
Leg | message_ciphertext,MEPK | message_ciphertext,MEPK |
Step 1. | k = DH(MEPK,iJESK) | k = DH(MEPK,SSK) |
Step 2. | Discard(iJESK) | |
Step 3. | SPK,SCPK,m = Dec(k,message_ciphertext) | mJEPK,JCPK,m = Dec(k,message_ciphertext) |
No endpoints require authentication or sessions. The only data store is Redis and is schema-less. Encrypted file chunks are stored to disk. No database bootstrap is required.
Legend:
JSON Name | Value |
---|---|
count | Number of returned enrolled Journalists |
journalist_key | base64(JPK) |
journalist_sig | base64(sigNR(JPK)) |
journalist_fetching_key | base64(JCPK) |
journalist_fetching_sig | base64(sigJ(JCPK)) |
Adds Newsroom signed Journalist to the Server.
curl -X POST -H "Content-Type: application/json" "http://127.0.0.1:5000/journalists" --data
{
"journalist_key": <journalist_key>,
"journalist_sig": <journalist_sig>,
"journalist_fetching_key": <journalist_fetching_key>,
"journalist_fetching_sig": <journalist_fetching_sig>
}
200 OK
The server checks for proper signature using NRPK. If both signatures are valid, the request fields are added to the `journalists` Redis set.
Gets the journalists enrolled in Newsroom and published in the Server. The Journalist UID is a hex encoded hash of the Journalist long-term signing key.
curl -X GET "http://127.0.0.1:5000/journalists"
200 OK
{
"count": <count>,
"journalists": [
{
"journalist_fetching_key": <journalist_fetching_key>,
"journalist_fetching_sig": <journalist_fetching_sig>,
"journalist_key": <journalist_key>,
"journalist_sig": <journalist_sig>,
},
...
],
"status": "OK"
}
At this point Source must have a verified NRPK and must verify both sigNR(JPK) and sigJ(JCPK).
Not implemented yet. A Newsroom must be able to remove Journalists.
Legend:
JSON Name | Value |
---|---|
count | Number of returned ephemeral keys. It should match the number of Journalists. If it does not, a specific Journalist bucket might be out of keys. |
ephemeral_key | base64(JEPK) |
ephemeral_sig | base64(sigJ(JEPK)) |
journalist_key | base64(JPK) |
Adds n Journalist signed ephemeral key agreement keys to Server.
The keys are stored in a per-Journalist Redis set, whose key is `journalist:<hex(public_key)>`. In the demo implementation, the number of ephemeral keys to generate and upload each time is `commons.ONETIMEKEYS`.
curl -X POST -H "Content-Type: application/json" "http://127.0.0.1:5000/ephemeral_keys" --data
{
"journalist_key": <journalist_key>,
"ephemeral_keys": [
{
"ephemeral_key": <ephemeral_key>,
"ephemeral_sig": <ephemeral_sig>
},
...
]
}
200 OK
{
"status": "OK"
}
The server pops a random ephemeral_key from every enrolled journalist bucket and returns it. The `pop` operation effectively removes the returned keys from the corresponding Journalist bucket.
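The pop semantics can be illustrated without a running Redis instance. The sketch below models each Journalist bucket as an in-memory Python set and removes one random entry per bucket, which is roughly what a Redis `SPOP` on each set would do. The function name and the data layout (a dict of sets of `(ephemeral_key, ephemeral_sig)` pairs) are assumptions for illustration, not the server's actual code.

```python
import random

def pop_ephemeral_keys(buckets: dict) -> dict:
    """Pop one random (key, sig) pair from every non-empty journalist bucket.

    `buckets` maps journalist_key -> set of (ephemeral_key, ephemeral_sig).
    The popped pairs are removed, so each key is handed out only once.
    """
    rng = random.SystemRandom()
    popped = []
    for journalist_key, bucket in buckets.items():
        if not bucket:
            continue  # exhausted bucket: this journalist is absent from the reply
        ephemeral_key, ephemeral_sig = rng.choice(sorted(bucket))
        bucket.remove((ephemeral_key, ephemeral_sig))
        popped.append({
            "ephemeral_key": ephemeral_key,
            "ephemeral_sig": ephemeral_sig,
            "journalist_key": journalist_key,
        })
    return {"count": len(popped), "ephemeral_keys": popped, "status": "OK"}
```

When a bucket is empty, the returned `count` is lower than the number of enrolled Journalists, which is the condition a client can use to notice that a Journalist has run out of keys.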
curl -X GET http://127.0.0.1:5000/ephemeral_keys
200 OK
{
"count": <count>,
"ephemeral_keys": [
{
"ephemeral_key": <ephemeral_key>,
"ephemeral_sig": <ephemeral_sig>,
"journalist_key": <journalist_key>
},
...
],
"status": "OK"
}
At this point Source must have verified all the [0-i]JPK and can thus verify all the corresponding [0-n]sigJ([0-n]JEPK).
Not implemented yet. A Journalist shall be able to revoke keys from the server.
Legend:
JSON Name | Value |
---|---|
count | Number of returned potential messages. It must always be greater than the number of messages on the server, and is equal to commons.MAX_MESSAGES so that it is the same for every request and does not leak the number of messages on the server. |
messages | (base64(pmgdh),base64(enc_mid)) |
The server sends all the mixed group Diffie-Hellman shares, plus the encrypted message id of the corresponding message. Each gdh is paired with its enc.
curl -X GET http://127.0.0.1:5000/fetch
200 OK
{
"count": <commons.MAX_MESSAGES>,
"messages": [
{
"gdh": <share_for_group_DH1>,
"enc": <encrypted_message_id1>,
},
{
"gdh": <share_for_group_DH2>,
"enc": <encrypted_message_id2>,
}
...
<commons.MAX_MESSAGES>
],
"status": "OK"
}
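The constant-size response can be sketched as a padding step: whatever the number of real entries, the list is topped up with random decoys and shuffled before being returned. The snippet below is a simplified illustration. The function name and the byte lengths `gdh_len` and `enc_len` are hypothetical, and plain random bytes are used for the decoys, whereas a real implementation would need decoys that are indistinguishable from genuine group elements and ciphertexts.

```python
import os
import random

MAX_MESSAGES = 500  # mirrors commons.MAX_MESSAGES

def pad_with_decoys(real, gdh_len=32, enc_len=64):
    """Return exactly MAX_MESSAGES (gdh, enc) pairs: real ones plus decoys, shuffled."""
    if len(real) > MAX_MESSAGES:
        raise ValueError("more stored messages than MAX_MESSAGES")
    # One random decoy pair for every missing slot.
    decoys = [
        (os.urandom(gdh_len), os.urandom(enc_len))
        for _ in range(MAX_MESSAGES - len(real))
    ]
    pairs = list(real) + decoys
    # Shuffle so that position in the list reveals nothing about real vs decoy.
    random.SystemRandom().shuffle(pairs)
    return pairs
```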
Legend:
JSON Name | Value |
---|---|
message_id | Randomly generated, unique per-message id. |
message_ciphertext | base64(Enc(k, m)) where k is a key derived through key agreement. The key agreement keys depend on the parties encrypting/decrypting the message. |
message_public_key | base64(MEPK) |
message_gdh | base64(DH(MESK,SC/JCPK)) |
curl -X POST -H "Content-Type: application/json" http://127.0.0.1:5000/message --data
{
"message_ciphertext": <message_ciphertext>,
"message_public_key": <message_public_key>,
"message_gdh": <message_gdh>
}
200 OK
{
"status": "OK"
}
Note that `message_id` is not returned upon submission, so that the sending party cannot delete or fetch it unless they maliciously craft the `message_gdh` for themselves, but at that point it would never be delivered to any other party.
`message_public_key` is necessary for completing the key agreement protocol and obtaining the shared symmetric key to decrypt the message. `message_public_key` is ephemeral, unique per message, and has no links to anything else.
curl -X GET http://127.0.0.1:5000/message/<message_id>
200 OK
{
"message": {
"message_ciphertext": <message_ciphertext>,
"message_public_key": <message_public_key>
},
"status": "OK"
}
curl -X DELETE http://127.0.0.1:5000/message/<message_id>
200 OK
{
"status": "OK"
}
Slicing and encrypting is up to the Source client. The server cannot enforce encryption, but it can enforce equal chunk size (TODO: not implemented).
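A minimal sketch of the client-side slicing step follows. It splits a byte string into parts of the configured chunk size and pads the last part so that every part has the same length. The zero-byte padding and the function name are assumptions; the document does not specify the padding scheme, and the per-part encryption that follows is not shown.

```python
CHUNK = 512 * 1024  # mirrors commons.CHUNK

def split_and_pad(data: bytes, chunk_size: int = CHUNK) -> list:
    """Split data into chunk_size parts, padding the last one (zero padding assumed)."""
    parts = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    # Pad the final, possibly shorter, part up to the full chunk size.
    parts[-1] = parts[-1].ljust(chunk_size, b"\0")
    return parts
```

With this chunk size, the 1562624-byte `file1.mkv` from the journalist example above splits into 3 parts and the 93849-byte `file2.zip` into 1, matching the `parts_count` values shown there. Because padding hides the true length, the original size has to travel inside the encrypted message metadata, as the `size=` field in that example suggests.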
Legend:
JSON Name | Value |
---|---|
file_id | Unique, randomly generated per-upload id. Files are sliced, padded and encrypted to a fixed size so that all files look equal and carry no metadata; however, that is up to the uploading client. |
raw_encrypted_file_content | Raw bytes composing the encrypted file object. |
The `file_id` is secret, meaning that any parties with knowledge of it can either download the encrypted chunk or delete it. In production, it could be possible to set `commons.UPLOADS` to a FUSE filesystem without timestamps.
curl -X POST http://127.0.0.1:5000/file -F <path_to_encrypted_chunk>
200 OK
{
"file_id": <file_id>,
"status": "OK"
}
The server will return either the raw encrypted content or a `404` status code.
curl -X GET http://127.0.0.1:5000/file/<file_id>
200 OK
<raw_encrypted_file_content>
A delete request deletes both the entry in the database and the encrypted chunk stored on the server.
curl -X DELETE http://127.0.0.1:5000/file/<file_id>
200 OK
{
"status": "OK"
}
While there are no user accounts, and all messages have the same structure from an HTTP perspective, the server could still detect if it is interacting with a source or a journalist by observing API request patterns. Both source and journalist traffic would go through the Tor network, but they might perform different actions (such as uploading ephemeral keys). A further fingerprinting mechanism could be, for instance, measuring how much time any client takes to fetch messages. Mitigations, such as sending decoy traffic or introducing randomness between requests, must be implemented in the client.
A known problem with this type of protocol is the issue of ephemeral key exhaustion, either by an adversary or due to infrequent journalist activity.
Attempts by a malicious server to reuse ephemeral keys will need to be detected and mitigated. Key expiration is not currently implemented, but ephemeral keys could include a short (30/60 day) expiration date along with their PK signature. Journalists can also routinely query the server for ephemeral keys and heuristically test whether the server is being dishonest. In addition, they can check during decryption whether an already used key has worked: in that case, too, the server is malicious.
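The decryption-time check could be as simple as remembering which ephemeral public keys have already decrypted a message. The class below is a hypothetical sketch of that bookkeeping; its name and interface are not part of the repository, and how the record is persisted on the journalist side is left open.

```python
class EphemeralKeyTracker:
    """Remembers ephemeral public keys that have already decrypted a message."""

    def __init__(self):
        self._used = set()

    def record_successful_decryption(self, ephemeral_public_key: bytes) -> bool:
        """Record a successful decryption.

        Returns True if the key had been used before, which suggests that
        the server handed out the same one-time key more than once.
        """
        reused = ephemeral_public_key in self._used
        self._used.add(ephemeral_public_key)
        return reused
```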
One mitigation for behavioural analysis is the introduction of decoy traffic, which is readily compatible with this protocol. Since all messages and all submissions are structurally indistinguishable from a server perspective, as are all fetching operations, and there is no state or cookies involved between requests, any party on the internet could produce decoy traffic on any instance. Newsrooms, journalists or even FPF could produce all the required traffic just from a single machine.
The computation and bandwidth required for the message-fetching portion of this protocol limit the number of messages that can be stored on the server at once (a current estimate is that more than a few thousand would produce unreasonably slow computation times). Messages either need to be deleted upon receipt or to automatically expire after a reasonable interval. If implementing automatic message expiry, the expiration should have a degree of randomness, in order to avoid leaking metadata that could function as a form of timestamp.
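One way to add that randomness is to draw a per-message jitter on top of a base lifetime. In the sketch below, the base lifetime, the jitter window and the function name are hypothetical values chosen for illustration; the resulting number of seconds could then be used as the expiry of the corresponding Redis entry.

```python
import secrets

BASE_TTL_SECONDS = 30 * 24 * 3600    # hypothetical base lifetime: 30 days
MAX_JITTER_SECONDS = 7 * 24 * 3600   # hypothetical jitter window: up to 7 days

def randomized_ttl(base: int = BASE_TTL_SECONDS, max_jitter: int = MAX_JITTER_SECONDS) -> int:
    """Return a lifetime in seconds: the base plus a uniformly random jitter."""
    return base + secrets.randbelow(max_jitter + 1)
```

The wider the jitter window relative to the base lifetime, the less an observed expiry reveals about when the message was submitted.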
Without traditional accounts, it might be easy to flood the service with unwanted messages or fetch requests that would be heavy on the server CPU. Depending on the individual Newsroom's previous issues and threat model, classic rate-limiting techniques such as proof of work or captchas (even though we truly dislike them) could mitigate the issue.
See https://github.com/freedomofpress/securedrop-protocol/issues/14.
To minimize logging, and mix traffic better, it could be reasonable to make all endpoints the same and POST only and remove all GET parameters. An alternative solution could be to implement the full protocol over WebSockets.
Revocation is a spicy topic. For ephemeral keys, we expect key expiration to be a sufficient measure. For long-term keys, it will be necessary to implement the infrastructure to support journalist de-enrollment and newsroom key rotation. For example, FPF could routinely publish a revocation list and host Newsroom revocation lists as well; however, a key design constraint is to ensure that the entire SecureDrop system can be set up autonomously, and can function even without FPF's direct involvement.
A good existing protocol for serving the revocation would be OCSP stapling served back directly by the SecureDrop server, so that clients (both sources and journalists) do not have to perform external requests. Otherwise we could find a way to (ab)use the current internet revocation infrastructure and build on top of that.
This protocol can be hardened further in specific parts, including: rotating fetching keys regularly on the journalist side; adding a short (e.g., 30 day) expiration to ephemeral keys so that they are guaranteed to rotate even in case of malicious servers; and allowing for "submit-only" sources that do not save the passphrase and are not reachable after first contact. These details are left for internal team evaluation and production implementation constraints.