guardianproject / haven

Haven is for people who need a way to protect their personal spaces and possessions without compromising their own privacy, through an Android app and on-device sensors
https://guardianproject.github.io/haven/
GNU General Public License v3.0

Add "ProofMode" features for signing and notification #22

Open n8fr8 opened 7 years ago

n8fr8 commented 7 years ago

Generate PGP key pair and sign all incident reports; send a hash of data over SMS or other service; blockchain notarization of hash?

petertodd commented 6 years ago

So let's assume our evil maid has full, unconditional, root-level access to the device and all data on it. Secondly, let's assume that once they get physical control of the device, they can achieve that access in about 1 second (quite plausible with a USB exploit!).

Under this threat model, stored PGP keys are useless - the evil maid can simply re-sign the data. "Blockchain notarization" - which I think you really just mean timestamping, e.g. with my OpenTimestamps - is probably not all that useful either, as it's too coarse-grained to even be applicable.

However we can delete things! Deleting data from RAM can be done in milliseconds, even taking scheduler latency into account, and deleting data from FLASH apparently takes in the hundreds of milliseconds even on the worst case MLC flash.

So I suggest that we do authentication via disposable keys. Basically, we sign an authentication log with disposable private keys, that are securely deleted immediately after they're used. Once the keys are securely deleted even with root access the evil maid can't recreate them, preventing the evil maid from hiding their intrusion. The best they can do is delete the log entirely, which still tells the user that they may have suffered an intrusion. Verification would have to be done via a verification app on a second device, communicating with the first via Bluetooth, Tor Onion, QR Codes, etc.

Implementations

User has two phones, monitor and verifier. Assume they can communicate with each other

Single Key Tamper Detection

Here's the simplest possible implementation, one that just detects tampering, without bothering to actually record an evidence chain.

  1. Monitor phone generates a new keypair, and sends pubkey to verifier.
  2. Starts an n-second countdown delay, during which sensors are disabled.
  3. Monitor loop: when event detected, delete secret key. Otherwise, repeat.
  4. Verification: at any time, verifier sends monitor a challenge nonce, and checks the returned signature.
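
The four steps above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: to stay within the standard library, an HMAC over a shared secret stands in for the public-key signature described in the steps (so the verifier receives the secret itself rather than a pubkey), and the class and method names (`Monitor`, `Verifier`, `on_event`, `respond`, `check`) are hypothetical. Python also cannot guarantee that a key is really wiped from RAM, so the zeroing shown here only marks where secure deletion would happen.

```python
import hashlib
import hmac
import secrets


class Monitor:
    """Monitor phone: holds a disposable secret until the first sensor event."""

    def __init__(self):
        # Assumption: a shared HMAC secret replaces the keypair from step 1.
        self._key = bytearray(secrets.token_bytes(32))

    def export_key(self) -> bytes:
        # Stand-in for "send pubkey to verifier" during setup.
        return bytes(self._key)

    def on_event(self) -> None:
        # Step 3: on any sensor event, destroy the secret.
        # Best-effort zeroing only; not a secure erase in Python.
        if self._key is not None:
            for i in range(len(self._key)):
                self._key[i] = 0
            self._key = None

    def respond(self, nonce: bytes):
        # Step 4: answer a challenge, which is only possible while the key exists.
        if self._key is None:
            return None
        return hmac.new(bytes(self._key), nonce, hashlib.sha256).digest()


class Verifier:
    """Verifier phone: challenges the monitor with a fresh nonce."""

    def __init__(self, key: bytes):
        self._key = key

    def check(self, monitor: Monitor) -> bool:
        nonce = secrets.token_bytes(16)
        response = monitor.respond(nonce)
        expected = hmac.new(self._key, nonce, hashlib.sha256).digest()
        return response is not None and hmac.compare_digest(response, expected)
```

A failed `check` means either that an event fired or that the monitor could not answer at all; both cases should be treated as possible tampering.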

Linear Evidence Chain

  1. Monitor generates initial keypair (p_0, q_0), sending initial pubkey p_0 to verifier.
  2. For each event i added to the log:
     a. Generate keypair (p_{i+1}, q_{i+1}).
     b. Sign log entry i, containing the entry itself and pubkey p_{i+1}, using private key q_i.
     c. Delete private key q_i.
     d. Repeat!

To verify the log the verification phone:

  1. Challenges monitor to sign a nonce using the most recent pubkey p_n.
  2. Verifies that the log is an unbroken chain of signatures.
  3. Displays evidence data to user.

Note how verification should only happen via the verification phone, not the monitor phone, to ensure that only verified evidence is shown to the user; in the event that the chain is broken, e.g. due to missing evidence, this should be made clear to the user as a potential tampering of the monitor.

Also note that the beginning of each actual sensor event should be recorded as well as the end, if any sensor event takes longer than our assumed minimum evil maid interception time.
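
As a rough sketch of the linear evidence chain, the Python below uses Lamport one-time signatures built from SHA-256. That choice is an assumption made so the example runs on the standard library alone; it is not something the proposal specifies, and any signature scheme would do. One-time keys happen to match the "sign once, then delete" pattern. All names (`Monitor`, `append`, `verify_chain`, `entry_message`) are hypothetical. Because a Lamport key must sign only one message, this sketch treats the verifier's nonce challenge as one more log entry, which is also an assumption.

```python
import hashlib
import secrets


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keygen():
    """Lamport one-time keypair: 256 pairs of secrets (q) and their hashes (p)."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    pk = b"".join(H(a) + H(b) for a, b in sk)
    return sk, pk


def digest_bits(msg: bytes):
    d = H(msg)
    return [(d[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(sk, msg: bytes) -> bytes:
    # Reveal one secret per digest bit; the key must never sign again.
    return b"".join(sk[i][bit] for i, bit in enumerate(digest_bits(msg)))


def verify(pk: bytes, msg: bytes, sig: bytes) -> bool:
    if len(pk) != 256 * 64 or len(sig) != 256 * 32:
        return False
    for i, bit in enumerate(digest_bits(msg)):
        start = i * 64 + bit * 32
        if H(sig[i * 32:(i + 1) * 32]) != pk[start:start + 32]:
            return False
    return True


def entry_message(index: int, data: bytes, next_pk: bytes) -> bytes:
    # What q_i signs: the entry's position, the entry itself, and p_{i+1}.
    return index.to_bytes(8, "big") + len(data).to_bytes(8, "big") + data + next_pk


class Monitor:
    def __init__(self):
        self.sk, self.initial_pk = keygen()  # q_0 stays in RAM, p_0 goes to the verifier
        self.log = []

    def append(self, data: bytes) -> None:
        next_sk, next_pk = keygen()
        sig = sign(self.sk, entry_message(len(self.log), data, next_pk))
        self.log.append((data, next_pk, sig))
        # Rebinding drops q_i, but Python does not securely erase the old bytes;
        # a real implementation would need explicit zeroing in native code.
        self.sk = next_sk


def verify_chain(initial_pk: bytes, log) -> bool:
    """Run on the verifier: check every entry against the previous pubkey."""
    pk = initial_pk
    for i, (data, next_pk, sig) in enumerate(log):
        if not verify(pk, entry_message(i, data, next_pk), sig):
            return False
        pk = next_pk
    return True
```

To check liveness, the verifier would send a fresh nonce, have the monitor append it as an entry, and then confirm both that the chain verifies and that the last entry contains that nonce. An attacker who truncates the log no longer holds the private key matching the new tip, so the appended challenge entry fails verification.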

Data Deletion

Fortunately for us, a tamper log is only valuable if it's complete, which means we don't need any persistent storage of private keys - if the monitor phone is off, our evil maid can undetectably modify it. So the private key can be simply stored in RAM, for which deletion is extremely fast. Standard memory retention concerns do apply though - I don't believe that Android phones ever have swap, but I'm no expert.

ragnar48 commented 6 years ago

New to this but used to life. What can I expect from the app on an Android phone, in layman's terms? Cheers.

ragnar48 commented 6 years ago

Reword... What can I expect from the app haven

fuzzyTew commented 6 years ago

This situation is really ameliorated by having some other device on the network that the log is sent to. An append-only protocol such as provided by the dat project or secure scuttlebutt or a low-latency blockchain can be used to ensure the stream is not lost. Once the intruder has the device, the best they can do is change history on the device. They cannot change what has already been broadcast elsewhere (to all other devices in the network, perhaps?).

With recent content of the stream stored elsewhere, regular timestamping is much more useful. Additionally, note that the cryptocoin networks hold transactions within them after broadcast -- even though it may take 10 minutes for a timestamp to enter the blockchain, it could be broadcast immediately, and all the other peers will try to hold it.

A signed chain of "canary" messages can help as well. If the heartbeat broadcast stops, you know the device was changed or disabled.
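
The "heartbeat stops" check can be illustrated with a small gap detector. This is a minimal sketch, assuming heartbeats arrive as plain numeric timestamps in seconds; the function name `missed_heartbeats` and its parameters are hypothetical, and signature checking of each canary message is left out entirely.

```python
def missed_heartbeats(received, interval, tolerance, now):
    """Return (last_seen, next_seen) pairs where the gap exceeded interval + tolerance.

    received  -- arrival times of heartbeats, in seconds
    interval  -- expected seconds between heartbeats
    tolerance -- extra slack allowed for ordinary network jitter
    now       -- current time, used to detect a heartbeat that is overdue right now
    """
    limit = interval + tolerance
    times = sorted(received)
    gaps = []
    for prev, cur in zip(times, times[1:]):
        if cur - prev > limit:
            gaps.append((prev, cur))
    if times and now - times[-1] > limit:
        gaps.append((times[-1], now))
    return gaps
```

The `tolerance` value is where the trade-off sits: too small and ordinary network dropouts raise alarms, too large and an intruder gets a longer unobserved window.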

petertodd commented 6 years ago

@fuzzyTew Remember that a lot of adversaries who might want to physically tamper with a device will also have the ability to deny you network access while they do it; cell phone and wifi jamming isn't very difficult, particularly if you're willing to break the law (just blast RF noise with a cheap jammer for a few minutes).

Re: "holding transactions", any mechanism that uses on-chain transactions directly is quite undesirable from the point of view of cost/scalability. Many scalable schemes that don't - such as my OpenTimestamps - provide no guarantees until an actual blockchain transaction is confirmed (this may change in the future w/ the addition of a secondary trusted timestamping scheme).

Incidentally, what exact attacks do you see timestamping preventing in this use-case?

Re: append-only protocols like secure scuttlebutt, note that they themselves have scalability/cost problems; relying on disposable keys corresponds very nicely to our actual usecase, while also having no scalability issues. Essentially, disposable keys is an append-only log protocol, without the expensive requirement for global consensus.

I do agree that we want any external log-upload scheme to be append-only if possible, and equally, I think an event upload option is valuable (it may be your only chance of getting solid evidence that a break-in actually happened, which may be very valuable if you, say, go to the press with your story!). I just don't think blockchainish protocols are the most effective way to accomplish that.

xloem commented 6 years ago

Your proposal is an append-only log such as provided by scuttlebutt or dat over the network, where scalability shouldn't be a concern because of the small number of devices involved, but it doesn't solve the broadcasting and synchronizing problems they have already worked on.

I glossed over how much of my concerns were actually addressed by your proposal, and I'm sorry for that. I do think peer-to-peer network approaches would be helpful, in case both devices are compromised.

Timestamping is important in case the device containing the log is compromised before you present the log to somebody who can help you. (or before you even view it yourself). Are you https://opentimestamps.org/ ? I think timestamps are awesome for precisely this use case, where a powerless individual could be harmed by a powerful entity and have evidence of it destroyed. The blockchain can store things in a way that is hard to destroy.

EDIT: sorry, I am fuzzyTew .

petertodd commented 6 years ago

@xloem Yes, OpenTimestamps is my project.

Blockchains don't "store" things, they commit to things. In the case of OpenTimestamps, the specific commitment simply proves that some data existed prior to some point in time. As an example in the case of a PGP signature, that can be helpful if you want to verify a signature from a revoked key as the timestamp can prove that the signature was made prior to when the key was compromised, if you know when the key was compromised.
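
The difference between storing and committing can be shown with a plain hash commitment. This is only an illustration using SHA-256 from the Python standard library; it is not the OpenTimestamps proof format, and the helper names `commit` and `matches` are hypothetical.

```python
import hashlib
import hmac


def commit(data: bytes) -> str:
    """Return a fixed-size commitment to the data; the data itself is not kept."""
    return hashlib.sha256(data).hexdigest()


def matches(data: bytes, commitment: str) -> bool:
    """Anyone who still holds the data can check it against the commitment."""
    return hmac.compare_digest(commit(data), commitment)
```

The commitment is 64 hex characters regardless of input size, so it cannot be used to recover a lost log; it can only confirm that a log someone still holds is the one that was committed to.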

The problem with the Haven use-case is you don't know exactly when the compromise of the Haven data may have happened - when the evil maid entered your room - other than from the evidence itself. But if the attack was successful, the attacker may have simply timestamped a faked log, making the whole thing mostly useless.

I would put timestamps - including my own OpenTimestamps - pretty low on the importance list of things that Haven should work on. It may even be downright counter-productive by giving people a false sense of security.

fuzzyTew commented 6 years ago

Not sure what you mean with regard to not 'storing' things in the blockchain, which is done frequently ... but I don't think the differentiation has any implications here.

If data is timestamped, an adversary has a small window of time to alter it, between when it is recorded and when the timestamp is broadcast. And if no timestamp is broadcast that raises a flag. If data is not timestamped, the adversary has all of the rest of time to alter it, as many times as they can. The difference there is very significant.

But for this to be reasonable, first Haven would need to send heartbeats, write to an append-only log ... I agree that it wouldn't be as helpful for the product right now, where there's no way to demonstrate that it was offline at a certain time or that one event happened prior to another.

This could be fudged by having a Signal client on a different device write to a dat hypercore stream and regularly timestamp the tip of the stream. You'd want to change Haven to send out heartbeats, which would be a pretty simple change.

fuzzyTew commented 6 years ago

Meaningful to note that Signal is roughly an append-only log with disposable keys (forward-secret ratchet protocol). It doesn't exactly make it easy to access these things though, to verify anything later, but it's all open source.

petertodd commented 6 years ago

@fuzzyTew Can you name an example of something that's stored on a blockchain? A blockchain is just a data structure, one that does not imply anything about the data actually being kept.

In any case, even if you have second-level resolution timestamps - and we don't - they're still not good enough:

  1. We have to be able to tolerate network dropouts. Consumer wifi and cellular networks just aren't perfectly reliable, and being down for a few dozen seconds (or minutes) isn't very suspicious - happens all the time.

  2. When the network connection is down, no timestamps can be created. They'll instead be delayed until the network is back up, but those timestamps will prove only that the data was created prior to some point after the network connection was restored.

  3. A sophisticated attacker - the type I proposed above - can instead hijack the device to timestamp a fake log, discarding the actual record of their intrusion. The timestamps will be perfectly valid, but all they've proved is that the fake log was created in the past. That's just not very useful.

Timestamps do have some secondary value with regard to presenting evidence to others, but they're not useful for Haven's primary purpose: detecting an intrusion. Thus, implementing them should be a low priority.

xloem commented 6 years ago

@petertodd data stored on blockchains: eternity wall, apertus, sia, filecoin, datacoin but this isn't really relevant to this topic

What do you mean by "second-level resolution timestamps"?

Responses to your points:

  1. A. Infrequent 1/week timestamps are still quite helpful; these would not be subject to network dropout.

     B. Frequent timestamps mean you could learn immediately when the network is lost at your haven space, and _go there_ and _fix the issue_ or _meet the intruder_.

     C. Frequent timestamps are still helpful for those times when the network _does not_ drop out. Not every intruder will jam your LTE, GSM, and wifi. They may only care after they discover the device.

  2. It's still great to create timestamps that prove that data was created prior to the network coming back up. These timestamps prove that the data was not modified after this point, and the great delay in them shows precisely the window during which they may have been modified.

  3. The timestamp describes precisely when this hijack can have occurred. If there is any other reason to suspect anything, it gives a great deal of information for auditing.

We designed Haven for investigative journalists, human rights defenders, and people at risk of forced disappearance to create a new kind of herd immunity.

Forced disappearance is not caused by a single intruder when you are not present. It is caused when you are present, by abducting you, and after you are gone, by erasing records of you. Timestamps could incredibly increase the safety of people at great risk by creating immutable proofs: they cannot be changed and it is obvious if the data is lost, that it used to be there.

petertodd commented 6 years ago

@xloem In half your examples, data isn't being stored on the blockchain; in the other half, what you're actually doing is publishing... But yeah, not relevant to this discussion.

By "second-level resolution timestamps", I mean timestamps where the uncertainty over what the timestamp actually proves is within a few seconds (basically the accuracy of the timestamp). By comparison, Bitcoin miners routinely create blocks where the time field is off by multiple minutes, even an hour or two; if you are assuming miners may be adversarial it's quite possible that the time field may be off by multiple hours, maybe even a whole day: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2016-September/013120.html

This is why the #1 item in the "known issues" of my OpenTimestamps client software is to figure out how to more accurately describe to users the inaccuracy of their timestamp proof: https://github.com/opentimestamps/opentimestamps-client/tree/dcc45495b682c522170e8c2148b4759632e9d7fa#known-issues

  1. A. There's no reason to have "1/week" timestamps - with OpenTimestamps you can make timestamp proofs for free as often as you want; OpenTimestamps scales. Each timestamp proof is roughly 1KB or so.

     B. Frequent timestamps are not what allows you to learn that; monitoring is what allows you to learn that (i.e. Haven's "send a warning message over Signal" functionality). Please don't use the term "timestamp" to refer to this.

  2. "It's still great to create timestamps that prove that data was created prior to the network coming back up." <- Timestamps do not prove that! They prove that the data was created prior to some point after the network came up. Please be careful in your terminology here.

  3. Again, the timestamp doesn't do anything by itself; the evidence collected by the sensors is what matters here. A timestamp only proves that evidence was created prior to some point in time. Please don't overstate what timestamps are doing here.

"Forced disappearance is not caused by a single intruder when you are not present. It is caused when you are present, by abducting you, and after you are gone, by erasing records of you. Timestamps could incredibly increase the safety of people at great risk by creating immutable proofs: they cannot be changed and it is obvious if the data is lost, that it used to be there."

I think you've confused timestamps for remote logging services; please do not misuse this terminology for that.

Sorry if this stuff comes across as pedantic, but I spend an enormous amount of time professionally in clearing up these confusions over what timestamping (and other "blockchain" crypto) actually accomplishes, both in relation to my OpenTimestamps project, and other consulting I do. The last thing we need in this industry is more "blockchain woo" that overstates what these technologies actually accomplish, and I'd hate for Haven to contribute to this problem.

indiglofae commented 6 years ago

All great ideas. I really like the timestamp idea. If you didn't check in, it would be cool if the app would call you at a preset time. It would also be nice if someone could program voice recognition or word recognition in for a duress word. Then everything recorded and sent to the cloud would have changing encryption, so if someone tried to delete it, the recording would be buried deeper into the cloud by not allowing it to be deleted. We use Tor so that no one knows our location, but we need a network where other users can anonymously contact authorities, in the event that one of ours has a head injury and can't remember that they have Haven, to back them up.

xloem commented 6 years ago

@petertodd Just because something is doing one thing, does not invalidate that it is doing another thing. Blockchain storage systems such as datacoin and eternity wall both store and publish data, using only a blockchain.

  1. A. By 1/week I mean the resolution of the timestamp. It is helpful to have timestamps that are only accurate down to the week if that's all the resolution that's available. There are of course blockchains with block times much more frequent than bitcoin (ethereum is usually accurate to 20 seconds!).

     B. Good point; it is specifically heartbeats, not timestamps, that would show you immediately that the internet is lost. But again this is a set of many things here where timestamps are an important component: if published to e.g. bitcoin, they will show you this information even if your personal connectivity to your device is severed. And of course they will still show you this information later, after the fact, in a provable way.

     C. Timestamps are still very important for those cases when the internet is not lost.

  2. Timestamps prove that data was created prior to the timestamp being published and notarized. Obviously the network connection being restored is only one of multiple things that can cause there to be a delay, but we were discussing the threat of the network being taken down for the intrusion. Timestamps prove that data was created prior to the network coming back, if the device is still functioning properly, to an accuracy determined by heartbeat frequency and timestamp resolution.

  3. No, I'm not discussing the sensor recordings that you describe the hijacker as faking here. I'm describing that the hijacker needs to be much more sophisticated, and under greater pressure, to create a fake log with a device sending timestamps. The next timestamp will record the time window during which the log must have been faked; it cannot be faked afterwards. If the device sends heartbeats and the attacker needs to not be found, they need to fake the log within the time that a single missing heartbeat will be acted upon; and if they let that heartbeat be missed, the time they were working will be specifically marked as a time there were network connectivity problems. The timestamp log is the sensor evidence here.

I am not confusing timestamps for remote logging services. I am expressing that timestamps are a crucial component of logging if it is to be resilient against erasure or change, which should be obvious!

What reason is there to not use timestamps in a monitoring system?

xloem commented 6 years ago

Just to follow up, I had another discussion regarding timestamping with peter at https://github.com/opentimestamps/javascript-opentimestamps/issues/22 , and I think a lot of the disagreement has come from my use of the word "timestamp" at all, which does not include content integrity (I felt it did, apparently wrongly).

"Blockchain notarization" here is clearly more than timestamping, in that obviously it would be signed, use a hash strong against collisions, and not be further hashed on untrusted servers. Peter makes the smart recommendation to be sure to consult a qualified cryptographer.

petertodd commented 6 years ago

A timestamp does protect the integrity of content, but only in a very specific way: a timestamp proves the content existed in the past.

Meanwhile "blockchain notarization" simply isn't a well-defined term; it's "blockchain" marketing, not a technical term. Similarly, saying something is "signed" is by itself pretty much a meaningless statement: who is signing that thing?

To have useful discussions about this stuff we have to start by precisely describing exactly what bad things we are trying to prevent an attacker from being able to do.

petertodd commented 6 years ago

Anyway, I'm glad to help!