The app should instead be generating a key pair locally. It should then create a signature over its own public key using the corresponding private key, and then communicate only the public key and the signature (to prove ownership of the corresponding private key). The private key itself should never leave the device. See, for example, the way that SubjectPublicKeyInfo is signed in an X.509 certificate, or the way that a SeedReport is signed in the decentralised contact-tracing CX protocol (https://github.com/ipxe/cx/releases/download/v0.10.1/cx.pdf section 5.3).
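For illustration only, here is a minimal sketch of that flow on Android using standard JCA calls (the final upload step is hypothetical, not an existing API in this app):

```kotlin
import java.security.KeyPairGenerator
import java.security.Signature

// Sketch only: generate an EC P-256 key pair on the device and sign the
// encoded public key with the corresponding private key. The private key
// never leaves the device; only publicKeyBytes and proofOfPossession would
// be sent to the registration service.
fun main() {
    val keyPair = KeyPairGenerator.getInstance("EC").apply { initialize(256) }.generateKeyPair()

    // PublicKey.getEncoded() returns the X.509 SubjectPublicKeyInfo encoding
    val publicKeyBytes = keyPair.public.encoded

    val proofOfPossession = Signature.getInstance("SHA256withECDSA").run {
        initSign(keyPair.private)
        update(publicKeyBytes)
        sign()
    }

    println("public key: ${publicKeyBytes.size} bytes, signature: ${proofOfPossession.size} bytes")
}
```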
This matches the design here - https://www.ncsc.gov.uk/files/NHS-app-security-paper%20V0.1.pdf
"Device contacts the registration service with the activation code, which returns the InstallationID and symmetric key."
The symmetric key is called the secret key in the code, so please review the PDF to understand how it's used and why it is server-generated.
Reading through what documentation we have, and through the application's codebase itself, it seems that this "Secret Key" is the key used to create the HMAC signature sent as part of the Bluetooth transmissions, not a private key within a public/private keypair. This is most likely the "Symmetric Key" detailed in https://www.ncsc.gov.uk/files/NHS-app-security-paper%20V0.1.pdf The following are the handful of uses I could find for the key - do let me know if there are any I've missed:
https://github.com/nhsx/COVID-19-app-Android-BETA/blob/43a167f8dba422fd9001b64f9c4fd82275abb1c8/app/src/main/java/uk/nhs/nhsx/sonar/android/app/crypto/BluetoothIdSigner.kt#L11-L24
https://github.com/nhsx/COVID-19-app-Android-BETA/blob/43a167f8dba422fd9001b64f9c4fd82275abb1c8/app/src/main/java/uk/nhs/nhsx/sonar/android/app/referencecode/ReferenceCodeApi.kt#L15-L31
https://github.com/nhsx/COVID-19-app-Android-BETA/blob/43a167f8dba422fd9001b64f9c4fd82275abb1c8/app/src/main/java/uk/nhs/nhsx/sonar/android/app/diagnose/review/CoLocationApi.kt#L16-L33
An HMAC key needs to be known by both parties in order to verify messages using it - and I assume they decided assigning a valid one for each user would be less risky than having a user provide their own initially. There may be other issues here - but the transfer of the private part of a keypair without any protection doesn't seem to be one of them.
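For context, the HMAC usage boils down to something like this generic sketch (not the app's exact payload layout or signature truncation):

```kotlin
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Generic sketch: compute an HMAC-SHA256 tag over a broadcast payload using the
// server-issued "Secret Key". Both the device and the server can compute this,
// which is what makes the key symmetric.
fun hmacSign(secretKey: ByteArray, payload: ByteArray): ByteArray =
    Mac.getInstance("HmacSHA256").run {
        init(SecretKeySpec(secretKey, "HmacSHA256"))
        doFinal(payload)
    }
```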
Reading through what documentation we have, and through the application's codebase itself, it seems that this "Secret Key" is the key used to create the HMAC signature sent as part of the Bluetooth transmissions, not a private key within a public/private keypair. This is most likely the "Symmetric Key" detailed in https://www.ncsc.gov.uk/files/NHS-app-security-paper%20V0.1.pdf
You're right, but this makes the issue worse than initially reported. There seems to be nothing within the protocol that allows for verification that the Bluetooth message was originally generated by the device itself rather than by something with access to the central server's database and hence access to the "Symmetric Key".
The message content proves ownership of the key pair Priv_Device,t and Pub_Device,t used to generate the message, but the only guarantee that this key pair was actually generated by the device corresponding to InstallationID comes from the HMAC signature by the "Symmetric Key". Since the central server knows the value of "Symmetric Key", it can produce validly signed messages for a forged key pair Priv_Device,forged and Pub_Device,forged.
The ability for the central authority to forge validly signed Bluetooth messages remains. You have shown that fixing this issue will require a wire-level change to the Bluetooth message format, rather than requiring only a change to the registration HTTP API.
This assumes that there are no constraints on the ephemeral key pairs chosen by the device. The code that generates them in EphemeralKeyProvider accepts no outside parameters, so it seems safe to assume that it is generating completely arbitrary keys.
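To make the forgery concrete: anyone who holds the symmetric key can compute a valid tag over a payload containing an arbitrary, forged public key. A sketch, assuming the payload is just some byte string (the real wire format differs):

```kotlin
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Illustrative only: an attacker embeds a forged public key in a payload and tags
// it with the victim's symmetric key obtained from the compromised database. The
// resulting broadcast is indistinguishable from a genuine one.
fun forgeBroadcast(stolenSymmetricKey: ByteArray, forgedPublicKey: ByteArray): Pair<ByteArray, ByteArray> {
    val payload = forgedPublicKey // stand-in for the real payload structure
    val tag = Mac.getInstance("HmacSHA256").run {
        init(SecretKeySpec(stolenSymmetricKey, "HmacSHA256"))
        doFinal(payload)
    }
    return payload to tag
}
```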
The messages are authenticated for the benefit of the central server. What would be the use of that server faking its own data?
The threat model is not the central server faking data for its own consumption. The threat model is a malicious attacker (e.g. a criminal or state actor) who gains read access to the central server's database.
As a relatively benign example: suppose that I compromise the central server and thereby gain access to your Installation ID and Symmetric Key. I can then impersonate you. I can wander around transmitting Bluetooth messages that will be recorded by other app users. The central server will treat every single one of my forged messages as legitimately having been sent by you. There is no way for it to know otherwise.
As a slightly less benign example: suppose that I am a state actor wishing to forge an evidence trail for a suspect I wish to frame for a crime. I access the database and then do exactly the same thing: I transmit Bluetooth messages in the location of my choosing. I later use the digitally signed messages recorded by other app users as evidence of the suspect's location at trial.
A general principle is that a compromise of an authentication token database should not allow for impersonation using the compromised tokens. This is why, for example, password databases store hashed passwords rather than plaintext passwords. Knowledge of the hashed password allows you to verify the password, but does not (directly) help you to discover the plaintext password itself.
It's the same principle at work here. The central server requires only the knowledge of how to verify the received messages. In the current design (which chooses a shared symmetric key), it also has the knowledge of how to sign messages. This is the design flaw that leads to the attack.
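The asymmetry is easy to see in code: with a public-key scheme the server stores only verification material, so compromising its database yields nothing that can create new valid messages (a generic sketch, not a proposal for the exact wire format):

```kotlin
import java.security.PublicKey
import java.security.Signature

// Sketch: with a signature scheme the server stores only the device's public key.
// It can check authenticity, but it cannot produce new valid signatures; that is
// exactly the property a shared HMAC key cannot provide.
fun serverVerifies(devicePublicKey: PublicKey, payload: ByteArray, signature: ByteArray): Boolean =
    Signature.getInstance("SHA256withECDSA").run {
        initVerify(devicePublicKey)
        update(payload)
        verify(signature)
    }
```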
You're right that this makes it unsuitable as a forensic audit log.
A tool such as this, if it ever gets sufficiently wide adoption, will be treated as though the data captured is trustworthy, and will inevitably see its data (ab)used in scenarios other than the original use case.
There are two options to solve this issue:
Option 1 is an irritating fix to have to make but is entirely possible, assuming that the protocol was designed in a way that allows for protocol versioning. (If it doesn't, then that's an even bigger problem.)
Option 2 is never going to happen.
I have to agree with mcb30 here: this creates an unacceptable long-term risk to the public, and it is entirely and easily avoidable.
Using a public/private key pair to transmit a symmetric key is fine in the sense that the transmission of the symmetric key is secure; what matters, though, is how long the symmetric key is allowed to be used. If the permitted session length is very short, then this limits the damage that could be done by the key being discovered. That being said, from reading the documentation, the symmetric key in question is intended to be used to authenticate the device.
Perhaps another approach could be that, during app registration, the device generates a random series of characters (i.e. a strong password) and submits it to the server in plaintext, which stores a hash of it. When the device then sends future submissions to the server, it includes its plaintext password along with them. The server then generates a hash of the received plaintext and compares it with the hash it has stored. This process could be described as the device performing a silent login. Since (unlike a cipher) a hash cannot be reversed back to its plaintext, knowledge of the hash could not be used to manufacture or impersonate a user submission in the way described in the threat model stated in the earlier post.
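A rough sketch of that "silent login" idea, exactly as described (in practice you would want at least a salted, slow hash, and TLS protecting the plaintext in transit):

```kotlin
import java.security.MessageDigest
import java.security.SecureRandom

// Device side: generate a strong random secret once, at registration time.
fun generateDeviceSecret(): ByteArray =
    ByteArray(32).also { SecureRandom().nextBytes(it) }

// Server side: store only a hash of the secret at registration...
fun hashForStorage(secret: ByteArray): ByteArray =
    MessageDigest.getInstance("SHA-256").digest(secret)

// ...and on each later submission, hash the presented secret and compare.
// Knowledge of the stored hash alone does not allow an attacker to forge submissions.
fun verifySubmission(storedHash: ByteArray, presentedSecret: ByteArray): Boolean =
    MessageDigest.isEqual(storedHash, hashForStorage(presentedSecret))
```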
@mcb30 it's explicitly described in the documentation as a symmetric key. Someone would need to read only particular lines of code out of context to believe the server doesn't know it. In other protocols, a shared HMAC key is used to claim plausible deniability.
It's also not as simple a change as it sounds. The server still needs to authenticate messages, so a signature will add 512 bits to the 1024-bit payload. To prevent an attacker reusing a signature, BV would need to be re-signed more frequently than every 24 hours, which consumes battery. The new public key must be protected to prevent tracking, but must also be revealed to dispute the forged signatures. And this still doesn't cover what happens when the app is uninstalled, or the more likely attack of rebroadcasting messages in real time.
The risk to the public from abusing data should be tackled by legislation, as demanded by JCHR https://committees.parliament.uk/committee/93/human-rights-joint-committee/news/146351/report-on-the-contact-tracing-app-published/
Shouldn't the device generate the private/public key locally and issue the public key back to the server on registration? I'm not sure I follow why the server must have access to that private key.
We can't even confirm whether the key being supplied by the server is cryptographically secure, let alone physically secure.
@Xenoamor it's not a keypair but a shared secret, effectively being used by the server to send itself messages via the Bluetooth broadcasts received on people's phones. As a result, the server is trusting devices, not the other way around.
The threat model is not the central server faking data for its own consumption. The threat model is a malicious attacker (e.g. a criminal or state actor) who gains read access to the central server's database.
As a relatively benign example: suppose that I compromise the central server and thereby gain access to your Installation ID and Symmetric Key. I can then impersonate you. I can wander around transmitting Bluetooth messages that will be recorded by other app users. The central server will treat every single one of my forged messages as legitimately having been sent by you. There is no way for it to know otherwise.
I am going to raise this point at https://hackerone.com/nhscovid19app. Would you mind contributing there too? I ask because, although I understood your point generally, to be honest you illustrated the technical details better than I can.
@mcb30 it's explicitly described in the documentation as a symmetric key. Someone would need to read only particular lines of code out of context to believe the server doesn't know it. In other protocols, a shared HMAC key is used to claim plausible deniability.
@marksteward Then let's see an official government statement acknowledging in writing that the protocol is explicitly designed to allow message forgery (which is precisely the same thing as "plausible deniability"), and that any recorded messages have zero evidentiary value and are inadmissible in court.
It's also not as simple a change as it sounds. The server still needs to authenticate messages, so a signature will add 512 bits to the 1024-bit payload. To prevent an attacker reusing a signature, BV would need to be re-signed more frequently than every 24 hours, which consumes battery. The new public key must be protected to prevent tracking, but must also be revealed to dispute the forged signatures.
Then choose a signature scheme that doesn't require the ability to verify the message signature before it becomes necessary to do so!
CX solves the equivalent problem by having the transmitting device retain its secret until an explicit disclosure event takes place (e.g. an infected person choosing to disclose the RNG seed from which their messages were generated), by deriving the initial secret in a way that allows for proof of ownership, and by ceasing transmission from that secret as soon as disclosure takes place. Read sections 4 and 5 of the specification document for the implementation details.
There is simply no need for the server to be able to verify messages prior to the disclosure event. You don't need to (and should not) choose a signature scheme that requires this unnecessary ability.
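Very roughly, the retain-until-disclosure pattern looks like the following sketch; this illustrates the idea only, not the actual CX derivation, which is defined in sections 4 and 5 of the specification:

```kotlin
import java.security.SecureRandom
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Illustration only: the device keeps `seed` private and broadcasts identifiers
// derived from it. Nothing transmitted lets the server (or anyone else) sign on
// the device's behalf. Only when the user chooses to report is the seed disclosed,
// at which point the device stops using it and generates a fresh one.
class SeedBroadcaster {
    private var seed: ByteArray = newSeed()

    private fun newSeed(): ByteArray = ByteArray(32).also { SecureRandom().nextBytes(it) }

    fun identifierFor(counter: Long): ByteArray =
        Mac.getInstance("HmacSHA256").run {
            init(SecretKeySpec(seed, "HmacSHA256"))
            doFinal(counter.toString().toByteArray())
        }

    fun disclose(): ByteArray {
        val disclosed = seed
        seed = newSeed() // the old seed is exhausted once anyone else knows it
        return disclosed
    }
}
```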
The risk to the public from abusing data should be tackled by legislation, as demanded by JCHR https://committees.parliament.uk/committee/93/human-rights-joint-committee/news/146351/report-on-the-contact-tracing-app-published/
I don't disagree that legislative protection is important, but that doesn't remove the need for technical protection. We have the Computer Misuse Act that makes unauthorised access a criminal offence and therefore gives us legislative protection, but we still use passwords.
I am going to raise this point at https://hackerone.com/nhscovid19app. Would you mind contributing there too? I ask because, although I understood your point generally, to be honest you illustrated the technical details better than I can.
@adarrel753 Sure; just CC me on whatever you submit. You can find my e-mail address in many git commit logs :slightly_smiling_face:
Nice one. Here you go: https://hackerone.com/reports/870145
Thank you. I get an "Oops! You can't access this report because it isn't public yet" error. Could you add me to the authorised user list (or whatever action is required by HackerOne)? Thanks.
Please keep this thread updated. You can't read the other site without logging in.
Haven't heard anything yet to suggest that NHSX is planning to fix this security vulnerability.
Sounds like most of it is being canned in favour of moving to the Google/Apple system; that's probably why.
I've not heard news of that. Was this in the roadmap leak from yesterday?
It's been reported, e.g. today: https://www.theguardian.com/world/2020/may/14/nhs-coronavirus-advisory-board-split-over-ditching-government-app
Lots of silence around this fairly fundamental design flaw, especially considering the timescales in question...
I don't think it's a big problem, most database systems I have worked with have several layers of security in place anyway, so that one part can be taken down temporarily for maintenance without jeopardising the whole system.
By that argument, why not store everything in plaintext?
That would be a much bigger problem. Lol
I don't think it's a big problem, most database systems I have worked with have several layers of security in place anyway, so that one part can be taken down temporarily for maintenance without jeopardising the whole system.
This is the only layer of security in terms of asserting authenticity of the messages, and it's broken by design.
I just don't see how this security flaw represents a serious risk to its end-users.
I already described the threat in my comment above: https://github.com/nhsx/COVID-19-app-Android-BETA/issues/14#issuecomment-626193757
Lots of silence around this fairly fundamental design flaw, especially considering the timescales in question...
It's not a flaw, @mcb30 is conflating the design of a decentralised protocol (like CX) with a centralised one. In particular:
There is simply no need for the server to be able to verify messages prior to the disclosure event. You don't need to (and should not) choose a signature scheme that requires this unnecessary ability.
In a centralised model, proximity events are uploaded on disclosure. If the server can't verify these on its own, it must reach out to each contact, stalling the notification cascade that's a central goal of this design.
As an aside, the CX protocol allows anyone (including authorities) to forge proximity messages after disclosure. It could fix this by including a signature in every broadcast, but I suspect it's also by design.
@marksteward I think you've misunderstood what "disclosure" means. It's not the broadcast of the Bluetooth packet: it's the decision to disclose some information that was previously kept secret (the Seed Value, in the case of CX). Section 5 of the CX spec (https://github.com/ipxe/cx/releases/download/v0.10.1/cx.pdf) clearly states:
The act of sending a Seed Report is a deliberate disclosure of a set of Preseed Values and Seed Values. Once the Seed Report has been sent, the included Preseed Values (and the derived Seed Values) are no longer private to the Advertiser. An Advertiser MUST NOT transmit any Contact Identifiers generated by a Generator instance once the Seed Report corresponding to that Generator’s Seed Value has been sent. An Advertiser SHOULD immediately construct a new Seed Value as described in section 4 and continue by transmitting Contact Identifiers generated from the new Seed Value.
Once this disclosure has taken place (i.e. once anyone other than the device owner knows the Seed Value) then the Seed Value is exhausted: any further transmissions derived from that Seed Value will be ignored by all devices.
I like the idea of the seed value; it would mean you're basically unknown to the decentralised system until you reveal the value. Doesn't the current system issue you a value every day or something, so you're identifiable in other people's data before reporting (to the decentralised server)?
I don't see why this can't be used with a decentralised system. The user still has to self-report symptoms/test results, and the seed can be shared at that point. Seems near-identical to how it currently works.
For those that haven't already seen it, there's an NCSC blog post about the security questions raised.
https://www.ncsc.gov.uk/blog-post/nhs-covid-19-app-security-two-weeks-on
@edent Thanks for the link. Good to see that this problem is at least recognised in the report at https://github.com/vteague/contactTracing/blob/master/blog/2020-05-19UKContactTracing.md#InstallationID-HMAC
The report then says that "NCSC have confirmed that there were existing plans to refine the registration process". Do you have any information about the new registration process that will be used?
The general logic that MACs are plausibly deniable is sound, but I still don't think the server needs to know the signing key.
Unless I've misunderstood: the purpose of signing the Bluetooth transmissions is to prevent replay attacks. The harms of a replay attack seem to fall mostly on the user: they may falsely believe they have been in proximity to someone with COVID-19. Hence, only the user needs to know the HMAC signing key. The server can send them any proximity events that match their ID, along with the associated MACs; the user's app then checks the MACs and ignores them if they don't match their MAC signing key. If the app generates a fresh MAC signing key every 24 hours and keeps 2 weeks of these keys, then it provides authenticity to the user while also maintaining complete deniability: since no one but the user's app knows the MAC key, the MAC itself is meaningless noise once the user deletes the key.
Why does the server need to authenticate proximity events? The only concern I can see is a DDoS attack on the user: flooding the server with fake proximity events, which it will forward to the user's app, which must then process and discard them.
This also means that a contact event is deniable, even to the server, unless the user decides to admit that a contact event is genuine. This adds another type of privacy.
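A sketch of that client-side check, with the 24-hour key rotation and two-week retention described above (the storage and wire formats here are assumptions):

```kotlin
import java.security.MessageDigest
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Sketch: the app keeps its last 14 daily MAC keys, known to nobody else. When the
// server relays proximity events, the app recomputes the MAC with the key for the
// relevant day and silently drops anything that does not match.
class ProximityEventChecker(private val dailyKeys: Map<Int, ByteArray>) {

    fun isGenuine(dayIndex: Int, payload: ByteArray, mac: ByteArray): Boolean {
        val key = dailyKeys[dayIndex] ?: return false // key already deleted: drop the event
        val expected = Mac.getInstance("HmacSHA256").run {
            init(SecretKeySpec(key, "HmacSHA256"))
            doFinal(payload)
        }
        return MessageDigest.isEqual(expected, mac)
    }
}
```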
I'm pasting this message in every active GitHub issue, so you may receive duplicate notifications.
Today, I'm happy to announce that NHSX has released the full git commit history for the Isle of Wight Beta apps.
As discussed, we have redacted API keys, sensitive domain names, and some of the developers' personal details. I am still waiting on final approval to publish the server-side code.
I would like to personally thank the community for your comments, bug reports, and vulnerability disclosures. They all went into helping the development process.
The beta trial of this app has now ended and we've moved to the next phase of app development. It is our intention to publish the source code of future apps as the binaries are released to the public.
Once again, thank you for being part of this.
Terence Eden
Head of Open Technology - NHSX
Describe the bug
From examining https://github.com/nhsx/COVID-19-app-Android-BETA/blob/master/app/src/main/java/uk/nhs/nhsx/sonar/android/app/registration/ResidentApi.kt#L46 it appears that the app relies on an external HTTP server to generate a key pair, which it then stores.
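Paraphrasing the linked code (the type and field names below are illustrative rather than copied from the file), the registration response hands the device server-generated key material, which the app simply persists:

```kotlin
import org.json.JSONObject

// Illustrative paraphrase only; see ResidentApi.kt in the repository for the real
// implementation. The important point is that the key material arrives from the
// server and is simply persisted by the app.
data class Registration(val id: String, val secretKey: String) // field names assumed

fun handleRegistrationResponse(json: JSONObject, storage: MutableMap<String, String>): Registration {
    val registration = Registration(
        id = json.getString("id"),              // assumed field name
        secretKey = json.getString("secretKey") // assumed field name
    )
    storage["sonar-id"] = registration.id            // hypothetical storage keys
    storage["sonar-secret-key"] = registration.secretKey
    return registration
}
```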
Since this key pair was generated by an external service, the "private" key is, by definition, not private.
This implementation flaw is separate from the basic design flaw of any centralised approach to contact tracing. The basic design flaw allows a government to trace the movements and meetings of its citizens. This implementation flaw additionally allows the government to forge records of such movements and meetings, and to create valid digital signatures for the forged records.