Open ZmnSCPxj opened 5 years ago
Some more thoughts.
hsmtools
works by hacking into hsm_secret
, and in the future the hsm_secret
may now live in a separate, actual hardware module, as suggested here: https://lists.ozlabs.org/pipermail/c-lightning/2019-December/000172.html . Thus any crypto operation that requires access to the HSM secret should be added to the interface of hsmd
.Thanks @ZmnSCPxj for bringing this up, I quite like the idea of complementing
signmessage
/checkmessage
with an encryptmessage
/decryptmessage
variant. I think we can also reuse a lot of the ECDH + HKDF + ChaCha20
functionality from the sphinx onion packet construction. That'd mean we can
simply re-use the parameters used in sphinx (regarding algorithm and curve
selection) and re-use the hsmd
functionality to derive a key using ECDH and
the ephemeral key.
I definitely share your concern about having a plugin touch the hsm_secret
so it should be implemented as a JSON-RPC instead, and go through the hsmd
control to generate the shared secret used for decryption. I don't think the
entire thing needs to run in the hsmd
rather we just ask it to generate the
shared secret, and implement the remainder in lightningd
, just like we do
for the sphinx packet.
Unlike the sphinx construction, instead of the manually managed HMAC, I'd prefer to just use the commonly used combination ChaCha20-Poly1305 which includes the MAC internally. This should reduce the number of moving pieces we need to shuffle around.
lightning-encrypted-message-v0
?version | ciphertext | tag
if we decide not to allow custom AADs, otherwise we'd
need to hand that around as well.Hex-encoding is mildly undesirable as you have to push 2 bytes through the RPC for every byte you want to encode.... consider how sometimes we need, for the backup system, to occassionally "roll up" some queries and push them into a "real" database -- the "real" database is then have to be decrypted and re-encrypted on-disk. But we might not have a choice, so ---
Hex-encoding is mildly undesirable as you have to push 2 bytes through the RPC for every byte you want to encode.... consider how sometimes we need, for the backup system, to occassionally "roll up" some queries and push them into a "real" database -- the "real" database is then have to be decrypted and re-encrypted on-disk. But we might not have a choice, so ---
Agreed, hex-encoding should definitely not be the only way to encrypt for human readable text UTF-8 encoding is more than sufficient, and the same applies for uncompressed SQL queries. The problem arises when we have non-human readable input such as compressed SQL queries, arbitrary hashes, etc. I don't think we have much choice in those cases, but we can make it more efficient for human readable inputs.
Though I may be overthinking this: we could just hex-encode everything, i.e., treat everything as binary, and always decode on the RPC side. Maybe we should just start with this trivial case, and if we see people bumping against having to hex-encode even human-readable contents, we can add a secondary path that doesn't hex-decode on the RPC side.
My main concern actually is with encrypting binaries, such as the aforementioned "entire copy of the database", which has to be stored encrypted as well. How about using base64, which at least mildly reduces the size overhead relative to hex? Either that or exposing some kind of derivesymmetrickey
given the ephemeral point for the encrypted blob, so that the RPC at least is not overstressed with passing a large binary blob through (and implement asymmetric encryption/decryption as a plugin!)... But that can be later, I think implementing this is more important now than later.
I don't think that using base64 encoding is a substantial change to hex-encoding. We already have a large(-ish) number of encodings (hex, zbase32, base58, ...) and adding base64 would only give us about a 33% saving over hex anyway.
I do like your idea of doing a keys-only mode, in which we generate the shared secret and optionally a random ephemeral key, and then let the caller stream-encrypt the actual data. This way the actually encrypted data never has to fit in memory, and we don't have to pass it to lightningd
at all. In this case we'd need to make sure the hkdf
used to derive the shared secret is tagged with something that makes it uneligible for other uses.
I don't think that using base64 encoding is a substantial change to hex-encoding. We already have a large(-ish) number of encodings (hex, zbase32, base58, ...) and adding base64 would only give us about a 33% saving over hex anyway.
We already have base64 for tor irrc EDIT: ah maybe you mean "adding to use" and not "adding to the code"?
Adding to use I suppose.
Though with an exposed derivesymmetrickey [ephemeralkey]
we can move decryptblob
/ encryptblob
to a plugin, which is always cool because plugins are cool, plugins are love, plugins are life.
It seems to me that the minimum necessary inside lightningd
would be to expose ECDH on secp256k1 over the JSONRPC, to be used as the key agreement function, passing in an ephemeral pubkey, then leaving the rest of the ECIES (key derivation function, stream encryption) to the caller. The rest can be a plugin.
We already have hsm_ecdh_req
and hsm_ecdh_resp
messages to/from hsmd
, which uses the node pubkey as the point input to ECDH. As far as I can tell, the ECDH used is:
I suppose it is safe to expose this, but what it returns seems to implicitly have an HKDF as well..... ----- ? Well, I am no cryptographer, so...
Okay, I did a little research, and apparently the standard for ECDH is to take the X coordinate of the resulting point, and nothing else. What we do by default in libsecp256k1
is to prepend a 0x01 byte to the X coordinate, then hash ti with 256-bit SHA-2.
So I suggest the changes:
hsmd
and the rest of C-lightning to return just the X coordinate, not its hash.common/
module to request to hsmd
, and perform the hash of 0x01 followed by the X coordinate outside of the hsmd
.
This replaces the current ECDH request message.getecdh point
command.The rest (HKDF, stream crypto) can be done by a plugin, possibly built-in.
Also ping @ksedgwic and @devrandom as it changes the hsmd
interface and they are interested in it.
Thoughts? Objections?
Aaaaah no I completely misunderstood ecdh_hash_function_sha256
.
It encodes the resulting point in a DER-encoded point (i.e. 0x02
/ 0x03
followed by X coordinate). Then it hashes. Bleah. So ----
hsm_ecdh_req
to send the DER-encoded point as response.connectd
and channeld
, the two consumers of ECDH.getecdh point
that returns a DER-encoded point.Looking at the BOLT spec, it seems the sha256(der_encoding(priv * Point))
is considered part of the key agreement schema. However, looking at SECG SEC-1 section 3.3.1, it seems the ECHD schema it uses is (priv * Point).x
. There is also a "cofactor" variant to the ECDH in the SECG SEC-1 3.3.2, I have no idea what it is for, not being a cryptographer. To support both I guess we should just return der_encoding(priv * Point)
from the getecdh
command: to get the BOLT key agreement we sha256
it, to get the SECG SEC-1 key agreement we drop the first byte (because the DER encoding is just 0x02/0x04 depending on the sign of y followed by the x coord).
Crypto standardization is hard. Let's go shopping!
Propose addition of new command:
decryptblob
Decrypts a hex-encoded blob using an ephemeral public key and the node private key.
To encrypt:
To decrypt,
hsmd
will recover the key state by multiplying its node private key with the ephemeral public key, generating the same point which when hashed generates the same Chacha20 state.(exact details subject to change, we just need some method of asymmetric encryption, which is easiest done using symmetric encryption and ECDH, and which is difficult to hack with known-plaintext attack.)
Use case is for
db_write
hooks that need to store the database queries in a server that is only trusted to store the data (but not trusted to leak it to everybody else). Thedb_write
hook could encrypt on-the-fly before responding to thedb_write
hook.On recovery, we only need:
hsm_secret
We replace the
hsm_secret
with the recoveredhsm_secret
, then restartlightningd
with--offline
, thendecodeblob
each point-ciphertext pair and extract each database query on a fresh database. Then stop thelightningd
and replace its database.We may need to have a better way to push binaries through the RPC though.
Thoughts?