Closed by lukpueh 3 years ago
A very basic implementation could look something like this:

```python
# in securesystemslib
class Signer:
    @abc.abstractmethod
    def sign(self, payload: bytes) -> Signature:
        raise NotImplementedError

class SSlibSigner(Signer):
    def __init__(self, sslib_private_key):
        self.key = sslib_private_key

    def sign(self, payload):
        return sslib_keys.create_signature(self.key, payload)

class GPGSigner(Signer):
    def __init__(self, gpg_keyid):
        self.keyid = gpg_keyid

    def sign(self, payload):
        return sslib_gpg.create_signature(payload, self.keyid)


# in TUF
class Metadata:
    def sign(self, signer: sslib.Signer):
        signer.sign(self.signed.to_canonical_bytes())


# in Warehouse
class VaultSigner(Signer):
    def sign(self, payload):
        ...  # you get the idea

metadata = tuf.Metadata(...)
metadata.sign(VaultSigner(...))
```
> `class VaultSigner(Signer):`

Great idea. I would add only that it might be useful to add in SSLib itself...

> I would add only that it might be useful to add in SSLib itself...

Agreed.
Thank you for the detailed issue @lukpueh.
Bit of an aside, but do you imagine the GPG and HSM signing interfaces in securesystemslib becoming implementations of the interface we create here? That seems logical to me, but I'm not familiar with the GPG and HSM signing interfaces.
> Bit of an aside, but do you imagine the GPG and HSM signing interfaces in securesystemslib becoming implementations of the interface we create here?
I would suggest so, yes. Do you have reservations?
No reservations, just checking whether my expectation was reasonable. Thank you.
I'd like to add some ontological thoughts here. In particular, whether the `Signer` class could instead be a `SigningKey` class, given that a `Signer`, as described above, is associated with exactly one key (or key identifier) anyway.
What speaks against combining Signer and Key in one class: In pure modeling terms a key data container and an entity that implements a signing protocol are quite different things. Also, a private key might be used with different signers (protocols, algorithms, schemes, etc.), which would require duplicating key data on multiple signers.
What speaks for it: `securesystemslib` needs private keys only for their signing capabilities, and it is unlikely that any given private key will be used with different signers during its in-memory lifetime.
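To make the trade-off concrete, here is a minimal sketch of the two modeling options; the class and field names are purely illustrative, not taken from securesystemslib:

```python
# Option A: key data and signing protocol kept separate
class PrivateKeyData:
    def __init__(self, pem: str):
        self.pem = pem  # pure data container, no behavior

class RsaPssSigner:
    def __init__(self, key: PrivateKeyData):
        self.key = key  # the same key data could back a differently-schemed signer

    def sign(self, payload: bytes) -> bytes:
        raise NotImplementedError  # rsassa-pss over payload with self.key


# Option B: key data and signing protocol combined in one class
class SigningKey:
    def __init__(self, pem: str, scheme: str):
        self.pem = pem
        self.scheme = scheme  # the scheme becomes key state

    def sign(self, payload: bytes) -> bytes:
        raise NotImplementedError  # dispatch on self.scheme
```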
Outlook: Maybe we should at least tentatively explore the expected use cases for private keys in a new repository library (#1136).
I have already summarized the main tasks for keys (private and public!) in https://github.com/secure-systems-lab/securesystemslib/issues/310, albeit with a focus on public keys, which need to implement serialization to and deserialization from TUF metadata format (https://github.com/secure-systems-lab/securesystemslib/issues/308) and verification. Both may or may not be tied to the protocol that also implements the signing.
@trishankatdatadog, maybe you can share your experience from implementing tuf-on-a-plane?
> What speaks against combining Signer and Key in one class: In pure modeling terms a key data container and an entity that implements a signing protocol are quite different things. Also, a private key might be used with different signers (protocols, algorithms, schemes, etc.), which would require duplicating key data on multiple signers.
I am asking myself: does the `Key` class have meaning outside the signing operations? If not, then probably it makes sense to keep it there?

Also, I am not sure if there is a modeling contradiction. `SignerKey` will be a container class for key data and at the same time that class would have functionality only related to that data, so they are already in a sense strongly connected.

I am not sure: if we have a `SignerKey` for the signing operations, then how are we going to do the verification operations? There is no sense in having a separate `VerifierKey` class.

Maybe we can look at it the other way around and have a `Key` class with signature and verification functionality? This seems more logical to me.
> @trishankatdatadog, maybe you can share your experience from implementing tuf-on-a-plane?
It's a good question. I didn't implement any signing interface, only verification interfaces. But, if it helps, what I found is that rather than designing on pen and paper, writing, trying, and feeling the code worked out very well. So try writing what comes naturally, and see if it makes sense? I know I sound like a mystic, but that's what worked for me.
Thanks for your comments, @MVrachev and @trishankatdatadog!
> SignerKey will be a container class for key data and at the same time that class would have functionality only related to that data, so they are already in a sense strongly connected.
Yes they are, but there are some variables that change for the signer but not for the key, such as padding, hashing, scheme, etc. To me it feels a bit odd to modify the key object, in order to use a different signing scheme, e.g.:
```python
k = SigningKey(...)  # Initialize once e.g. with RSA secret exponent
k.scheme = "rsassa-pss-sha256"
k.sign(data)  # Sign using one scheme
k.scheme = "rsa-pkcs1v15-sha512"
k.sign(data)  # Sign using another scheme
```
But maybe switching schemes during the key's lifetime is just not a use case, and even then, the odd looks alone don't seem to be a strong argument against it.
An alternative would be to accept these variables as parameters to `SigningKey.sign`. But given that our `Metadata.sign` will call the `sign` method of whichever `SigningKey` subclass is passed, the interface needs to be generic for all key types and as simple as possible.
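A hedged sketch of that alternative, assuming a hypothetical RSA-specific subclass; it also illustrates why a generic caller cannot easily use such key-type-specific arguments:

```python
class RSASigningKey:
    def __init__(self, private_pem: str):
        self.private_pem = private_pem

    def sign(self, payload: bytes, scheme: str = "rsassa-pss-sha256") -> bytes:
        # padding and hash are chosen per call instead of mutating the key object
        # (placeholder: a real implementation would apply the chosen scheme)
        return b""

k = RSASigningKey("-----BEGIN RSA PRIVATE KEY-----...")
sig1 = k.sign(b"data", scheme="rsassa-pss-sha256")
sig2 = k.sign(b"data", scheme="rsa-pkcs1v15-sha512")

# but a generic caller like Metadata.sign can only do:
#   signer.sign(payload)
# i.e. it cannot pass key-type-specific keyword arguments
```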
> Maybe we can look at it the other way around and have a Key class with signature and verification functionality?
Yes, but I think there is an argument for separating public and private key in the model. If they were combined, we'd often have half-empty objects, e.g. on the client where only the public parts are available. And even in the signing context we only really need the private portion.
Furthermore, given that public keys are included in TUF metadata their class representation should look similar to the metadata representation for recognizability and have a focus on type/schema checking and serialization tasks. Whereas we care a lot less about what the private key class looks like as long as it can be used for signing.
Does this make sense? I have a feeling that hands-on @trishankatdatadog might say, I'm overthinking this. :P
> Does this make sense? I have a feeling that hands-on @trishankatdatadog might say, I'm overthinking this. :P
This requires some meditation TBH. There are generic cryptographic algorithms and then there are specific schemes. What code do we expect delegators and delegatees to write? I would approach it that way.
> Does this make sense? I have a feeling that hands-on @trishankatdatadog might say, I'm overthinking this. :P
It makes sense and those are valid points if you ask me.
After all of your thoughts, I feel like the best option is to have a `Key` class in `tuf/metadata/api` representing exactly what is written in the spec and a different `Signer` class which contains a `Key` class instance, basically the way you initially proposed it.
That way you have a model literally representing a key the way it's defined in the spec, you can easily change the key schema, won't have the problem with half-empty objects, and have a logical encapsulation for each of the classes.
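A rough sketch of that split, assuming the key fields from the TUF specification (keytype, scheme, keyval) and otherwise illustrative names:

```python
class Key:
    """Mirrors the public key representation from the TUF specification."""
    def __init__(self, keyid: str, keytype: str, scheme: str, keyval: dict):
        self.keyid = keyid
        self.keytype = keytype
        self.scheme = scheme
        self.keyval = keyval  # e.g. {"public": "..."}


class Signer:
    """Holds a Key instance plus whatever private material the backend needs."""
    def __init__(self, public_key: Key, private_value: str):
        self.public_key = public_key
        self.private_value = private_value

    def sign(self, payload: bytes) -> bytes:
        raise NotImplementedError
```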
Additional things we discussed with @jku:

What should `Signature` be, or the return type of the `sign()` operation? Jussi looked into what the return value of some of the possible implementors of the `Signer` interface is:

- Default implementation: return value of `keys.create_signature()` is a dict of keyid and bytes
- Gpg: Return value is also a dict: it is used to return multiple other pieces of data as well -- it is unclear if these are actually useful to `create_signature()` callers or only for intermediate processing before that
- HSM: Returns a dict like the generic `keys.create_signature()`
- Vault: Vault signature is `vault:<key-version-number>:<base64-encoded-signature>` -- should be usable as bytes
  - https://www.vaultproject.io/api/secret/transit#sign-data
  - https://hvac.readthedocs.io/en/stable/usage/secrets_engines/transit.html#sign-data

So, to summarize, it seems achievable to return `bytes` instead of a custom `Signature` object, for which it is unclear what it should be.
> So, to summarize, it seems achievable to return bytes instead of a custom Signature object, for which it is unclear what it should be.

That does make sense. But then every `SignerKey` must also expose a keyid, because the caller is likely to store the signature bytes together with a keyid; otherwise the signature can't be mapped to a public key for later verification.
```python
class Metadata:
    ...
    def sign(self, signer):
        # NOTE: Let's assume signatures are still in the old/current dictionary format
        self.signatures.append({
            "keyid": signer.keyid,
            "sig": signer.sign(self.signed.to_canonical_bytes())
        })
```
I think your proposal makes the `SignerKey.sign` interface a bit cleaner, but the overall `SignerKey` class a bit less blackboxy.
> Gpg: Return value is also a dict: it is used to return multiple other pieces of data as well -- it is unclear if these are actually useful to create_signature() callers or only for intermediate processing before that
Yeah, unfortunately these "other_headers" are required for verification. Although it might be possible to return them as part of the signature bytes and then parse upon verification.
EDIT: FYI, this is where we parse GPG signature packets.
...custom fields like the one on gpg signatures might also be an argument for `SigningKey.sign` returning a complex signature object.
> It makes sense and those are valid points if you ask me. After all of your thoughts, I feel like the best option is to have a `Key` class in `tuf/metadata/api` representing exactly what is written in the spec and a different `Signer` class which contains a Key class instance, basically the way you initially proposed it. That way you have a model literally representing a key the way it's defined in the spec, you can easily change the key schema, won't have the problem with half-empty objects, and have a logical encapsulation for each of the classes.
It seems I misunderstood @lukpueh. Lukas seems to mean combining `Signer` and the private part of the key.

I am only not sure about the `SignerKey` name. It feels like a strange concatenation of the functionality it provides (signing) and the data it stores (key data). Probably it makes sense just to call it `Signer`, as it was originally proposed, and it will have attributes representing the key data.
> That does make sense. But then every `SignerKey` must also expose a keyid, because the caller is likely to store the signature bytes together with a keyid; otherwise the signature can't be mapped to a public key for later verification.

You have a point here. We can indeed create a `Signature` class and store both the signature bytes and the keyid in it. Then we would return an instance of it when calling `sign()`.

We can also use this `Signature` to add functionality connected to the signature. Like verification operations?
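A minimal sketch of such a container, assuming only the two fields discussed here:

```python
from dataclasses import dataclass

@dataclass
class Signature:
    keyid: str
    signature: bytes  # raw signature over the canonical payload

# a Signer implementation would then end its sign() with something like:
#   return Signature(keyid=self.keyid, signature=raw_sig_bytes)
```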
> Probably it makes sense just to call it Signer, as it was originally proposed, and it will have attributes representing the key data.

I'm fine with `Signer`. :)

> We can indeed create a `Signature` class and store both the signature bytes and the keyid in it. Then we would return an instance of it when calling `sign()`.
SGTM! :)
> We can also use this `Signature` to add functionality connected to the signature. Like verification operations?

I'm a bit unsure about that. I think in most cases `sign` can just return a generic `Signature` with a keyid and some raw signature bytes, independently of the key type, scheme, etc. used to create the signature. If we implement `verify` on the `Signature` class it has to account for all those variables, or, more likely, we'd use different `Signature` subclasses for different `verify` implementations.

As a consequence a concrete `Signer` subclass implementation must know which concrete `Signature` subclass to return, in order to provide the correct `verify` method. I don't know if we want to put that extra responsibility on the `Signer`. To me, this feels more like a task for the public key.
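A sketch of that split, with verification living on a hypothetical public key class rather than on `Signature` (all names illustrative):

```python
class PublicKey:
    def __init__(self, keyid: str, scheme: str, public_value: str):
        self.keyid = keyid
        self.scheme = scheme
        self.public_value = public_value

    def verify(self, signature: "Signature", payload: bytes) -> bool:
        # key-type- and scheme-specific verification lives here,
        # so Signature can stay a dumb, generic container
        raise NotImplementedError
```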
> I'm a bit unsure about that. I think in most cases `sign` can just return a generic `Signature` with a keyid and some raw signature bytes, independently of the key type, scheme, etc. used to create the signature. If we implement `verify` on the `Signature` class it has to account for all those variables, or, more likely, we'd use different `Signature` subclasses for different `verify` implementations.
>
> As a consequence a concrete `Signer` subclass implementation must know which concrete `Signature` subclass to return, in order to provide the correct `verify` method. I don't know if we want to put that extra responsibility on the `Signer`. To me, this feels more like a task for the public key.
Yes, you are right. Those arguments are logical.
Today we had a quick discussion with @jku about the `Signer` return type and he shared some concerns about returning our own `Signature` type. Jussi, can you please add your opinion to the thread?
> > As a consequence a concrete `Signer` subclass implementation must know which concrete `Signature` subclass to return, in order to provide the correct `verify` method. I don't know if we want to put that extra responsibility on the `Signer`. To me, this feels more like a task for the public key.
>
> Yes, you are right. Those arguments are logical.
>
> Today we had a quick discussion with @jku about the `Signer` return type and he shared some concerns about returning our own `Signature` type.
I think it was mostly that it seemed like an object isn't needed in this case at all (I'm assuming that usually when you sign something you are doing that to create metadata for the purpose of serializing it: an array of bytes is fine for that)?
Also, I don't know what the opposite case (verifying portions of metadata while de-serializing it) is going to look like, and that seemed more relevant for a potential Signature object? As in, how is the Metadata API used with e.g. Vault in that case?
@jku, this is roughly how canonicalization, signing/verifying and de/serialization currently play together:
```
# On the server
1. create canonical bytes for payload
2. create signature over canonical bytes
3. serialize non-canonical payload + signature(s)

# On the client
1. deserialize non-canonical payload + signature(s)
2. create canonical bytes for payload
3. verify signature over canonical bytes
```
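In rough, partly hypothetical Metadata API terms the same flow could be sketched like this; `to_canonical_bytes` comes from the sketch at the top of this issue, the other helper names are made up:

```python
# On the server
payload_bytes = metadata.signed.to_canonical_bytes()   # 1. canonical bytes for payload
signature = signer.sign(payload_bytes)                  # 2. signature over canonical bytes
metadata.signatures.append(signature)
write_to_disk(metadata.to_json())                       # 3. serialize payload + signature(s)

# On the client
metadata = Metadata.from_json(read_from_disk())         # 1. deserialize payload + signature(s)
payload_bytes = metadata.signed.to_canonical_bytes()    # 2. canonical bytes again
for sig in metadata.signatures:                         # 3. verify over canonical bytes
    trusted_keys[sig.keyid].verify(sig, payload_bytes)
```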
Note that this might change in the future. Currently we are discussing an alternative protocol, which does not require parsing the full payload before verifying the signature (see Secure Systems Lab signing specification).
Especially in light of that new proposal and the likely short-livedness of signature objects, I agree with you that a class model might not be needed.
However, I think a class helps to clearly define the required signature format, even if it only consists of two fields (keyid and signature bytes). And it also aligns with our decision in ADR 4 to create classes for all complex metadata attributes, e.g. in order to add self-validation behavior.
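For illustration, such a class could be as small as the following sketch; the "keyid"/"sig" field names match the signature dictionary format used earlier in this thread, everything else is an assumption:

```python
class Signature:
    def __init__(self, keyid: str, sig: str):
        if not keyid or not sig:
            raise ValueError("keyid and sig are required")  # simple self-validation
        self.keyid = keyid
        self.sig = sig

    @classmethod
    def from_dict(cls, signature_dict: dict) -> "Signature":
        return cls(signature_dict["keyid"], signature_dict["sig"])

    def to_dict(self) -> dict:
        return {"keyid": self.keyid, "sig": self.sig}
```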
I have pushed the initial version of the signing interface and the necessary TUF changes to support it in the PRs linked just above this comment.
While working on SSlibSigner and GPGSigner I realized that if we want to store the necessary information to verify the signatures with `securesystemslib.keys.verify_signature()` and `securesystemslib.gpg.functions.verify_signature()`, we would need to add more key metadata in Signature.

For the SSlibSigner, we would need to add the key_dict: https://github.com/secure-systems-lab/securesystemslib/blob/dff4425e5663c58c954447a698efb17c4b23b0f8/securesystemslib/keys.py#L738

and for the GPGSigner, we would need to add the pubkey info: https://github.com/secure-systems-lab/securesystemslib/blob/dff4425e5663c58c954447a698efb17c4b23b0f8/securesystemslib/gpg/functions.py#L170

the way we are doing it in the tests: https://github.com/secure-systems-lab/securesystemslib/blob/dff4425e5663c58c954447a698efb17c4b23b0f8/tests/test_gpg.py#L571
Because of the above problem, I am not sure if I can create a relationship between the `Signature` class and the `GPGSignature`. Before, I thought I could just inherit from the `Signature` class used by the SSlibSigner and add specific fields related to GPG, but I am not that sure anymore.
With the observations from the above comment, I made the `Signature` class like this:

The problem is that for `GPGSignature` I won't need the `self.keydict = keydict` field, or it will be totally different. The pubkey info could be one of three other schemas, and each one of them comes with its own specifics...
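A hedged sketch of what such a split might look like, based only on the fields mentioned in this thread (keyid and signature, the sslib keydict, and the GPG pubkey info and other_headers); none of these class or field names are final:

```python
from dataclasses import dataclass, field

@dataclass
class Signature:
    keyid: str
    signature: str  # encoded signature value

@dataclass
class SSlibSignature(Signature):
    keydict: dict = field(default_factory=dict)  # sslib key metadata needed for verification

@dataclass
class GPGSignature(Signature):
    other_headers: str = ""                          # extra GPG signature data needed for verification
    pubkey_info: dict = field(default_factory=dict)  # one of several possible pubkey schemas
```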
The logical solution would be to create a separate `Signature` interface implemented by `SSlibSignature` and `GPGSignature`, but do we want yet another interface? What should we do from here, @lukpueh?
@MVrachev, I thought you agreed with my arguments above for not implementing verification as a method of the `Signature` class? (And even if we did, I wouldn't store the public key on the `Signature` object but rather pass it to the verification method as an argument.)
Yes, we agreed that we would do the verification elsewhere.
I just thought we wanted to store all of the information needed for the signer verification in the Signer class.

We discussed that with Lukas and we would store only the `keyid` and the signature. The `keyid` will give us the necessary additional information for the verification.
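A sketch of how the stored keyid would then be used on the verification side, assuming a mapping of trusted public keys indexed by keyid (names illustrative):

```python
def verify_metadata_signature(signature, canonical_payload: bytes, trusted_keys: dict) -> bool:
    # the keyid stored next to the signature is enough to look up the public key
    # (and with it the key type and scheme) needed for verification
    public_key = trusted_keys[signature.keyid]
    return public_key.verify(signature, canonical_payload)
```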
I will add a `GPGSigner` in another PR, but I decided to mark this issue as fixed in my PR #319 because it fixed the main point of this issue: creating a `Signer` interface and a securesystemslib implementation of it.
Description of issue or feature request: Allow TUF integrators to easily use custom metadata signing implementations instead of the currently supported securesystemslib one.
An urgent use case for this is the PEP458/Warehouse integration, where metadata must be signed with keys stored in a Hashicorp Vault and the implementation is already available.
NOTE: This feature request only concerns the `tuf.api.Metadata.sign` method and not the legacy signing facilities `write` and `writeall` in repository tool.

Current behavior: `tuf.api.Metadata.sign(key, ...)` calls `securesystemslib.keys.create_signature` with the passed sslib/json-formatted private key.

Expected behavior: `tuf.api.Metadata.sign` takes an abstract signer parameter, which implements signing. A default signer may resolve to the currently used `securesystemslib.keys.create_signature`.

Implementation considerations:
- Different signers require completely different "keys", e.g. the currently used generic implementation requires an sslib/json-formatted private key, whereas signing implementations that don't have direct access to the key may only need a keyid (see GPG or HSM references below). As a consequence the key parameter in `Metadata.sign` should probably be encapsulated within the signer.
- The abstract signer interface must strictly define the format of the returned signature for metadata interoperability.
- The abstract signer interface design should also keep in mind other key-related tasks, such as signature verification, public key export to sslib metadata format, and keyid generation.
- `securesystemslib.keys.create_signature` is already an abstract interface, where the json-formatted private key parameter format is generic enough to hold different key types and to choose one of many supported signing algorithms, identified by the key's `scheme` field (see public key metadata format reference below). However, it is not suited to extend to custom implementations. A new abstract signer interface should integrate well with the existing securesystemslib infrastructure and/or aim to replace it. It should not become yet another co-existing interface and/or layer of abstraction (see public API overview reference below).

Related work and references: (for the very ambitious reader)
- securesystemslib key overview (to coordinate this feature request with the existing infrastructure)
- Recently added abstract filesystem interface (for design inspiration) -- https://github.com/theupdateframework/tuf/issues/1009
- @trishankatdatadog's Hashicorp Vault interface POC (as reference for PEP458/warehouse implementation and for design inspiration) -- VaultKey
- Abstract pyca/cryptography key interface with sign method (for design inspiration) -- RSAPrivateKey
- "Historical" external signing API feature request (has some interesting discussion and designs but targets legacy TUF codebase) -- https://github.com/theupdateframework/tuf/issues/864
- in-toto metadata model sign methods (alternative but less flexible approach to support different signing implementations)