Open timbru opened 2 years ago
Things to consider:
There are three separate entity types to consider. Which should be reported, and how much detail about each belongs in a high-level status versus a detailed status endpoint for a single signer?
1) the signer configuration in krill.conf, which has a "type", a specified or generated "name", and a status ("pending", "active" or "unusable"). If pending, when did we last try to connect and what, if anything, was the reason for failure? If unusable, why was it deemed unusable?
2) the signer mapper SignerInfo details written to signers/<UUID>/snapshot.json, to which a given signer config block is "bound" when the config enters the "active" state (not by "name", as the signer config name is not required to be unique). The SignerInfo has an identity key and knows zero or more keys by their key identifier, along with the signer backend specific internal key ID that each maps to.
3) the actual signer backend, which may be software (OpenSSL, SoftHSM or PyKMIP for example) or hardware, and which has its own status. For a KMIP backend, for example: when did we last exchange a request/response with it, was the response a success or a failure, and if a failure, what kind of failure? We may even be interested in the last status (success or failure) per signer operation (e.g. create key, sign, delete key, etc.), or counts of failures/successes per operation, or ...? This relates closely to the question of what we would want to expose as Prometheus metrics (see #578 and #581). We may want to report not just the configured name for the signer or its UUID but also, or instead, its actual make, model, version number and connection properties (e.g. FQDN). For a long-running Krill instance, the signers in the config file don't necessarily match the signer(s) Krill is actually connecting to.
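To make the three entity types easier to discuss, here is a minimal sketch of how they could be modelled. It is written in Python purely for illustration; every class and field name below is hypothetical and does not reflect Krill's actual types.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class SignerStatus(Enum):
    PENDING = "pending"
    ACTIVE = "active"
    UNUSABLE = "unusable"


@dataclass
class SignerConfig:
    """Entity 1: a signer block from krill.conf (hypothetical field names)."""
    signer_type: str
    name: str  # specified or generated; not required to be unique
    status: SignerStatus = SignerStatus.PENDING
    last_connect_attempt: Optional[str] = None  # relevant while pending
    reason: Optional[str] = None  # why the last attempt failed, or why unusable


@dataclass
class SignerInfo:
    """Entity 2: signer mapper state, as stored per signer UUID."""
    handle: str  # the UUID
    identity_key: str
    # Krill key identifier -> backend-specific internal key ID
    keys: Dict[str, str] = field(default_factory=dict)


@dataclass
class BackendStatus:
    """Entity 3: what was last observed about the backend itself."""
    description: str  # e.g. make, model, version, FQDN
    last_exchange: Optional[str] = None
    last_result: Optional[str] = None  # "success" or a failure kind


def summary(config: SignerConfig, info: Optional[SignerInfo]) -> dict:
    """High-level view: the handle is included only once the signer is bound."""
    out = {
        "name": config.name,
        "type": config.signer_type,
        "status": config.status.value,
    }
    if config.status is SignerStatus.ACTIVE and info is not None:
        out["handle"] = info.handle
    return out
```

A high-level status could be built from `summary()` alone, while a detailed single-signer endpoint could additionally expose the `SignerInfo` key mapping and the `BackendStatus` fields.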
When exposing these entities and their properties in a public endpoint we then have to decide on good names for them. The names currently used in the code may not be the most intuitive to an end user.
Should querying the status also trigger an attempt to bind any pending signers whose backend is now ready? If not, pending signers will remain unbound until first use. Related to this: should there be an explicit command to "test" the connection to configured signers, or does the status command, by triggering a connection attempt, fulfill this use case as well? (see #572).
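The two options can be sketched as a single status function with an opt-in probe. This is only an illustration in Python: `refresh_statuses`, `try_bind` and the `probe` flag are hypothetical names, not existing Krill functions or krillc options.

```python
from typing import Callable, List, Tuple

# (name, status) pairs; a list is used because names need not be unique.
SignerState = Tuple[str, str]


def refresh_statuses(
    signers: List[SignerState],
    try_bind: Callable[[str], bool],
    probe: bool,
) -> List[SignerState]:
    """Return the statuses to report.

    With probe=False the stored statuses are reported untouched, so pending
    signers stay pending until first use. With probe=True each pending signer
    gets one bind attempt, which doubles as a connection test.
    """
    result = []
    for name, status in signers:
        if probe and status == "pending" and try_bind(name):
            status = "active"
        result.append((name, status))
    return result
```

Whether probing should be the default is exactly the open question: it makes the status output fresher, but it also means a read-only looking query has side effects on the signer backends.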
There doesn't appear to be any freely available tooling to query both KMIP and PKCS#11 signers (and certainly not the OpenSSL on-disk store Krill uses), so it may be useful to have krillc commands to list the set of keys (just those known to Krill, or just those with a Krill-like name, or all keys?) stored in the signer backend. (see #570)
The set of signer configs may be equal to, partially overlapping with, or even completely disjoint from the set of signer mapper SignerInfo stores. This is because a signer mapper SignerInfo store can become "orphaned": all configured signers are "bound" to a SignerInfo store, but there exist additional SignerInfo stores to which no configured signer is "bound". The keys known to those stores are then inaccessible to Krill. If Krill deletes keys once they are no longer relevant, any keys still recorded in an orphaned store are presumably still needed, "active" keys, yet they are unreachable because no configured signer points to the backend that contains them. This could happen if the config for an actively used signer backend is removed from the config file and Krill is restarted. Do we want to report such cases in a top-level signer status endpoint?
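Detecting orphaned stores amounts to a set difference between the stores on disk and the stores that configured signers are bound to. A minimal Python sketch, with a hypothetical function name and hypothetical inputs (Krill's real data structures will differ):

```python
from typing import Dict, Iterable, Optional, Set


def find_orphaned_stores(
    bound_handles: Iterable[Optional[str]],
    store_keys: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Return the SignerInfo stores that no configured signer is bound to.

    bound_handles: one entry per configured signer, None if not (yet) bound.
    store_keys: SignerInfo handle (UUID) -> key identifiers known to that store.
    """
    bound = {handle for handle in bound_handles if handle is not None}
    return {
        handle: keys
        for handle, keys in store_keys.items()
        if handle not in bound
    }
```

A status endpoint could list each orphaned handle together with the number of keys it still references, since those are the keys Krill can no longer reach.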
Should we attempt in plain text output to construct a single table with columns for the entity types and rows for the signer configs/signer store signerinfo items showing how they relate to each other, or should we output separate tables/info blocks per entity type, or some other kind of report?
When outputting JSON (i.e. via the API or via krillc --format json), what should the structure of that JSON be?
A very simple dumb starting point could be output of the form:
$ krillc signers status
Active signers:
- Fallback OpenSSL signer
Pending signers:
- Kryptus Cloud HSM
Unusable signers:
- My broken PKCS#11 configuration
And:
$ krillc signers status --format json
{
  "signers": [
    {
      "name": "...",
      "status": "(active|pending|unusable)",
      "handle": "UUID" [1]
    }
  ]
}
(the JSON here was modeled after the output of the CA list API endpoint)
[1] - The handle is only known when the signer is in the "active" state.
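As a rough illustration of how both proposed outputs could be produced from the same data, here is a Python sketch. The function names and the tuple layout are assumptions made for this example; only the output shapes come from the proposal above.

```python
import json
from typing import List, Optional, Tuple

# (name, status, handle); handle is None unless the signer is active.
Signer = Tuple[str, str, Optional[str]]


def to_text(signers: List[Signer]) -> str:
    """Group signer names under one heading per status."""
    lines = []
    for status in ("active", "pending", "unusable"):
        lines.append(f"{status.capitalize()} signers:")
        lines.extend(f"- {name}" for name, s, _ in signers if s == status)
    return "\n".join(lines)


def to_json(signers: List[Signer]) -> str:
    """Emit the proposed JSON; "handle" is omitted unless the signer is active."""
    items = []
    for name, status, handle in signers:
        item = {"name": name, "status": status}
        if status == "active" and handle is not None:
            item["handle"] = handle
        items.append(item)
    return json.dumps({"signers": items}, indent=2)
```

Omitting "handle" for non-active signers, rather than emitting null, is one possible choice; either would be consistent with footnote [1].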
Add an endpoint for signer statuses, and perhaps key ids per signer etc. for diagnostics