Federated Timesheets Milestone 6: Digital Signatures

Initial demo of how digital signatures can be used to sign timesheets by the worker, then timestamped by the server, then provenance verified by the final data recipient if digital signatures are found attached to the data received. This demo would not be production-ready, nor would it be implemented in all three systems from milestones 1), 2), and 3). It would mainly be meant as a first experiment, to document what would be needed in terms of public key discovery, possible implementation hurdles for digital signatures in various programming languages, and protocol design including data canonicalisation, to make all three systems aware of signatures in a possible second phase of this project.

abstract

Building on the work done in Milestone 2b: Export timesheets data and the Signed Journal Entries Prototype, we have now enabled outbound connectors that use HTTP to sign their messages using the HTTP Message Signatures IETF draft standard. This provides a verifiable chain of trust from the user who enters timesheet data to the messages sent to connected systems in the federation – and therefore supports verifiability of data elsewhere in the federation.

design

The application chosen for this prototype is timeld, because it already uses asymmetric cryptography for signing of user 'operations', according to the m-ld security traceability design. User operations against timesheets happen in the timeld Command-Line Interface, and are propagated to any other 'clones' of the timesheet. This includes the timeld Gateway, a service designed for deployment to the cloud, or to any server location.

This means that the Gateway is able to verify user operation signatures as they arrive, subject to the trust inherent in timeld's security model. However, signatures on operations are not suitable to be propagated to other federated systems, for the following reasons.

First, timeld does not integrate with third-party identity models or use a federated identity model. Instead, it manages users and their in-app credentials within its own architecture, as do many other cloud and enterprise applications. It would be possible to change this (and in future we are likely to), but for the purpose of this work we want to develop a federation model that is suitable for the majority of existing systems without forcing changes to their design.

The consequence of this choice is that user public keys are not available to other federation members through any supported mechanism; and therefore it is not possible for them to verify any signatures provided.

The second problem is that user signatures are applied to m-ld operations, which are part of its internal data synchronisation protocol. Deciphering what an operation means therefore requires intimate knowledge of m-ld. This is particularly distasteful because timeld provides data to other federated systems in their native format – a design principle established in the Federated Timesheets project's research. To require that these systems verify messages according to timeld's internal choice of protocol would cancel the advantage of this principle.

We considered the following solutions to this problem.

user signatures

With this option, data packets in target system format are signed by the user. Since the user's private key is only available for use on their personal devices, the data must either be created on their device or be round-tripped there for signing.

In the case that the data is created on the user device in the target system format, every federated system connector must be represented in the client code, probably as a plugin. Since connectors are typically very light on compute this is not likely to have any beneficial effect on distributing load; and it significantly complicates keeping clients up to date, as federation membership changes.

If data is created on the server and round-tripped to the client for signing, this involves network calls which are subject to the usual failure modes; these must be handled without loss of data. The calls are also characteristically server requests, which are not as well supported in the network application layer as client requests.

To allow for offline use, with both options it is necessary to establish queues: of data packets on the client, or of signing requests on the server, respectively. This complicates the distributed system.

For these complexity reasons, we rejected the option to have the user sign target system data packets.

certified translation

In electronic commerce and government, documents often require language translation – in many cases this is mandatory for acceptability. The translation may also need to be certified by the translator or an independent body. Adopting this approach, each target system 'translation' of a user operation on the data could be signed by the server, in this case the timeld Gateway.

In this model the Gateway acts as a self-certifying translator of user operations into target system messages. This must be accepted as part of the federation trust model, noting that in our current system the Gateway is also trusted as a data store and identity manager. If necessary, the role of certified translator could be moved to independent service components having the requisite trust.

While this model does not require the user keys to be available to the server, it does require an auditor to have access to the original untranslated user operations, which must still be verifiable. In timeld, this traceability is conveniently provided by the two already-developed features of the system:

User operations in native m-ld format are signed by the user and verified by the Gateway.
User operations are logged by the Gateway, to an independent logging service (if configured).

Again, we find that the Gateway must be trusted to verify user signatures and to log the operations correctly.

Generalising this model for federated bookkeeping:

Federated systems provide data to other systems in the target system format, signed by the system (not necessarily the user).
Federated systems must keep logs of user operations, for audit.

We chose this option to demonstrate in this milestone.

verifiable translation

A further refinement of the translation option would be for there to exist translation source code, and for the application of this code to be verifiable independently of the original translator. There is prior art for such verifiable code, in declarative encodings like XSLT, and blockchain smart contracts.

We note that in the case of timeld, this option is greatly complicated by the algorithms applied to user operations by the m-ld engine itself, which take as input not only the user operation but the pre-existing state. This is discussed in the traceability design. This will be much improved by the standardisation of the m-ld protocol; but still, the best way to ensure the correct operation of a m-ld engine will most likely always be to have a clone of the data in a trusted compute environment.

As we are satisfied with the trust model proposed in the above 'certified translation' option, we present this option as an idea for future research.

implementation

Since messages are pushed to target federated systems using HTTP POST, we chose to sign these messages using the HTTP Message Signatures IETF standard draft. This allows for not only signing of the message body, but also contextual information such as the URL and content type.

In accordance with best practices for key management, the Gateway's key pair for digital signing is not the same as the one used for its HTTPS API. It is made available to the server process via environment variables, process arguments or a configuration file. For the convenience of service deployers, we provide a new utility for generating the key pair (packages/gateway/genkey.mjs). For verification, the public key is publicly accessible in PEM format through a new REST end-point, /publicKey.
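As an illustration of what generating such a key pair involves, the following minimal sketch uses Node's built-in crypto module. It is not the contents of genkey.mjs: the function name generateSigningKeyPair and the 2048-bit modulus are assumptions made for the example.

```javascript
const { generateKeyPairSync } = require('node:crypto');

// Hypothetical helper (not the actual genkey.mjs): generates an RSA key pair
// and returns both halves as PEM strings.
function generateSigningKeyPair(modulusLength = 2048) {
  return generateKeyPairSync('rsa', {
    modulusLength, // assumed key size, chosen for the example only
    publicKeyEncoding: { type: 'spki', format: 'pem' },
    privateKeyEncoding: { type: 'pkcs8', format: 'pem' }
  });
}

const { publicKey, privateKey } = generateSigningKeyPair();
// The public half is the kind of text a public key end-point could serve;
// the private half would be passed to the server process as configuration.
console.log(publicKey);
```

The private key PEM should never be logged or served; only the public half is intended for other federation members.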

Target system messages are created in plugins deployed in the server process – and so are trusted (for example, not to poke holes in the file system). Therefore we allow plugins to apply a Gateway signature to any message they produce.

However, in order to provide a consistent approach to signature content, we provide the plugin with a callback function to sign an HTTP message, signHttp (in packages/gateway/lib/Connector.mjs). This function hashes the provided message body with a multihash digest, using the http-digest-header npm library. The body digest, along with the HTTP method (usually POST), the request target (URL path & query), and the Content-Type header are then signed using the http-message-signatures npm library. This library not only creates a cryptographic signature, but also applies the requisite HTTP headers for signature verification (see below).
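To make the steps of such a callback concrete, here is a rough sketch that uses only Node's built-in crypto module instead of the http-digest-header and http-message-signatures libraries. The names signHttpSketch, contentDigest and signatureBase, and all parameter names, are hypothetical. The digest encoding and the rsa-v1_5-sha256 algorithm are inferred from the sample request in the verification section, so treat them as assumptions about the real signHttp.

```javascript
const crypto = require('node:crypto');

// Content-Digest value in the shape seen in the sample request: a SHA-256
// multihash (prefix bytes 0x12 0x20) in base64url multibase (prefix "u").
function contentDigest(body) {
  const sha256 = crypto.createHash('sha256').update(body).digest();
  const multihash = Buffer.concat([Buffer.from([0x12, 0x20]), sha256]);
  return `mh=u${multihash.toString('base64url')}`;
}

// Signature base: one line per covered component, then the parameters line.
function signatureBase(components, params) {
  const lines = components.map(([id, value]) => `"${id}": ${value}`);
  lines.push(`"@signature-params": ${params}`);
  return lines.join('\n');
}

// Hypothetical stand-in for signHttp: returns the headers to add to a request.
function signHttpSketch({ method, url, contentType, body, privateKey, keyid, created }) {
  const { pathname, search } = new URL(url);
  const digest = contentDigest(body);
  const components = [
    ['@method', method],
    ['@request-target', pathname + search],
    ['content-type', contentType],
    ['content-digest', digest]
  ];
  const covered = components.map(([id]) => `"${id}"`).join(' ');
  const params = `(${covered});created=${created};keyid="${keyid}";alg="rsa-v1_5-sha256"`;
  // For an RSA key, crypto.sign with 'sha256' produces an RSASSA-PKCS1-v1_5 signature
  const base = signatureBase(components, params);
  const signature = crypto.sign('sha256', Buffer.from(base), privateKey);
  return {
    'Content-Type': contentType,
    'Content-Digest': digest,
    'Signature-Input': `sig1=${params}`,
    'Signature': `sig1=:${signature.toString('base64')}:`
  };
}

// Usage with a throwaway key; a deployed server would load its configured key
const { privateKey } = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });
console.log(signHttpSketch({
  method: 'POST',
  url: 'http://localhost:56800/v1/worked-hours',
  contentType: 'application/json',
  body: '["2022-11-28T12:50:58.285Z","testing"]',
  privateKey,
  keyid: 'example-key', // hypothetical key identifier
  created: Math.floor(Date.now() / 1000)
}));
```

Because the body enters the signature only through its digest, a recipient has to check the digest against the received body as well as checking the signature itself.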

We applied the use of signHttp to the Prejournal connector, in packages/prejournal/index.mjs. The messages produced remain compatible with the current implementation of Prejournal. In a future implementation, Prejournal will be able to use the timeld Gateway public key to verify the signature.

verification

To verify the signatures created by the timeld Gateway, we used our existing mock Prejournal service already available for testing the timeld-prejournal connector (packages/prejournal/test/mockPrejournalService.mjs), to print out the HTTP POST request, for example:

POST /v1/worked-hours
X-State-ID: http://georges-imac.local/test/ts1?tick=16
Content-Type: application/json
Authorization: Basic <redacted>
Content-Digest: mh=uEiBsYW1Sz8JiBWleI99_YFMlLUycr_YqD3Kh9s9-uPbRvA
Signature: sig1=:L3jNN+HPIrd6sy7sF9X7x5iEPA+XNep7I1LOTG2zZ2LlC1Yvc+G6vcgICiD90WAmudGU+V+4/d315PvuMNzUwQDP+19OH5qV+eQ34G9KSRfFiPpJW85D5ZDhEpfG1C8Ppdm19V49Tp3Yr0W9FQfon2cgaqealOn/WCjTu/QRHTc=:
Signature-Input: sig1=("@method" "@request-target" "content-type" "content-digest");created=1669639858;keyid="Wb54CQ";alg="rsa-v1_5-sha256"
host: localhost:56800
Accept: */*
Content-Length: 128
User-Agent: node-fetch/1.0 (+https://github.com/bitinn/node-fetch)
Accept-Encoding: gzip,deflate
Connection: keep-alive

["2022-11-28T12:50:58.285Z","Federated Timesheets Virtual Organisation","test/ts1",0,"testing","http://georges-imac.local/test"]

It is possible to validate the signatures attached to this message using the HTTP Message Signatures validation playground at https://httpsig.org/. The above message was generated using a server having the following public key:

The signature is verified successfully.

Note that unit testing has also been included for the signHttp callback, but since the http-message-signatures library is not able to verify signatures, the test checks for a verbatim signature given a fixed input.
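For readers who want to check a signature programmatically, the following is a minimal sketch of recipient-side verification using Node's built-in crypto module. It is not part of timeld or Prejournal. The names verifyRequest, rebuildSignatureBase and contentDigest, and the shape of the request object (lower-case header names, a target string and a body string), are assumptions. It handles a single signature, supports only the four components covered in the sample above, and assumes the rsa-v1_5-sha256 algorithm without reading the alg parameter.

```javascript
const crypto = require('node:crypto');

// SHA-256 multihash (prefix 0x12 0x20) in base64url multibase (prefix "u"),
// the shape of the Content-Digest value in the sample request.
function contentDigest(body) {
  const sha256 = crypto.createHash('sha256').update(body).digest();
  const multihash = Buffer.concat([Buffer.from([0x12, 0x20]), sha256]);
  return `mh=u${multihash.toString('base64url')}`;
}

// Rebuilds the signature base from the Signature-Input header of a request.
function rebuildSignatureBase(request, label = 'sig1') {
  const input = request.headers['signature-input'] ?? '';
  if (!input.startsWith(`${label}=(`)) throw new Error(`No signature input for ${label}`);
  const params = input.slice(label.length + 1);
  // Covered component identifiers are the quoted names inside the parentheses
  const covered = params.slice(1, params.indexOf(')')).split(' ').map(id => id.slice(1, -1));
  const lines = covered.map(id => {
    const value = id === '@method' ? request.method
      : id === '@request-target' ? request.target
      : request.headers[id];
    if (value == null) throw new Error(`Missing covered component ${id}`);
    return `"${id}": ${value}`;
  });
  lines.push(`"@signature-params": ${params}`);
  return { covered, base: lines.join('\n') };
}

// Returns true only if the body matches its digest, the digest is covered by
// the signature, and the signature verifies against the given public key.
function verifyRequest(request, publicKeyPem, label = 'sig1') {
  const header = request.headers['signature'] ?? '';
  const match = new RegExp(`^${label}=:([A-Za-z0-9+/=]+):$`).exec(header);
  if (!match) return false;
  if (request.headers['content-digest'] !== contentDigest(request.body)) return false;
  const { covered, base } = rebuildSignatureBase(request, label);
  if (!covered.includes('content-digest')) return false;
  const signature = Buffer.from(match[1], 'base64');
  return crypto.verify('sha256', Buffer.from(base), publicKeyPem, signature);
}
```

A recipient would call verifyRequest with the parsed incoming request and the PEM text obtained from the sender's public key end-point. A production verifier would also need to check the created timestamp and the keyid against its own policy.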

The following entry is found in the Gateway log. Note that:

The HTTP request can be correlated with the AUDIT log entry via the X-State-ID header, which identifies the timesheet and current m-ld clone state tick.
It's possible to configure the Gateway to push AUDIT log entries to a separate auditing system.

AUDIT {"gateway":"georges-imac.local","account":"test","name":"ts1","update":{"@delete":[],"@insert":[{"@id":"clb0sghc30000vocygxyganv7/1","@type":"Entry","session":{"@id":"clb0sghc30000vocygxyganv7"},"activity":"testing","vf:provider":{"@id":"http://georges-imac.local/test"},"start":{"@value":"2022-11-28T12:50:58.285Z","@type":"http://www.w3.org/2001/XMLSchema#dateTime"}},{"@id":"clb0sghc30000vocygxyganv7","@type":"Session","start":{"@value":"2022-11-28T12:50:50.084Z","@type":"http://www.w3.org/2001/XMLSchema#dateTime"}}],"@principal":{"@id":"http://georges-imac.local/test"},"@ticks":16}}
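The correlation between a request and its AUDIT entry can be sketched as follows. The X-State-ID layout (gateway host, then /account/name, then a tick query parameter) is inferred from the sample above rather than from a specification, and the function names are hypothetical. The log line in the usage example is abbreviated from the entry shown.

```javascript
// Splits an X-State-ID value into the fields that also appear in an AUDIT entry.
// Assumed layout, inferred from the sample: http://<gateway>/<account>/<name>?tick=<n>
function parseStateId(stateId) {
  const url = new URL(stateId);
  const [account, name] = url.pathname.split('/').filter(Boolean);
  return {
    gateway: url.hostname,
    account,
    name,
    tick: Number(url.searchParams.get('tick'))
  };
}

// Parses one log line of the form `AUDIT {json}`; returns null for other lines.
function parseAuditLine(line) {
  if (!line.startsWith('AUDIT ')) return null;
  return JSON.parse(line.slice('AUDIT '.length));
}

// True if the audit entry describes the state identified by the header value.
function matchesAudit(stateId, entry) {
  const id = parseStateId(stateId);
  return entry.gateway === id.gateway &&
    entry.account === id.account &&
    entry.name === id.name &&
    entry.update['@ticks'] === id.tick;
}

// Usage with an abbreviated form of the log entry above
const line = 'AUDIT {"gateway":"georges-imac.local","account":"test","name":"ts1",' +
  '"update":{"@delete":[],"@insert":[],"@ticks":16}}';
const entry = parseAuditLine(line);
console.log(matchesAudit('http://georges-imac.local/test/ts1?tick=16', entry)); // true
```

An auditor could apply the same matching across a whole log to find the user operation behind a given signed message.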

analysis

In the earlier parts of this project we established a federation approach, as described in the project home readme. The trust model of this approach is, effectively:

Users trust the system they are using – to the same extent as if they were using it in isolation.
When viewing data that originated in another system, they are implicitly trusting the integrity of that other system and the connector between them (including any network calls).
Further, the data may have been forwarded through more than one system, each of which must be trusted to correctly re-represent the data while forwarding.

Note in particular that there is no consistent model for non-repudiation; which is to say, at present systems can deny that they did in fact send a particular message. Based on our insistence on system sovereignty, there is no recourse to any authority to refute such a denial.

The use of message signatures proposed and implemented in this milestone, according to the design option above, means that messages arriving from another system can be verified to have arrived unmodified and to have come from a specific system, identified by its public key. In addition, we require, as a condition of federation membership, that systems maintain an audit log which correlates messages to user actions, or to incoming messages. Systems therefore act as message 'translators', but without requiring an undue level of trust, because their actions can be independently audited.

Having conducted this research, we make the following observations and recommendations for future directions in refining the security model for federated bookkeeping.

key rotation: All federation systems should be able to rotate their keys on a regular basis. Since this requires changing both keys of a key pair, it incurs the possibility that a system may have changed its key prior to an audit, making it difficult to verify older messages. When using Public Key Infrastructure, this problem is solved by the reliance on a trusted authority with long-lived keys. A more decentralised scheme may be preferable for federated bookkeeping.
audit log retention: In a realistic system there must be well-defined rules for audit log retention, and these must be specified in the federation. The baseline for such rules can probably only be established from experience.
log federation: In our model, in order to reliably trace the routing history of a message it may be necessary for an auditor to traverse several systems. It would help if the logs could be inspected in a merged view. One approach would be to federate the logs themselves; but this would risk a recursion – how to validate that the federated log is itself correct?
identity management: We have deliberately made no attempt to federate user identities in our model. Doing so could improve the overall trust in the system, especially if user public keys were made available, so that it is not necessary to trust individual systems to correctly represent user actions in audit logs, as action signatures can be independently verified.

m-ld / timeld

Connector HTTP signing #96