MrMabulous / GitTrustedTimestamps

Trusted Timestamping for git repositories using RFC3161 and RFC5816 token
https://matthias-buehlmann.medium.com/git-as-cryptographically-tamperproof-file-archive-using-chained-rfc3161-timestamps-ad15836b883?source=friends_link&sk=fb180a11ab53a2c2d9b31bcf0febf2fc
GNU Affero General Public License v3.0

GitTrustedTimestamps

RFC3161 and RFC5816 Timestamping for git repositories.

By using this post-commit hook in a repository and thereby adding secure timestamps to the commits it contains, the repository gains the following properties:

How to use this software

These hooks were written for git 2.30 and newer (they may or may not work with older versions of git). If you do not use Git Bash, you need bash-4.3 or newer.

  1. (optional, but recommended) If you're creating a new repository, it is strongly recommended to use SHA256 hashes (git uses SHA1 by default at the time of writing) by initializing the repository using git init --object-format=sha256 (Note: If you want to use a public hosting server such as GitHub for your repository, you should check whether they already support SHA256 repositories). For more information, see https://git-scm.com/docs/hash-function-transition/
  2. Copy the four bash scripts in the hooks folder of this project into the .git/hooks folder of the project you want to timestamp.
  3. Configure the TSA url you want to use (in this example https://freetsa.org/tsr) using
    git config --local timestamping.tsa0.url https://freetsa.org/tsr
  4. You must declare that you trust this TSA by copying the root certificate of that TSA's trust chain into the .git/hooks/trustanchors folder (create it if it doesn't exist yet). The certificate MUST be in PEM format and the filename MUST be "subject_hash.0" where subject_hash is what openssl returns for the --subject_hash argument for x509 certificates (https://www.openssl.org/docs/man1.1.1/man1/x509.html).
    For your convenience, there is the hooks/trust.sh script which will do this for you. To use it, simply run .git/hooks/trust.sh https://freetsa.org/tsr from your Git-Bash and confirm adding the certificate (Note: The certificates stored in .git/hooks/trustanchors are ONLY used to validate timestamp tokens, no other trust is established).
  5. (optional) If you want to use multiple TSAs, just set additional urls for tsa1, tsa2 and so on. Make sure that they are all defined consecutively, for example, if you have a url defined for tsa0 and for tsa2, but not for tsa1, then tsa2 will be ignored. Since timestamp tokens will become forever invalid should a TSA's private key leak, using at least two trusted TSAs is a good strategy to protect against this unlikely eventuality (see chapter 4. "Security Considerations" of RFC3161 specification).
  6. (optional) By default, a commit will fail if a timestamp token cannot be retrieved. If you want to make timestamping optional for a certain tsa, you can set
    git config --local --type=bool timestamping.tsa0.optional true
    If optional is set to true and a timestamping token cannot be retrieved, you will receive a warning but the commit will be created nevertheless.
  7. (optional) You might want to add this README.md and the docs folder of this repository to your repository as well, so that documentation of the timestamps is guaranteed to be available if the timestamps should be evaluated many years in the future.
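Steps 4 and 5 can be sketched in shell as follows. This is an illustration only, not the code of the hooks themselves: the function names `install_trust_anchor` and `consecutive_tsa_urls` are hypothetical, and the second function assumes its input has the `key value` line format printed by `git config --get-regexp`.

```shell
# Illustrative sketch; the function names are hypothetical, not part of the hooks.

# Step 4: copy a PEM root certificate into the trust anchor folder under the
# file name "<subject_hash>.0".
install_trust_anchor() (
  pem_file=$1
  anchor_dir=${2:-.git/hooks/trustanchors}
  hash=$(openssl x509 -in "$pem_file" -noout -subject_hash) || exit 1
  mkdir -p "$anchor_dir" && cp "$pem_file" "$anchor_dir/$hash.0"
)

# Step 5: read "timestamping.tsaN.url <url>" lines on stdin and print the URLs
# of tsa0, tsa1, ... in order, stopping at the first missing index.
consecutive_tsa_urls() (
  config=$(cat)
  i=0
  while :; do
    url=$(printf '%s\n' "$config" |
      awk -v k="timestamping.tsa$i.url" '$1 == k { print $2; exit }')
    [ -n "$url" ] || break
    printf '%s\n' "$url"
    i=$((i + 1))
  done
)

# Possible use inside a configured repository:
#   git config --local --get-regexp '^timestamping\.tsa[0-9]+\.url$' | consecutive_tsa_urls
```

With URLs configured for tsa0 and tsa2 only, the second function prints just the tsa0 URL, which mirrors the rule that tsa2 is ignored when tsa1 is missing.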

From now on, every git commit will automatically trigger an additional commit that securely timestamps it.

Implementation design

The design goals of this implementation are

A further goal was to leverage the inherent Merkle-Tree based design of git in order to create a tamperproof repository archive where no history can be rewritten without being noticed.
By embedding the timestamps in the commit history, they form a hash-chain and thus new timestamps will cryptographically seal older ones and thereby additionally protect them from some forms of future invalidation.
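
The sealing effect of a hash chain can be illustrated with a toy example. This is not how git or the hooks compute their hashes: the helpers `chain_hash` and `chain_tip` and the entry strings are made up for illustration, and `sha1sum` from coreutils is assumed to be available.

```shell
# Toy hash chain: each link hashes the previous link together with new data.
chain_hash() {
  # $1 = hash of the previous link, $2 = payload of the new link
  printf '%s\n%s' "$1" "$2" | sha1sum | cut -d' ' -f1
}

# Print the final link of a chain built from the given payloads.
chain_tip() (
  tip=genesis
  for payload in "$@"; do
    tip=$(chain_hash "$tip" "$payload")
  done
  printf '%s\n' "$tip"
)

original=$(chain_tip "commit A" "timestamp A" "commit B" "timestamp B")
tampered=$(chain_tip "commit A (edited)" "timestamp A" "commit B" "timestamp B")

if [ "$original" != "$tampered" ]; then
  echo "tampering with an early entry changes the final hash"
fi
```

Because every link depends on its predecessor, editing the first entry changes every later hash, so a newer timestamp implicitly covers all older entries.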

Merkle-Tree layout

The design leverages git's Merkle-Tree layout and embeds the timestamps in the commit history, making them form a hash-chain that prevents later changes without being noticed:

Merkle-Tree

Or as a further simplified schematic:

Simplified Merkle-Tree

What are RFC3161 and RFC5816 Timestamps

RFC3161 (https://tools.ietf.org/html/rfc3161) and its extension RFC5816 (https://tools.ietf.org/html/rfc5816) are protocol specifications to timestamp data using cryptographically secure tokens issued by an external, trusted third party TSA (Time Stamping Authority). By timestamping data this way, it is possible to prove to anyone who trusts this TSA service that the data existed already at the time of timestamping and has not been tampered with ever since. Only a secure hash of the data, without any identification, is being sent to the TSA service, so the data itself remains secret.
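
For orientation, an RFC3161 exchange can be reproduced by hand with `openssl ts` and `curl`. The sketch below is not taken from the hooks: the function names and file names are hypothetical, and the TSA URL is the one used as an example elsewhere in this README.

```shell
# Build a timestamp query for a hex digest. Only the digest is placed in the
# request, never the data itself.
make_ts_query() {
  # $1 = hex-encoded SHA-256 digest, $2 = output file (.tsq)
  openssl ts -query -digest "$1" -sha256 -cert -out "$2"
}

# Send the query to a TSA and store the response (.tsr).
request_ts_token() {
  # $1 = TSA URL, $2 = query file, $3 = response file
  curl --silent --show-error --fail \
    --header 'Content-Type: application/timestamp-query' \
    --data-binary "@$2" --output "$3" "$1"
}

# Print a human-readable dump of a response.
show_ts_response() {
  # $1 = response file
  openssl ts -reply -in "$1" -text
}

# Example (requires network access):
#   digest=$(printf 'hello' | sha256sum | cut -d' ' -f1)
#   make_ts_query "$digest" request.tsq
#   request_ts_token https://freetsa.org/tsr request.tsq response.tsr
#   show_ts_response response.tsr
```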

Alternatives

Before writing this software, I evaluated alternatives available at the time of writing (Feb 2021). I will briefly list and discuss my findings here to outline the differences.

How are timestamps added to commits

For each commit that is being timestamped, an additional timestamp commit is created, for which the commit that is being timestamped is the direct parent. The digest hash that is contained in the timestamp token is a hash derived from the commit hash of that parent commit as well as the tree hash of this timestamp commit. Since git itself is implemented as a Merkle Tree, this hash hence depends on every bit of every committed file as well as the entire commit history, making it impossible to change anything without invalidating the timestamp. The timestamping token (one for each TSA from which a timestamp was retrieved) is then added in PEM encoding (plus some info about the token in readable form) as a trailer to the commit message of the timestamp commit. Choosing this design to add the timestamps has several advantages:

An alternative design that was considered but dismissed was to include the timestamps right into the commit message of the commit that is being timestamped (in order to keep the commit history more tidy), in a similar fashion as PGP signatures are added. PGP signatures do this by calculating the commit hash AS-IF the signature was not contained, then sign this hash and then add the signature into the commit header (thereby changing the hash). A similar approach could have been taken with the timestamps, but this would have two serious drawbacks:

  1. Since PGP signatures are inserted natively AFTER the commit is generated, the timestamp token could therefore not timestamp the signature (instead, the signature would sign the timestamp, which is not useful).
  2. Validation code would need to compute a commit hash AS-IF the timestamp-token was not contained, but also AS-IF any insertions happening AFTER timestamping were not present (at the time of writing, that's only PGP-Signatures. However, since future versions of git may include other additional headers in a similar fashion, this would break current timestamp validation code).

For these reasons, adding timestamps right into the commit that is being timestamped was dismissed, for separate timestamp commits are much more likely to be forward compatible with anything git will add to the commit object in the future.

In addition to retrieving TSA tokens and timestamping the commits with them, the post-commit hook also validates these tokens first to ensure that only valid, trusted timestamp tokens are added. It does so by validating that:

If a token does not pass these tests, it is not added and the commit is either aborted (if the TSA is not set to optional) or a warning is output (if the TSA is set to optional).

This repository uses the post-commit hook itself, so if you check the commit history of this repository, you will see that each commit is followed by a -----TIMESTAMP COMMIT----- that contains one or more timestamp tokens.
For example, this timestamp commit timestamps this regular commit, which is its direct parent. You can see that the "Digest" that is timestamped by the token is d8f311290fb12fbd87c50014b75809150d125272=sha1(version:1,parent:80034aeb7857a910f06429c3580635b4afa40cc0,tree:50c7905ec6a07547e5bdeab38a810c2dc1c5ae44), where the tree hash and parent hash in the preimage correspond to the parent and tree of this timestamp commit. Since GitHub did not yet support sha256 hashes at the time this repository was created, the hashing algorithm used is sha1 (for a repository initialized with git init --object-format=sha256 the hashing algorithm will be sha256, or another algorithm once git adds support for further ones).
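
Judging from the example above, the digest appears to be the hash of a short text preimage naming the parent and tree hashes. The sketch below recomputes such a digest. It assumes the preimage is exactly the string shown, with no trailing newline, which this README does not spell out; the function name `timestamp_digest` is hypothetical.

```shell
# Assumption: the preimage is the literal string
# "version:1,parent:<hash>,tree:<hash>" without a trailing newline.
timestamp_digest() {
  # $1 = parent commit hash, $2 = tree hash of the timestamp commit
  printf 'version:1,parent:%s,tree:%s' "$1" "$2" | sha1sum | cut -d' ' -f1
}

# Values from the example above:
timestamp_digest 80034aeb7857a910f06429c3580635b4afa40cc0 \
  50c7905ec6a07547e5bdeab38a810c2dc1c5ae44

# Inside a repository, the two inputs for a timestamp commit could be read with:
#   git rev-parse '<timestamp-commit>^' '<timestamp-commit>^{tree}'
```

If the assumption about the preimage holds, the call should reproduce the digest quoted in the example. A repository created with --object-format=sha256 would use `sha256sum` instead.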

For how long are timestamps valid?

Generally, an RFC3161 timestamp can be trusted for as long as

To be sure of the first point, one must know the current revocation status of the signing certificate. This in turn means that once new CRLs for the certificate that signed the timestamp are not issued anymore, the timestamp should not be trusted anymore, since timestamps could be forged if the private key of the signing certificate leaked.
However, if one has a copy of a historic CRL that shows the signature certificate to be valid, and which is timestamped by a timestamp that still can be trusted, then also the compromised timestamp can still be trusted.
GitTrustedTimestamps enables this extension of the lifetime of timestamps by storing Long Term Validation data for previous timestamps with each new timestamp added.

LTV - Long Term Validation:

In addition to the bare timestamp tokens stored in the commit message as trailers, the timestamp commit also adds revisioned files to the .timestampltv folder. If the timestamps should be evaluated many years in the future, when the TSA does not exist anymore and, for example, the entire certificate chains of the tokens can't be retrieved anymore, this Long Term Validation data will help to prove the validity of the tokens. For each timestamp in a timestamp commit, two types of files are stored:

  1. .timestampltv/certs/issuer_hash.cer: This file contains the entire trust chain of the TSA certificate in PEM format (the first certificate being the TSA's signing certificate and the last being the self-signed root). In most cases this file will not change for subsequent timestamp tokens, so no additional data is added to the repository (the file content only changes when the TSA changes its signing certificate).
  2. .timestampltv/crls/issuer_hash.crl: This file contains CRL responses in PEM format for all certificates in the trust chains of the new timestamp tokens at the time of timestamping. In most cases, this file will not change for subsequent timestamp tokens (or only a few lines plus the signature changes), so no or very little additional data is added to the repository (the file content only changes when the CRL lists referenced by certificates in the trust chain change).

In addition to the LTV data of the newly added tokens, the current CRL data of the timestamps in the LAST timestamp commit is also added (in most cases, these will be the same as the CRLs for the new timestamps if the same TSAs are still used). This allows the validity of timestamps to be extended arbitrarily into the future, so long as new timestamp commits are added to the repository while the CA still provides CRL data for the most recently added timestamps (which they are often required to do for many years).

The issuer_hash for these files corresponds to the ESSCertID or ESSCertIDv2 hash using which the token identifies its issuer certificate. In general this is the SHA1 hash of the DER encoded issuer certificate for RFC3161 tokens, and some other hash of the DER encoded issuer certificate for RFC5816 tokens (the ESSCertIDv2 of the token specifies the used hashing algorithm).
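
As a rough illustration of how such an identifier could be derived, the sketch below hashes the DER encoding of a certificate. The function name `cert_id_hash` is hypothetical, and whether the hooks format the file name exactly this way (lowercase hex) is an assumption.

```shell
# Hash the DER encoding of a PEM certificate.
cert_id_hash() {
  # $1 = certificate in PEM format
  # $2 = digest name understood by openssl dgst (default: sha1, as in ESSCertID)
  openssl x509 -in "$1" -outform DER |
    openssl dgst "-${2:-sha1}" -r |
    cut -d' ' -f1
}

# Example:
#   cert_id_hash tsa.pem          # ESSCertID-style SHA1 hash
#   cert_id_hash tsa.pem sha256   # one possible ESSCertIDv2 algorithm
```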

How to validate timestamps:

Ultimately the responsibility (and criteria) for validating the timestamps lies with the party who wishes to prove or disprove that they are valid, and with the policy being used.
However, this repository comes with an implementation that validates the created timestamps according to the criteria listed below. To use it, simply run .git/hooks/validate.sh

validate.sh will iterate over the entire commit history of the current branch and for each timestamp commit will:

To perform these checks, the same trusted rootCAs from the .git/hooks/trustanchors folder are used. No other checks are performed. In particular, a timestamp-token is considered valid beyond the expiration date of its signing certificate if it hasn't been revoked due to a reason other than those specified in chapter 4.1 of the RFC3161 specification.
The current implementation of the validate.sh script also does not consider whether the hashing algorithms used in the timestamp-token or the key length of the signing certificate are still considered secure, as this depends on what the validating party considers secure.
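
A single token can also be checked by hand with `openssl ts -verify`, using the same trust anchor folder. This is a sketch of one such check and not a description of what validate.sh does internally. The function name and file names are hypothetical, and the token is assumed to have been saved to a file as a DER-encoded timestamp response (for a bare token, `-token_in` would have to be added).

```shell
# Verify that a timestamp response covers the given digest and chains up to a
# certificate in the trust anchor folder.
verify_ts_response() {
  # $1 = hex digest that was timestamped
  # $2 = timestamp response file (DER)
  # $3 = trust anchor folder (defaults to the one used by the hooks)
  openssl ts -verify -digest "$1" -in "$2" -CApath "${3:-.git/hooks/trustanchors}"
}

# Example:
#   verify_ts_response d8f311290fb12fbd87c50014b75809150d125272 response.tsr
```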

The script exits with exit code 0 if all tests passed, and with exit code 1 otherwise.
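
Because the result is reported through the exit code, the script is easy to call from other tooling. A minimal sketch, where `check_timestamps` is a hypothetical wrapper and the default path is the one given above:

```shell
check_timestamps() {
  # $1 = path to the validation script (defaults to the location used above)
  if "${1:-.git/hooks/validate.sh}"; then
    echo "all timestamps valid"
  else
    echo "timestamp validation failed" >&2
    return 1
  fi
}

# Example, e.g. in a CI job:
#   check_timestamps || exit 1
```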

Author