in-toto / specification

Specification and other related documents.
https://in-toto.io
MIT License
38 stars 25 forks source link

JSON parsing differences can make the same signature verify for different payloads #74

Open lukpueh opened 1 year ago

lukpueh commented 1 year ago

[from X41 specification and source code audit]

in-toto uses JSON for many of the security-relevant parts, which poses a risk due to the relatively lax definition of JSON and the factual differences of JSON parsers and serializers.

The Go implementation ignores the case of key names when parsing, but does not accept extraneous keys whereas the Python implementation is case sensitive, but ignores extraneous keys.

In either case, the JSON document can be tampered with while the cryptographic signature is still considered valid.

When the contents are the same, for instance if the uppercased "Run" key also contains a bare whoami command, then the document is currently accepted as valid in both languages. This means that the following JSON documents are all considered valid without re-signing in both languages:

$ cat variant1.json
{
  "run": ["whoami"]
}

$ cat variant2.json
{
  "run": ["whoami"],
  "Run": ["whoami"]
}

$ cat variant3.json
{
  "Run": ["whoami"],
  "ru\u006E": ["whoami"]
}

Each of the three variants verifies successfully using an identical signature in both the Python and Go implementations of in-toto. The Go implementation produces errors when there are unexpected (extraneous) keys, but since it considers differently-cased variants to be identical, the alterations are accepted. The Python implementation does not error out when an extraneous key is added, such as a differently-cased one.

One might imagine a scenario where in-toto verifies the signature of a file and then passes that file to a different program, which then parses different data (e.g., a command to execute) than the one verified by in-toto before.

"signed": {
  "_type": "layout",
  "expires": "2023-03-15T09:18:03Z",
  "inspect":
  [
    {
      "run": [
        "whoami"
      ],
      "_type": "inspection",
      "expected_materials": [],
      "expected_products": [],
      "name": "runcmd",
      "Run": [
        "rm",
        "/"
      ]
    }
  ],
  "keys": {},
  "readme": "",
  "steps": []
}

The payload shown in the snippet above would execute whoami in Python but rm / in Go because the case insensitive handling of keys in Go results in it overwriting the formerly benign command. Both languages have a policy of considering the last key with the same name as the valid one, but Go ignores case differences. In another attack scenario, the signatory might be tricked into accepting the layout file contents because they look good, while in reality a secondary key located further down in the document specifies what is truly being executed, as also demonstrated in the snippet above. The overwriting can be made less obvious by adding more information in between, obfuscating the "Run" key such as by using hex encoding, and obfuscating the malicious command's contents.

ITE-5 proposes the use the Dead Simple Signing Envelope (DSSE), where data is serialized before it is signed. This allows to verify the signature before deserialization and prevents modifications to the serialized data after signing. However, DSSE uses JSON as a container format and is thus vulnerable to some extent when parsing the container – A DSSE file might contain two pairs of payloads and signatures that are read inconsistently depending on the implementation. If the DSSE payload also consists of serialized JSON (as currently proposed), DSSE would not prevent the second attack scenario described where an attacker tricks the signatory.

Solution Advice

To minimize the attack vector, X41 recommends to use a serialization format that is more strictly defined, has less implementation differences and relies on the data structure to be defined on the serializing and deserializing ends rather than within the serialization. Alternatively, X41 recommends to implement stricter JSON document verification in both languages in addition to DSSE, such as rejecting documents with duplicate keys, ill-cased keys, and extraneous keys. However, it is important to point out that such safety measures are outside the usual handling of JSON documents and may not be followed by (or even be reasonably available to) all implementers of the in-toto specification.

adityasaky commented 1 year ago

Thinking about it some more, I don't know if the spec is the right place for this issue. We want to make in-toto serialization format agnostic, I think, like TUF? Maybe https://github.com/in-toto/docs/blob/master/in-toto-spec.md#41-metaformat needs to be updated as well.