secure-systems-lab / securesystemslib

Cryptographic and general-purpose routines for Secure Systems Lab projects at NYU
MIT License

Update JSON canonicalization (backwards incompatible) #159

Closed by lukpueh 1 year ago

lukpueh commented 6 years ago

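Description of issue or feature request: securesystemslib provides a custom JSON canonicalization function (https://github.com/secure-systems-lab/securesystemslib/blob/master/securesystemslib/formats.py#L752) based on the OLPC Canonical JSON specification (http://wiki.laptop.org/go/Canonical_JSON).

The specification seems outdated, or at least is not compatible with newer and more detailed specifications, such as gibson042/canonicaljson-spec (https://gibson042.github.io/canonicaljson-spec/), for which a Go implementation (https://github.com/gibson042/canonicaljson-go) exists.

The Notary Go implementation of TUF uses (https://github.com/theupdateframework/notary/blob/master/tuf/signed/verify.go#L9) its own canonical JSON implementation (https://github.com/docker/go/tree/master/canonical/json), which (IIUC) does not conform to either of the two specifications above but looks similar to the latter.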

Current behavior: securesystemslib uses an outdated JSON canonicalization specification.

Expected behavior: I wonder if we should update securesystemslib's JSON canonicalization, or, given that there is no single accepted specification, switch to something that has wider cross-language support?

Note: I am well aware that this is a bigger request, as it would break backwards compatibility for metadata signatures used in TUF and in-toto, and would therefore require a TAP or ITE.
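To make one incompatibility concrete, here is a minimal sketch of OLPC-style encoding for a small subset of JSON (strings with string keys, integers, booleans, null, lists, dicts). The function name `olpc_canonical` is hypothetical and this is not securesystemslib's actual code; it only mirrors the OLPC rule that strings escape nothing but `\` and `"`. A raw newline therefore ends up inside the string literal, and a strict JSON parser rejects the output.

```python
import json


def olpc_canonical(obj):
    """Hypothetical sketch of OLPC-style canonical JSON (not securesystemslib's code)."""
    if isinstance(obj, str):
        # OLPC rule: only backslash and double quote are escaped.
        return '"' + obj.replace("\\", "\\\\").replace('"', '\\"') + '"'
    # Check booleans before integers, because bool is a subclass of int.
    if obj is True:
        return "true"
    if obj is False:
        return "false"
    if obj is None:
        return "null"
    if isinstance(obj, int):
        return str(obj)
    if isinstance(obj, (list, tuple)):
        return "[" + ",".join(olpc_canonical(item) for item in obj) + "]"
    if isinstance(obj, dict):
        # Assumes string keys; members are sorted by key, no whitespace is emitted.
        members = (
            olpc_canonical(key) + ":" + olpc_canonical(value)
            for key, value in sorted(obj.items())
        )
        return "{" + ",".join(members) + "}"
    raise TypeError("unsupported type: %r" % type(obj))


encoded = olpc_canonical({"b": 1, "a": "line1\nline2"})

# The newline is emitted unescaped, so the default (strict) parser refuses it.
try:
    json.loads(encoded)
    parses_as_json = True
except json.JSONDecodeError:
    parses_as_json = False
```

Here `parses_as_json` ends up `False`: the canonical form is fine as input to a hash function, but it is not necessarily valid JSON, which is one reason implementations in other languages are hard to align.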

lukpueh commented 6 years ago

Here is a shell snippet that tests securesystemslib's encode_canonical function against the gibson042/canonicaljson-spec.

mkvirtualenv secsyslib-canonical # Optional

pip install securesystemslib
git clone https://github.com/gibson042/canonicaljson-spec.git
cd canonicaljson-spec

cat >secsyslib_canonical_json.py <<EOL
#!/usr/bin/env python
import sys, json, securesystemslib.formats

# Load JSON file passed as first argument by test.sh
with open(sys.argv[1]) as fobj:
  json_data = json.load(fobj)

json_canonical = securesystemslib.formats.encode_canonical(json_data)

# Write canonicalized json to stdout from where it is read by test.sh
sys.stdout.write(json_canonical)
EOL
chmod 755 secsyslib_canonical_json.py

PATH=$PATH:${PWD} ./test.sh secsyslib_canonical_json.py

JustinCappos commented 6 years ago

Note, I don't think this requires a TAP or ITE, assuming it is just changing the implementation's canonicalization format. If you wanted to mandate a specific format that all implementers must use, then this would require a TAP, ITE, etc.

lukpueh commented 6 years ago

Not sure about the exact requirements for TAP/ITE. However, the currently used canonical json format is also mandated by our specs (see in-toto: 4.1 Metaformat and TUF: 4.1 Metaformat).

lukpueh commented 6 years ago

@JustinCappos I apologize for misreading the specs.

You were right, the in-toto specification does not mandate the OLPC canonicalization specification. As a matter of fact, it explicitly states that implementers are not required to use JSON (https://github.com/in-toto/docs/blame/master/in-toto-spec.md#L459-L465):

To provide descriptive examples, we will adopt "canonical JSON," as described in http://wiki.laptop.org/go/Canonical_JSON, as the data format. However, applications that desire to implement in-toto are not required to use JSON.

The TUF specification, however, leaves room for interpretation as to whether it is mandated (https://github.com/theupdateframework/specification/blame/master/tuf-spec.md#L457-L460):

All documents use a subset of the JSON object format, with floating-point numbers omitted. When calculating the digest of an object, we use the "canonical JSON" subdialect as described at http://wiki.laptop.org/go/Canonical_JSON
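Why the digest needs a canonical form at all can be sketched with the standard library alone. The snippet below uses `json.dumps(..., sort_keys=True, separators=(",", ":"))` as a stand-in for a canonical encoder (an assumption for illustration; it does not match the OLPC format in every detail), and the helper name `canonical_digest` is hypothetical. It shows that two serializations of the same document hash differently until both are canonicalized.

```python
import hashlib
import json

# Two serializations of the same logical document.
doc_a = '{"_type": "root", "version": 1}'
doc_b = '{"version":1,"_type":"root"}'


def canonical_digest(text):
    """Hypothetical helper: parse, re-serialize deterministically, then hash."""
    data = json.loads(text)
    # Stand-in canonical form: sorted keys, no insignificant whitespace.
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hashing the raw bytes depends on key order and whitespace.
raw_equal = (
    hashlib.sha256(doc_a.encode("utf-8")).hexdigest()
    == hashlib.sha256(doc_b.encode("utf-8")).hexdigest()
)

# Hashing the canonical form does not.
canonical_equal = canonical_digest(doc_a) == canonical_digest(doc_b)
```

Signer and verifier only agree if they use the same canonical form, which is why the question of whether the spec mandates one particular form matters for interoperability.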

JustinCappos commented 6 years ago

Okay, thanks for revisiting this and clarifying.

Regardless, we should rule on whether the wireline format is meant to be covered by the spec. Marina is drafting a TAP that will help to clarify all of this, likely by saying there will be separate "profiles" (wireline formats) that may also be standardized for each spec.

lukpueh commented 6 years ago

FWIW: @vladimir-v-diaz and @heartsucker had a lively discussion about this a year ago. The corresponding ticket -- https://github.com/theupdateframework/tuf/issues/457 -- is still open.

joshuagl commented 4 years ago

This might be a good topic for the next TUF community meeting.

Are TUF users/implementers happy with Canonical JSON? If so, do we want to try and rally around a Canonical JSON specification? If not, do we want to try and switch to something else with wider cross-language support?

heartsucker commented 4 years ago

See https://github.com/theupdateframework/tuf/issues/457 for some of the reasons why it shouldn't use CJSON.

joshuagl commented 4 years ago

I sent a mail to the TUF discussion group seeking input from authors and implementers on this topic: https://groups.google.com/d/msg/theupdateframework/xuT5wDA8kh8/Mb0N1umxAAAJ

mnm678 commented 4 years ago

The TUF specification no longer requires canonical JSON as of theupdateframework/specification#102.

lukpueh commented 1 year ago

Rather than switching to another canonicalization spec, we should encourage use of DSSE instead. Closing here.
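For context, DSSE sidesteps canonicalization by signing the exact serialized payload bytes, wrapped in its pre-authentication encoding (PAE). A minimal sketch of PAE as I understand it from the DSSE protocol description; the function name `pae` is mine, and this is not securesystemslib's implementation:

```python
def pae(payload_type: str, payload: bytes) -> bytes:
    """Sketch of DSSE pre-authentication encoding (illustrative, not library code).

    Layout: "DSSEv1" SP LEN(type) SP type SP LEN(payload) SP payload,
    where LEN is the byte length written as ASCII decimal.
    """
    type_bytes = payload_type.encode("utf-8")
    return b" ".join(
        [
            b"DSSEv1",
            str(len(type_bytes)).encode("ascii"),
            type_bytes,
            str(len(payload)).encode("ascii"),
            payload,
        ]
    )


# The signature is computed over these bytes; the payload is never re-serialized.
to_sign = pae("http://example.com/HelloWorld", b"hello world")
```

Because the verifier checks the signature over the bytes it received and only parses the payload afterwards, no two parties ever need to agree on a canonical JSON form.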