oauth-wg / oauth-selective-disclosure-jwt

https://datatracker.ietf.org/doc/draft-ietf-oauth-selective-disclosure-jwt/

Support for SD in arrays #194

Closed danielfett closed 1 year ago

danielfett commented 1 year ago

We currently do not support selective disclosure for elements in an array (we do support it for keys within objects within arrays, but not for arbitrary array elements).

Example:

The key foo in {"foo": [23, "bar", false]} can be SD, but we can't apply SD to the array elements, i.e., we cannot disclose 23 without disclosing "bar".

Proposal:

For object keys, we use the new key _sd to collect the disclosure hashes. For arrays, we would need to mark the array as selectively disclosable, e.g., by using a special first element in the array like this:

{"foo": ["_sd", "7pHe1uQ5uSClgAxXdG0E6dKnBgXcxEO1zvoQO9E5Lr4", "9-VdSnvRTZNDo-4Bxcp3X-V9VtLOCRUkR6oLWZQl81I", "nTzPZ3Q68z1Ko_9ao9LK0mSYXY5gY6UG6KEkQ_BdqU0"]}

The three hashes correspond to the three elements in the array. The verifier can resolve the disclosed elements into their original values and ignore/replace the other values. Order is preserved. This works for arbitrary element types.
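As a rough illustration of the marker-element idea, a verifier could resolve such an array as sketched below. The function name and the `disclosed` map (digest to original value, assumed to have been built from already-verified disclosures) are hypothetical and not part of any proposal text.

```python
SD_MARKER = "_sd"  # special first element, as proposed above


def resolve_marked_array(arr, disclosed):
    """Return the disclosed elements of an array in their original order.

    `disclosed` is a hypothetical map from digest to original value that the
    verifier has built from the disclosures it received.
    """
    if not arr or arr[0] != SD_MARKER:
        return list(arr)  # ordinary array, nothing to resolve
    # Undisclosed digests are simply skipped; order of the rest is preserved.
    return [disclosed[d] for d in arr[1:] if d in disclosed]


payload = {"foo": ["_sd", "digest-23", "digest-bar", "digest-false"]}
known = {"digest-23": 23, "digest-false": False}
print(resolve_marked_array(payload["foo"], known))  # [23, False]
```

Note that in this sketch an ordinary array whose first element happens to be the string "_sd" would be misread as selectively disclosable.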

This would not allow mixing SD and non-SD elements in an array, but it seems unlikely that this would be a problem. WDYT @bc-pi @Sakurann?

bc-pi commented 1 year ago

IMHO the need to allow for selective disclosure of individual elements in an array doesn't warrant the additional complexity. Which, by itself isn't particularly complicated. But it is different than the general model we've got now that covers a name and value (of any type) in the same way at any level. The model is consistent and conceptually simple. I might go so far as to say it's elegant. Is there a real concrete requirement or use case for this that is likely to be commonly occurring? And couldn't be achieved modeling the data a little differently, when/if necessary.

danielfett commented 1 year ago

There's a concrete use case in the eKYC syntax where the evidence element may contain multiple descriptions of documents etc., not all of which are necessary to disclose for all use cases. With the current syntax, either all documents need to be revealed or at least the structure of undisclosed documents is visible to a verifier. This is not ideal.

@tlodderstedt also had another use case where, for example, multiple types of holder binding are encoded into the SD-JWT and only one needs to be disclosed for using the credential (see https://github.com/oauth-wg/oauth-selective-disclosure-jwt/pull/193/files#diff-573098781f9a66e1e4eb42edf9799c0ea4dc69fed1db6b72805aec27563eafe7).

bc-pi commented 1 year ago

I must admit that I don't find those cases terribly compelling. But point taken nonetheless.

@tlodderstedt also had another use case where, for example, multiple types of holder binding are encoded into the SD-JWT and only one needs to be disclosed for using the credential (see https://github.com/oauth-wg/oauth-selective-disclosure-jwt/pull/193/files#diff-573098781f9a66e1e4eb42edf9799c0ea4dc69fed1db6b72805aec27563eafe7).

Is that holder binding syntax actually defined (or aspiring to be) somewhere? I couldn't find it looking through the VC WG publications and google/bing search for ClaimsBindinding2022 gives zero results.

bc-pi commented 1 year ago

The content of disclosures (https://datatracker.ietf.org/doc/html/draft-ietf-oauth-selective-disclosure-jwt-02#name-creating-disclosures) would need to conditionally omit the name when the disclosure is for a value in an array.

We might also want to think about a prefix on the value vs. a special first element in the array, e.g., {"foo": ["_sd:7pHe1uQ5uSClgAxXdG0E6dKnBgXcxEO1zvoQO9E5Lr4", "_sd:9-VdSnvRTZNDo-4Bxcp3X-V9VtLOCRUkR6oLWZQl81I", "_sd:nTzPZ3Q68z1Ko_9ao9LK0mSYXY5gY6UG6KEkQ_BdqU0"]}. I dunno, it's a little bikesheddy but would allow mixing of SD and non-SD elements in an array.

With that said, I still really like the consistency and simplicity of the current model and don't love the idea of expanding it.

danielfett commented 1 year ago

The prefix solution has the advantage that the number of elements in the array is unchanged. For the other questions, it would be great if @tlodderstedt could chime in :-)

tlodderstedt commented 1 year ago

The most obvious example is the EBSI schema for diplomas.

https://ec.europa.eu/digital-building-blocks/code/projects/EBSI/repos/json-schema/browse/schemas/ebsi-muti-uni-pilot/verifiable-diploma/2022-11/examples/Ildiko%20Mazar_Graduate%20University%20Study%20of%20Civil%20Engineering_shortened.json

When I tried to transform this into an SD-JWT, I learned it uses arrays on various levels for the different objects (e.g. achievements). I would assume it is desirable to be able to selectively disclose those.

I have also worked on an example of an eID (PID in eIDAS terms) VC that has different kinds of holder binding. The syntax is inspired by a proposal Oliver Terbu and Paul Bastian presented at IIW.

{
  "iss": "https://pid_issuer.memberstate.eu",
  "iat": 1541493724,
  "type": "PersonIdentificationData",
  "holder": {
    "binding": [
      {
        "type": "CryptographicBinding2022",
        "did": "did:example:1386147674571545"
      },
      {
        "type": "BiometricBinding2022",
        "template": "..."
      }
    ]
  },
  "credentialSubject": {
    "given_name": "Erika",
    "family_name": "Mustermann",
    "nationalities": ["DE"],
    "birth_family_name": "Schmidt",
    "birthdate": "1973-01-01",
    "place_of_birth": "Regensburg",
    "address": {
      "postal_code": "12345",
      "locality": "Irgendwo",
      "street_address": "Sonnenstrasse 23",
      "country_code": "DE"
    },
    "is_over_18": true,
    "is_over_21": true,
    "is_over_65": false
  }
}

Both entries define a certain binding (one to a cryptographically bound identifier, the other one has a biometric template). Only one of them is relevant for a certain verifier (online - crypto, offline/supervised - biometry). So I would like to disclose them individually.

danielfett commented 1 year ago

Another option would be to replace the array elements by something like this: {"_sd": "9-VdSnvRTZNDo-4Bxcp3X-V9VtLOCRUkR6oLWZQl81I"} (note that the value here is not an array). Also just bikeshedding here.

Sakurann commented 1 year ago

This claim set

"claims": {
  "given_name": "Max",
  "family_name": "Müller",
  "nationalities": [
    "DE",
    "JPN"
  ]
}

would be

"claims": {
  "_sd": [
    "1qb26tNg6OZuZyVDYwK4--mQxXbZqwcQbhUxGHrXeLM",
    "AHX0EgNpd_wak07lK8HX2izDNntsUZojuzyEWd2GJdk",
    "FwzTz0THaEOzexgEzLRXu-zsTPND7by3aBF57AwKCZI"
  ],
  "nationalities": [
    "_sd",
    "7pHe1uQ5uSClgAxXdG0E6dKnBgXcxEO1zvoQO9E5Lr4",
    "9-VdSnvRTZNDo-4Bxcp3X-V9VtLOCRUkR6oLWZQl81I"
  ]
}

or

"claims": {
  "_sd": [
    "1qb26tNg6OZuZyVDYwK4--mQxXbZqwcQbhUxGHrXeLM",
    "AHX0EgNpd_wak07lK8HX2izDNntsUZojuzyEWd2GJdk",
    "FwzTz0THaEOzexgEzLRXu-zsTPND7by3aBF57AwKCZI"
  ],
  "nationalities": [
    "_sd:7pHe1uQ5uSClgAxXdG0E6dKnBgXcxEO1zvoQO9E5Lr4",
    "_sd:9-VdSnvRTZNDo-4Bxcp3X-V9VtLOCRUkR6oLWZQl81I"
  ]
}
TakahikoKawasaki commented 1 year ago

As I'm a newbie in SD-JWT discussion, I don't know whether this issue has already reached an agreed solution or not. But if not yet, another approach that may be worth considering is to include an array index in the claim name when a disclosure is prepared. For example, ...

Plain Payload

{
  "array": [ "value0", "value1" ]
}

Disclosure for the first element

[ "{salt}", "array[0]", "value0" ]

Payload with _sd

{
  "_sd": [ "{digest of the Disclosure}" ],
  "array": [ null, "value1" ]
}
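
To make the index-in-claim-name idea concrete, here is a minimal sketch of how an issuer might produce the payload above. It assumes the usual SD-JWT pattern of hashing the base64url-encoded disclosure with SHA-256; the helper names (`b64url`, `make_disclosure`, `hide_array_element`) are made up for illustration.

```python
import base64
import hashlib
import json
import secrets


def b64url(data: bytes) -> str:
    """base64url without padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_disclosure(name, value, salt=None):
    """Build [salt, name, value] and return (disclosure, digest)."""
    if salt is None:
        salt = b64url(secrets.token_bytes(16))
    disclosure = b64url(json.dumps([salt, name, value]).encode("utf-8"))
    digest = b64url(hashlib.sha256(disclosure.encode("ascii")).digest())
    return disclosure, digest


def hide_array_element(payload, key, index):
    """Replace payload[key][index] with null and record its digest in "_sd"."""
    out = dict(payload)
    arr = list(out[key])
    # The element index is encoded into the claim name, e.g. "array[0]".
    disclosure, digest = make_disclosure(f"{key}[{index}]", arr[index])
    arr[index] = None
    out[key] = arr
    out["_sd"] = list(out.get("_sd", [])) + [digest]
    return out, disclosure


sd_payload, disclosure = hide_array_element({"array": ["value0", "value1"]}, "array", 0)
print(json.dumps(sd_payload, indent=2))
```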
jogu commented 1 year ago

The nationalities example is kind of interesting - if I've understood the proposals, all of them require that disclosing one of your nationalities requires disclosing that you have more than one nationality. i.e. if I'm French & Iranian, I think there's no way to selectively disclose "I'm a French national" without disclosing "I also have another nationality" because [even if the nationality claim was not previously disclosed] disclosing one element will disclose the number of elements in the array?

TakahikoKawasaki commented 1 year ago

@jogu I suppose that decoy digests can be used for the concern.
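
A decoy digest is a value that looks like a real digest but has no matching disclosure, so a verifier cannot tell how many real elements exist. The following is only a sketch of that idea, assuming SHA-256 digests in unpadded base64url; `decoy_digest` and `pad_with_decoys` are hypothetical helper names.

```python
import base64
import hashlib
import secrets


def decoy_digest() -> str:
    """Hash fresh random bytes so the result is indistinguishable from a real digest."""
    raw = hashlib.sha256(secrets.token_bytes(32)).digest()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def pad_with_decoys(digests, target_len):
    """Append decoys until the list has at least `target_len` entries."""
    padded = list(digests)
    while len(padded) < target_len:
        padded.append(decoy_digest())
    return padded


# Two real digests (placeholders here) padded to five entries.
print(pad_with_decoys(["real-digest-1", "real-digest-2"], 5))
```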

danielfett commented 1 year ago

Taka's idea would nicely work around two problems that my proposals above have:

I would like to entertain a slightly modified version of Taka's proposal:

  1. To avoid string manipulation, escaping problems, and a temptation to use jsonpath or similar, the disclosures would not encode the element index in the key name, but separately - like this:
[ "{salt}", ["array", 2], "value0" ]

Admittedly, this would introduce polymorphism, but it would not be much worse than with my approach above.

  2. I would not allow non-SD plaintext values in the array in the SD-JWT, but for simplicity constrain arrays to be either "always disclosed" or "always SD". The array in the SD-JWT would just be omitted:
{
  "_sd": [ "{digest of the Disclosure}" ]
}
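
A verifier for this variant would have to rebuild arrays from disclosures whose key is a [name, index] pair. The sketch below shows one way that could look; it assumes SHA-256 over the base64url-encoded disclosure, and all function names are hypothetical.

```python
import base64
import hashlib
import json


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def digest_of(disclosure: str) -> str:
    return b64url(hashlib.sha256(disclosure.encode("ascii")).digest())


def encode_disclosure(salt, key, value):
    return b64url(json.dumps([salt, key, value]).encode("utf-8"))


def apply_disclosures(payload, disclosures):
    """Rebuild claims (and arrays) from disclosures listed in the payload's "_sd"."""
    allowed = set(payload.get("_sd", []))
    result = {k: v for k, v in payload.items() if k != "_sd"}
    elements = {}  # array name -> {index: value}
    for d in disclosures:
        if digest_of(d) not in allowed:
            continue  # not bound to this payload; a real verifier may reject instead
        _salt, key, value = json.loads(base64.urlsafe_b64decode(d + "=" * (-len(d) % 4)))
        if isinstance(key, list):  # ["array", index]: an array element
            name, index = key
            elements.setdefault(name, {})[index] = value
        else:  # plain claim name
            result[key] = value
    for name, by_index in elements.items():
        # Keep the original relative order of the disclosed elements.
        result[name] = [by_index[i] for i in sorted(by_index)]
    return result


d0 = encode_disclosure("salt0", ["array", 0], "value0")
d1 = encode_disclosure("salt1", ["array", 1], "value1")
payload = {"_sd": [digest_of(d0), digest_of(d1)]}
print(apply_disclosures(payload, [d1]))  # {'array': ['value1']}
```

The `isinstance(key, list)` branch is the polymorphism mentioned above: the second disclosure element is either a string or a [name, index] pair.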
TakahikoKawasaki commented 1 year ago

Encoding an array name and an index into an array like ["array", 2] seems a good idea.

bc-pi commented 1 year ago

Daniel's slightly modified version of Taka's proposal seems like the way to go.

danielfett commented 1 year ago

... and I'm just working on implementing this in the SD-JWT reference implementation.

danielfett commented 1 year ago

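While implementing this I noticed a problem with the recently chosen approach: This approach relies on the fact that an array must be nested within an object. This means that the approach does not work for top-level arrays (without defining another exception, like a "null" array name), and that it does not work for arrays nested in arrays.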

@TakahikoKawasaki Did you encounter any of these problems? What are your thoughts?

I'm thinking about going back to the prefix variant, which suddenly seems much more attractive :-)

TakahikoKawasaki commented 1 year ago

@danielfett Good points. I didn't imagine the cases you mentioned. 😅

Top-level Array

If you mean the data like below,

[ "apple", "banana", "cherry" ]

the following JSON for disclosure may work by using null as the array name

[ "<salt>", [ null, 0 ], "apple" ]

and creating an outer JSON object like below.

{
  "_sd": [ "<digest0>" ]
}

However, this approach would make it difficult, if not impossible, to judge whether the outer JSON object has existed from the beginning or has been added just for the top-level anonymous array.

Therefore, my gut feeling tells me that "_sd" for array elements should exist in the array whichever approach (an additional "_sd" element or _sd: prefix) is used.

Nested Array

This needs special considerations. At least, it seems difficult to create disclosures and "_sd" arrays for the following data with the ["salt",["array-name",index],value] approach...

[ [ "apple", "banana", "cherry" ] ]

A conclusion is that discussions need to continue. 😅

bc-pi commented 1 year ago

the approach does not work for top-level arrays (without defining another exception, like a "null" array name), the approach does not work for arrays nested in arrays

Offhand, I feel like those could be considered acceptable limitations and that allowing for SD in top-level arrays or inside arrays nested in arrays isn't necessary.

danielfett commented 1 year ago

Thanks @TakahikoKawasaki! I think I agree with your conclusions.

@bc-pi I'd like to add that I'm actually less concerned about the restrictions themselves but feel that they (and the need to explain them) are strong indicators of a less-than-ideal approach.

bc-pi commented 1 year ago

indicators of a less-than-ideal approach.

That's a fair/good point :) All the approaches thus far have had (in my mind anyway) some indicators of being less-than-ideal though. I'm not sure there's an obvious "best" one, so I'm not necessarily advocating for one over the other. I'm just trying to "contribute" to the discussions.

danielfett commented 1 year ago

I implemented the prefix solution now, but I introduced a mechanism to avoid one concern that I had. The basic idea is to use arrays like this:

  "nationalities": [
    "_sd:Q7R_-cBP9LWCq9At1XWNRZyLTFHOr0S9fLcXQjyBgH4",
    "_sd:o9qCZPD-_n0pa9nH_sBxtVKXuDyx1ALQjzYPrOJ3p4s",
    "DE"
  ],

(Here, the third element is non-SD.)

The main concern that I had with this solution is that there may be conflicts. If somebody has data where _sd can appear as part of the data (for example, user-supplied data), the processing will not be correct. Therefore I propose to use _sd: as the default prefix, but to allow definition of a different prefix by adding the top-level key _sd_arr_pfx. For example, the prefix can consist of a nonce or another string that is guaranteed to not appear in real data items. This approach is similar to boundaries in multipart MIME. (There is no need to use _sd_arr_pfx if there are no arrays with SD or if the default prefix is used.)

The following is a full SD-JWT payload using the new array feature and defining _sdx: as a prefix:

{
  "_sd": [
    "sGmV2tSLHmJScETevXgTQ-bM7O5ZnQuu-ypqI2vB-JU"
  ],
  "iss": "https://example.com/issuer",
  "iat": 1683000000,
  "exp": 1883000000,
  "sub": "john_doe_42",
  "nationalities": [
    "_sdx:Q7R_-cBP9LWCq9At1XWNRZyLTFHOr0S9fLcXQjyBgH4",
    "_sdx:o9qCZPD-_n0pa9nH_sBxtVKXuDyx1ALQjzYPrOJ3p4s",
    "DE"
  ],
  "is_over": {
    "_sd": [
      "NKZs2QqvniVtS3k-YXxMag_PiyUQizlgdsXgfIEWZcs",
      "hLJKFgko4IvkO_R8lbX3xNRcaEo0t0awFMnrO0dXdvg",
      "hN7ybNRpz_UIZAH4rPTNl_c07JyUQtzHlAwuyVrsQgs"
    ]
  },
  "addresses": [
    {
      "street": "123 Main St",
      "city": "Anytown",
      "state": "NY",
      "zip": "12345",
      "type": "main_address"
    },
    "_sdx:TDx4IHvi4gxmIGbEKZa4AM6PYRIHtP5VxjraME72Nh8"
  ],
  "null_values": [
    null,
    "_sdx:YImgtY5gfEpLKDA8PQ93hkUCeAL0lz-UKsnK1IGJHFo",
    "_sdx:duZZCCrTo-ROiWT8uEpPkgu_XnpsIWDtXhOqSOJ1EEo",
    null
  ],
  "data_types": [
    "_sdx:llHOoLZriL1NCNBOB3lLj5cFhZS2I4UvXeWIofOSyzU",
    "_sdx:5h5X-YP38eRr7yS1sydUGJzbTXQiYoZy2CELGTpy63w",
    "_sdx:ZcGhdemizhDvOKiM3huX69MO5MJ3k_6N4TDADYi1KI8",
    "_sdx:MtEhyiQLsysJR9x6XgGPo2AS_audhRVXEZ3GsNVM30o",
    "_sdx:auIGvdZGiGSzFmMXxM2ErtbN-5h-y0BIeFsl_aDEN48",
    "_sdx:HTw5A7z-pzJ8RI37pC9Z2-1IyM-ZjVYG-iUkhpV4Ahw",
    "_sdx:0MZET02ximXt6FxwBPOsEBUxuo_OBlNxQsfmRiatBeY"
  ],
  "nested_array": [
    [
      "_sdx:xlEWQP4kR9414Kyp5YqMNOBFMlcTa4zqR8ueeSVUCbs",
      "_sdx:hcy7DI4AiQaPPTw40V6NYGzllLikLeyifv43a7SaX6w"
    ],
    [
      "_sdx:czFUZij2d-W1nAOU68i6khwwphueeOyTwSJDCmh7gIk",
      "_sdx:nLEUS7NSV2EUExiiET2itDjWi6dzV7re0Btcf0omUso"
    ]
  ],
  "array_with_recursive_sd": [
    "boring",
    "_sdx:uaf1fEDM93C4zWX_PZlGCgcfkEgvMpEJUCCFozWCccY",
    [
      "_sdx:DWRkboVZ-cTya_WGt0-vaaQVmETozAtFip68mxU1Z0I",
      "_sdx:RfQ40cSzxPe494mtOEbWjgA-ymegpiBiPyGTz2dyS4M"
    ]
  ],
  "_sd_alg": "sha-256",
  "cnf": {
    "jwk": {
      "kty": "EC",
      "crv": "P-256",
      "x": "TCAER19Zvu3OHF4j4W4vfSVoHIP1ILilDls7vCeGemc",
      "y": "ZxjiWWbZMQGHVWKVQ4hbSIirsVfuecCE6t4jT9F2HZQ"
    }
  },
  "_sd_arr_pfx": "_sdx:"
}
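
For illustration, processing of prefixed array entries with an optional `_sd_arr_pfx` could look roughly like the sketch below. It is not taken from the reference implementation; `unpack_prefixed` and the `disclosed` map (digest to value, assumed already verified) are hypothetical.

```python
DEFAULT_PREFIX = "_sd:"


def unpack_prefixed(payload, disclosed):
    """Replace prefixed digests in arrays by disclosed values; drop the rest."""
    prefix = payload.get("_sd_arr_pfx", DEFAULT_PREFIX)

    def walk(node):
        if isinstance(node, list):
            out = []
            for entry in node:
                if isinstance(entry, str) and entry.startswith(prefix):
                    digest = entry[len(prefix):]
                    if digest in disclosed:
                        # Disclosed values may themselves contain SD arrays.
                        out.append(walk(disclosed[digest]))
                else:
                    out.append(walk(entry))
            return out
        if isinstance(node, dict):
            return {k: walk(v) for k, v in node.items()}
        return node

    result = walk(payload)
    result.pop("_sd_arr_pfx", None)  # the prefix declaration is only metadata
    return result


example = {
    "nationalities": ["_sdx:d1", "_sdx:d2", "DE"],
    "tags": ["_sd:user-supplied"],  # not touched: the prefix in use is "_sdx:"
    "_sd_arr_pfx": "_sdx:",
}
print(unpack_prefixed(example, {"d1": "US"}))
# {'nationalities': ['US', 'DE'], 'tags': ['_sd:user-supplied']}
```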

The code is in the sd-jwt repo, this is not finished yet but you should be able to play with the examples: https://github.com/danielfett/sd-jwt/pull/4

Let me know what you think!

TakahikoKawasaki commented 1 year ago

Where to embed "_sd_arr_pfx" (and "_sd_alg") in the case of a top-level anonymous array?

danielfett commented 1 year ago

That case is so far not covered by the spec. I think that might be acceptable.

bc-pi commented 1 year ago

Regular old JWT itself requires the JWS/JWE payload be a JSON object so that probably is acceptable (even with trying to be more accommodating to arbitrary JSON payloads).

TakahikoKawasaki commented 1 year ago

Another idea came up (brainstorming). Converting any element to a map which contains only "_sd" as a key. This approach conflicts with the current "_sd" array approach. I've not examined yet whether this approach would be able to work for all cases, though. Just showing an idea.

Example 1

["A", "B"]

⬇️

[
  {"_sd":"digest of [salt, 0, A]"},
  {"_sd":"digest of [salt, 1, B]"}
]

Example 2

["A", [ "B", "C" ]]

⬇️

[
  {"_sd":"digest of [salt, 0, A]"},
  [
    {"_sd":"digest of [salt, 0, B]"},
    {"_sd":"digest of [salt, 1, C]"}
  ]
]

Example 3

{
  "a": "A",
  "b": [
    "B",
    {"c": "C"},
    {"d": "D"},
    ["E"],
    ["F"]
  ]
}

⬇️

{
  {"_sd":"digest of [salt, a, A]"},
  "b": [
    {"_sd":"digest of [salt, 0, B]"},
    {
      {"_sd":"digest of [salt, c, C]"}
    },
    {"_sd":"digest of [salt, 2, {d:D}]"},
    [
      {"_sd":"digest of [salt, 0, E]"}
    ],
    {"_sd":"digest of [salt, 4, [F]]"}
  ]
}
TakahikoKawasaki commented 1 year ago

Sorry, the above examples are malformed as JSON, probably.

TakahikoKawasaki commented 1 year ago

I meant the following is wrong.

{
  {"_sd": "digest"}
}

In the case of a JSON object, an "_sd" array would work. In the case of array elements, a JSON object containing an "_sd" key would work.

{
  "a": "A",
  "b": ["B"]
}
{
  "_sd": [
    "digest of [salt, a, A]"
  ],
  "b": [
    {"_sd":"digest of [salt, 0, B]"}
  ]
}
TakahikoKawasaki commented 1 year ago

Or, it may be possible to make "_sd" be always an array.

{
  "_sd": [
    "digest of [salt, a, A]"
  ],
  "b": [
    {
      "_sd": [
        "digest of [salt, 0, B]"
      ]
    }
  ]
}
danielfett commented 1 year ago

I implemented the solution using objects {"_sd": "digest"} instead of strings "_sd:digest" as well now and I must say that both are very similar. When iterating through an array, it is slightly easier to check for a string prefix ("_sd:...") than to check that an entry is a single-key object whose _sd value is a string rather than an array, which is needed in order to avoid confusion with an object containing SD'd keys.

What I like about this solution is that it doesn't involve string handling and it most likely doesn't need the _sd_arr_pfx construction to avoid conflicts. First of all, we're using the already defined "_sd", and second, we're only using it as a key (which is generally less likely to contain highly variable or user-supplied data).

For the disclosures, I think it makes sense to stick to a two-element format (["eluV5Og3gSNII8EYnsxA_A", "CA"]). As Brian pointed out yesterday, we should keep the Disclosures short. And we don't need the position in the array if the position is already encoded in the SD-JWT itself.
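
A minimal sketch of creating such a two-element disclosure and the matching array placeholder follows. It assumes the digest is SHA-256 over the base64url-encoded disclosure, as for object claims; the function name and its `key` parameter are hypothetical.

```python
import base64
import hashlib
import json
import secrets


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def array_element_placeholder(value, salt=None, key="_sd"):
    """Return (disclosure, placeholder) for one array element.

    The disclosure is [salt, value]; no claim name or index is included
    because the position is given by where the placeholder sits in the array.
    """
    if salt is None:
        salt = b64url(secrets.token_bytes(16))
    disclosure = b64url(json.dumps([salt, value]).encode("utf-8"))
    digest = b64url(hashlib.sha256(disclosure.encode("ascii")).digest())
    return disclosure, {key: digest}


disclosure, placeholder = array_element_placeholder("CA", salt="eluV5Og3gSNII8EYnsxA_A")
nationalities = [placeholder, "DE"]  # one SD element, one plain element
print(disclosure)
print(nationalities)
```

Because the digest is computed over the encoded string, the verifier hashes the disclosure exactly as received and never needs to re-serialize the JSON.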

Here is a full example:

{
  "_sd": [
    "sGmV2tSLHmJScETevXgTQ-bM7O5ZnQuu-ypqI2vB-JU"
  ],
  "iss": "https://example.com/issuer",
  "iat": 1683000000,
  "exp": 1883000000,
  "sub": "john_doe_42",
  "nationalities": [
    {
      "_sd": "i7eKdHc_ZMOnhiyu3TJj5GVDQ7ZwJOMXFD3XgUbo8GQ"
    },
    {
      "_sd": "usWXFPKaqKMreTrj72QD24wB8xc7lQ4zCnrnn8ZRVeo"
    },
    "DE"
  ],
  "is_over": {
    "_sd": [
      "2ovMJR_ZNMB6ngFK3SUQnRIgyM548DzR7tJFTO-ZzBM",
      "CeVqxVUVHpva5Xp0X-NeUvhixjDYp7PTZ4BaFWGXUek",
      "dg1pBJV-dABilqD2RYiG8z4gRtuDFdRBdlwHgdLFEx8"
    ]
  },
  "addresses": [
    {
      "street": "123 Main St",
      "city": "Anytown",
      "state": "NY",
      "zip": "12345",
      "type": "main_address"
    },
    {
      "_sd": "RNWcxPD8A1ZhAm6_wAiJSoSzIRb_w1QUaKGvS240K-Y"
    }
  ],
  "null_values": [
    null,
    {
      "_sd": "hhB5pziS4s0dSx0kql31vDtuo3JVDfB4VZ-YHcj2A9M"
    },
    {
      "_sd": "o_VFRluA190wrH5E1yr2r39UyTnx3-m3qPREikSr6Qo"
    },
    null
  ],
  "data_types": [
    {
      "_sd": "nY72P6V5uHQe-BYkwYj-paG2y3fmj614FKQQhhk6T1E"
    },
    {
      "_sd": "zt7kWPtZTpMYKPoaQd-L71L-aKYMYYNLOFOf-yH3uLY"
    },
    {
      "_sd": "K1yxHJ4z10JKd2jRmQuziCym3D1oXB0NaFVLHEOv8XM"
    },
    {
      "_sd": "yr-1NDhAaFYPvLrAzvdFfBwRJS_wn199JX0adDYa6Ak"
    },
    {
      "_sd": "NoTOTjWq1_cYu3kfQKh3jWrx9OLSIIdhYX0_92-RD-Y"
    },
    {
      "_sd": "biBLCP424CoDYTpBmden-zGmYOdE0GSHlerSaoYeQZ0"
    },
    {
      "_sd": "_z-We_gbvKo84jpuhBQS9v9yVhDo2--FCDNWMHMezUQ"
    }
  ],
  "nested_array": [
    [
      {
        "_sd": "FbJ_W_M-Gl9rMUR8fcsMFdiHV-qEabiT-u9eHvNKSAA"
      },
      {
        "_sd": "zo6muzFQJ9UCeFuy3Dq_YInQzLGimJVIztHGntWVxw0"
      }
    ],
    [
      {
        "_sd": "n-TOQDur9EA2k9G_VVqlvkOYCzIFb28LKA99IaQfFt8"
      },
      {
        "_sd": "xvJ7NwhRY93UqhcqVKF-Ap7HwZpKe1raEWZg_WozBBs"
      }
    ]
  ],
  "array_with_recursive_sd": [
    "boring",
    {
      "_sd": "KIK9FOQ3C-jLxGW9oRYTL-AETF3eGolP8lyVRVFOqX8"
    },
    [
      {
        "_sd": "uN8DYtT68Do3MAO9deTagWZx-akgd6DmzI4x9xFN7bs"
      },
      {
        "_sd": "F5STX6452Aw9VQyFh5vclX-SlUAuu_r_ax-ow35e4Jw"
      }
    ]
  ],
  "_sd_alg": "sha-256",
  "cnf": {
    "jwk": {
      "kty": "EC",
      "crv": "P-256",
      "x": "TCAER19Zvu3OHF4j4W4vfSVoHIP1ILilDls7vCeGemc",
      "y": "ZxjiWWbZMQGHVWKVQ4hbSIirsVfuecCE6t4jT9F2HZQ"
    }
  }
}

danielfett commented 1 year ago

And here is another example showing both an SD'd array element and a single-key object with SD:

{
  "iss": "https://example.com/issuer",
  "iat": 1683000000,
  "exp": 1883000000,
  "addresses": [
    {
      "street": "123 Main St",
      "city": "Anytown",
      "state": "NY",
      "zip": "12345",
      "type": "main_address"
    },
    {
      "_sd": "k63-wMoGu03I9dCyrrNnB0ncOXLZhYaA_Q4lCKFIWcU"
    }
  ],
  "array_with_one_sd_object": [
    {
      "_sd": [
        "1R6ziZ1b4uvXf4-DuKx0JSDRoeVTGrzJldw7Jgqac3Q"
      ]
    }
  ],
  "_sd_alg": "sha-256"
}
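
The distinction in this example (a string under "_sd" marks an array element, an array under "_sd" marks an object with SD'd keys) can be sketched as below. This is only an illustration: `is_sd_array_element`, `unpack` and the `disclosed` map are hypothetical, and object-level "_sd" arrays are deliberately left untouched.

```python
def is_sd_array_element(entry) -> bool:
    """True for {"_sd": "<digest>"}; False for {"_sd": ["<digest>", ...]}."""
    return isinstance(entry, dict) and len(entry) == 1 and isinstance(entry.get("_sd"), str)


def unpack(node, disclosed):
    """Resolve array-element placeholders; `disclosed` maps digest -> value."""
    if isinstance(node, list):
        out = []
        for entry in node:
            if is_sd_array_element(entry):
                if entry["_sd"] in disclosed:
                    out.append(unpack(disclosed[entry["_sd"]], disclosed))
                # undisclosed placeholders are dropped in this sketch
            else:
                out.append(unpack(entry, disclosed))
        return out
    if isinstance(node, dict):
        # Object-level "_sd" arrays would be handled by the existing rules.
        return {k: unpack(v, disclosed) for k, v in node.items()}
    return node


payload = {
    "addresses": [{"city": "Anytown"}, {"_sd": "digest-a"}],
    "array_with_one_sd_object": [{"_sd": ["digest-b"]}],
}
print(unpack(payload, {"digest-a": {"city": "Othertown"}}))
```

In this run the second address is restored, while the single-key object in `array_with_one_sd_object` is kept as an ordinary object because its "_sd" value is an array.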
bc-pi commented 1 year ago

Would you say you prefer the object {"_sd": "digest"} based approach over the string "_sd:digest" based approach, @danielfett? It does seem "more correct" to me. And not needing the _sd_arr_pfx construction is a plus. Looking at the examples you have, I must admit that I don't love the aesthetics of it. But that shouldn't be part of the criteria for choosing.

using the already defined "_sd"

This is true and I agree that using a special key name is cleaner than a string prefix and much better avoids conflict. However, we'd need to be a little careful with its treatment in the text. There's a lot of current text that would need to be adjusted to allow for the _sd key to have different syntax and semantics based on where it is in the JSON. This sentence and these steps are just two examples. Which might be tricky and risky. We might want to use a new special key name. Maybe _sda or _sdae or something (roughly for selectively disclosable array element but I'm just throwing out ideas). A different key might also be easier to see the difference when just looking at the JSON.

danielfett commented 1 year ago

The {"_sd": "digest"} feels cleaner, the "_sd:digest" looks a bit better. But I have a slight preference for the first one. What do @TakahikoKawasaki @Sakurann think?

I think that using a different key makes sense. It is easier to implement ("check for _sde" instead of "check for a single-element object, where _sd refers to a string not an array") and we avoid polymorphism (_sd is always an array, _sde is always a string).

Now for the bikeshedding part:

Any other ideas? I think I like ... and _se most.

bc-pi commented 1 year ago

Any other ideas?

just more bikeshedding but perhaps _sdi or _si where the i is loosely for item in the array

Sakurann commented 1 year ago

I vote for {"_sd": "digest"} because it avoids the concern we had earlier about collisions with the _sd: prefix and the potential need to introduce "_sd_arr_pfx". It also feels cleaner and, from Daniel's experience, is easier to check for _sd.

Sakurann commented 1 year ago

I also like ... don't want to think every single time what _sa, _si, _sde etc stands for and other characters proposed feel like might cause conflicts. though _: looks cute..

danielfett commented 1 year ago

Pull request for spec text: #283

Sakurann commented 1 year ago

PR merged