w3c / vc-data-integrity

W3C Data Integrity Specification
https://w3c.github.io/vc-data-integrity/
Other
41 stars 18 forks source link

Contradicting proof `created` and `expires` normative statements. #294

Closed PatStLouis closed 1 month ago

PatStLouis commented 1 month ago

Both of these fields denote the following:

if included, MUST be specified as an [XMLSCHEMA11-2] dateTimeStamp string, either in Universal Coordinated Time (UTC), denoted by a Z at the end of the value, or with a time zone offset relative to UTC.

Followed by:

Time values that are incorrectly serialized without an offset MUST be interpreted as UTC.

These are 2 contradicting statements, as per the [XMLSCHEMA11-2] dateTimeStamp specification:

[Definition:] The dateTimeStamp datatype is ·derived· from dateTime by giving the value required to its explicitTimezone facet. The result is that all values of dateTimeStamp are required to have explicit time zone offsets and the datatype is totally ordered.

I suggest to remove the latter statement.

msporny commented 1 month ago

I suggest to remove the latter statement.

-1 because that statement came out of an Internationalization WG review and if we removed it, we'd have to ask them for another review, which could take months. In addition, it's good advice, which I'll try to explain below:

Time values that are incorrectly serialized without an offset MUST be interpreted as UTC.

This was added to address the "Yes, but what happens when a developer makes a mistake and a verifier still wants to fix/consume the value?" I realize that doing so presumes that you're going to ignore the first MUST statement, but sometimes people do that, and if they do that, we provide clear guidance on how values w/o zone offsets (even though they are wrong) will be interpreted... which improves interoperability.

A couple of things we could do here:

  1. Change the MUST in the 2nd statement to will (so the test suite doesn't have to cover it).
  2. Explain why we make that statement so it's not as confusing as it is now.

Thoughts?

dlongley commented 1 month ago

@msporny,

I think we can cover both "couple of things" by changing the 2nd sentence to:

"If time values that are incorrectly serialized without an offset are consumed, they are expected to be interpreted as UTC."

PatStLouis commented 1 month ago

I would like a statement that makes it clear that an implementer can still decide to reject a non conformant proof.

If time values that are incorrectly serialized without an offset are consumed, they MAY be interpreted as UTC.

dlongley commented 1 month ago

@PatStLouis,

I would like a statement that makes it clear that an implementer can still decide to reject a non conformant proof.

Hmm, that's what I was trying to say by "If... are consumed". I think we want a "MAY" on their consumption, but if they are consumed, the interpretation needs to be as UTC. Maybe we use the use "accepted anyway" instead of "consumed"?

"If time values that are incorrectly serialized without an offset are accepted anyway, they are expected to be interpreted as UTC."

EDIT: And maybe drop "expected" to simplify the language:

"If time values that are incorrectly serialized without an offset are accepted anyway, they are to be interpreted as UTC."

dlongley commented 1 month ago

Perhaps even more clear:

"If time values that were incorrectly serialized without an offset are accepted by a consumer anyway, they are to be interpreted as UTC."

This makes clear that we're talking about consumers of the information (not authors) and that the serialization was incorrectly made in the past (by the author).

PatStLouis commented 1 month ago

+1 for @dlongley suggestion

PatStLouis commented 1 month ago

Here's something that combines both: "A consumer MAY chose to accept time values that were incorrectly serialized without an offset. Incorrectly serialized time values without an offset are to be interpreted as UTC."

BigBlueHat commented 1 month ago

FWIW, this is also in the VCDM:

Time values that are incorrectly serialized without an offset MUST be interpreted as UTC. https://w3c.github.io/vc-data-model/#validity-period:~:text=Time%20values%20that%20are%20incorrectly%20serialized%20without%20an%20offset%20MUST%20be%20interpreted%20as%20UTC.

Though as yet, untested: https://github.com/w3c/vc-data-model-2.0-test-suite/blob/main/tests/10-vcdm2.js#L632-L640

PatStLouis commented 1 month ago

@BigBlueHat Interesting, so a big difference that I see is that the sentence prior to that statement in VCDM is a SHOULD whereas here its a MUST:

In order to reduce misinterpretations between different time zones, all time values expressed in conforming documents SHOULD be specified in dateTimeStamp format, either in Universal Coordinated Time (UTC), denoted by a Z at the end of the value, or with a time zone offset relative to UTC.

However, the prior validFrom statements is a MUST:

If present, the value of the validUntil property MUST be an [XMLSCHEMA11-2] dateTimeStamp string value

So there's a MUST, and afterwards this same statement is overwritten by a SHOULD.

AFAIK XMLSCHEMA11-2 requires the offset

msporny commented 1 month ago

so a big difference that I see is that the sentence prior to that statement in VCDM is a SHOULD whereas here its a MUST

Good catch @PatStLouis, we definitely need to fix that.

PatStLouis commented 1 month ago

To provide more feedback around why this is an issue to me:

As an implementer of this specification, I configured my software around the data model specified here. I strive to issue documents which are conformant to this specification, and consume documents which are also conformant to this specification. The specification requires validFrom and validUntil to be formatted in a very specific manner, as outlined by [XMLSCHEMA11-2]. The specification is also requiring me to accept time values which aren't formatted according to this requirement. It puts me, the implementer, in a position where I can conform to the first statement or the latter.

PatStLouis commented 1 month ago

I think clarifying that A) Accepting incorrectly serialized time values applies to the consumer. B) The consumer MAY chose to accept them, but is not required to do so. C) If they do chose to accept them, the UTC is to be the interpreted timezone.

Would solve this issue.

Claiming that Time values that are incorrectly serialized without an offset MUST be interpreted as UTC. says that the [XMLSCHEMA11-2] is no longer applicable and leaves the formatting in a ambiguous state.

msporny commented 1 month ago

It puts my, the implementer, in a position where I can conform to the first statement or the latter.

Yes, that is a valid concern and we should fix the issue by making it clear that we expect conforming documents to follow the spec, but implementers can choose to consume values that don't follow the spec (but when doing so will interpret the non-conformant value in the same way).

There are variations of that text suggested above that will work. I think we're ready for a PR here if anyone wants to make an attempt at raising it.

PatStLouis commented 1 month ago

I'd also like to add that, in a case where an issuer wants to issue VC based on existing data which doesn't have offset, they should then also consider this statement and add the UTC offset to the data value they are issuing prior to adding the proof. I don't know if this was also something originally intended to be covered by that statement.

We had a situation where original documents only had a date as the issuance value, and we had to default to UTC timestamp since no other information was provided in the original document.

msporny commented 1 month ago

they should then also consider this statement and add the UTC offset to the data value they are issuing prior to adding the proof.

What a conforming processor does before encoding the information is out of scope. They can do anything they want as long as they encode the value appropriately. I don't think we should get into saying all of the things they can do (as it's a very large set of things). We generally try to keep our normative guidance focused on the conformance classes in the specification.

msporny commented 1 month ago

PR #295 has been raised to address this issue. This issue will be closed once PR #295 has been merged.

msporny commented 1 month ago

PR #295 has been merged, closing.