oras-project / artifacts-spec

Date format for `io.cncf.oras.artifact.created` #111

Open SteveLasker opened 2 years ago

SteveLasker commented 2 years ago

The artifacts-spec defines the io.cncf.oras.artifact.created annotation for date/time sorting of artifacts, based on a created date.

io.cncf.oras.artifact.created date and time on which the artifact was created (string, date-time as defined by RFC 3339)

The annotation is currently defined using RFC 3339.

This issue re-opens the discussion of whether there are valid reasons to switch to another format, such as:

The created date will be used by the referrers API to sort (asc or desc) references. For instance, a list of scan results could be sorted to get the first (original), or last (current). A list of annotations may be added, where the same value is set to a new value, over time. A user may want the original or latest value of the annotation.

michaelb990 commented 2 years ago

RFC 3339 describes the Internet Date/Time Format which conforms to ISO 8601. What is this issue proposing to change?

https://datatracker.ietf.org/doc/html/rfc3339#section-5.6

huanwu commented 2 years ago

The backend store for the manifest (Azure Cosmos DB) doesn't support datetime types directly, so datetimes are stored as strings. The only recommended format for DateTime strings in Azure Cosmos DB is yyyy-MM-ddTHH:mm:ss.fffffffZ, which follows the ISO 8601 UTC standard. Converting the date strings to this format will allow sorting dates lexicographically.

https://docs.microsoft.com/en-us/azure/cosmos-db/sql/working-with-dates
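
To make the fixed-width format concrete, here is a minimal Python sketch that renders a timestamp as yyyy-MM-ddTHH:mm:ss.fffffffZ. The helper name `to_cosmos_format` is hypothetical and is not part of any registry or Cosmos DB API. Python's `datetime` only carries microseconds (6 digits), so the sketch assumes padding a trailing zero is acceptable to reach 7 fractional digits.

```python
from datetime import datetime, timezone


def to_cosmos_format(dt: datetime) -> str:
    """Render an aware datetime as yyyy-MM-ddTHH:mm:ss.fffffffZ in UTC.

    Hypothetical helper for illustration only.
    """
    if dt.tzinfo is None:
        raise ValueError("naive datetime: a timezone is required")
    utc = dt.astimezone(timezone.utc)
    # datetime stores 6 fractional digits; pad one zero to reach 7.
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond:06d}0Z"


stamps = [
    to_cosmos_format(datetime(2022, 7, 12, 2, 0, tzinfo=timezone.utc)),
    to_cosmos_format(datetime(2022, 7, 12, 0, 22, 0, 500000, tzinfo=timezone.utc)),
    to_cosmos_format(datetime(2022, 7, 12, 1, 0, tzinfo=timezone.utc)),
]

# Every string has the same width and field order, so a plain string
# sort yields chronological order.
print(sorted(stamps))
```

Because every field is zero-padded, has a fixed width, and runs from most to least significant, comparing these strings character by character gives the same result as comparing the instants they represent.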

There's no clear documentation on whether RFC 3339 timestamps are lexicographically sortable. It might work in most cases.

michaelb990 commented 2 years ago

Maybe I'm missing something. Can someone explain (or point me to) what the difference is?

RFC 3339 says:

The following profile of ISO 8601 [ISO8601] dates SHOULD be used in new protocols on the Internet.

huanwu commented 2 years ago

I referred to this doc to compare ISO 8601 and RFC 3339: https://ijmacd.github.io/rfc3339-iso8601/#:~:text=RFC%203339%20is%20case%2Dinsensitive,the%20standard%20allows%20arbitrary%20precision.

At least the following two points will make the sort result a little different.

We need to follow the yyyy-MM-ddTHH:mm:ss.fffffffZ format exactly, since it is lexicographically sortable, to get an accurate sorting result.

SteveLasker commented 2 years ago

@huanwu, so you're mostly focused on the diff between T/t and Z/z? What does sorting look like when T and t are interchanged? Are we expecting other characters, as opposed to numbers, to fit in the middle and change the ordering? Or is it the sorting of different times, where the T and t are jumbled?

huanwu commented 2 years ago

Yes, the concern is T and Z. It's not only T/t and Z/z; it also allows other characters to replace T. In your sample:

2022-07-12T00:22 - 1
2022-07-12T01:00 - 2
2022-07-12T02:00 - 3

->

2022-07-12T00:22 - 2
2022-07-12A01:00 - 1
2022-07-12T02:00 - 3

That means the ordering of the hour, minute, second, and millisecond fields is not honored.
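
The reordering above can be reproduced with a plain string sort. This is only an illustrative Python sketch: the variable names are made up, and the lowercase `t` case is added here as an assumption about what a case-insensitive producer might emit.

```python
# Same three instants, but the middle entry uses "A" instead of "T"
# as the date/time separator (as in the example above).
mixed_separator = [
    "2022-07-12T00:22",
    "2022-07-12A01:00",
    "2022-07-12T02:00",
]

# "A" (0x41) compares lower than "T" (0x54), so the 01:00 entry is
# placed before the 00:22 entry.
by_string = sorted(mixed_separator)
print(by_string)

# Lowercase "t" (0x74) compares higher than "T", so the earliest
# instant ends up last.
mixed_case = [
    "2022-07-12t00:22",
    "2022-07-12T01:00",
    "2022-07-12T02:00",
]
by_string_case = sorted(mixed_case)
print(by_string_case)
```

In both cases the comparison is decided at the separator character, before the hour and minute digits are ever looked at.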

michaelb990 commented 2 years ago

Does either standard guarantee that you'll be able to sort without processing?

Looking at your comparison chart, something like 20220714T092455.405000 would also be ISO8601 compliant. I think we have to either make this a specific format compliant with one or both of these standards (e.g. yyyy-MM-ddTHH:mm:ss.fffffffZ ONLY) or rely on registries parsing it and converting it into a sortable datetime representation.

SteveLasker commented 2 years ago

Thinking about how registries would index the information, I think we're discussing how it's converted to what various implementations would use. For instance, if someone wanted to use lexicographic sorting, they would need to convert the RFC 3339 format to the indexing format. That might mean converting to upper case, etc. Which, I just realized, is the same thing @michaelb990 is saying. :)
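
As a rough illustration of that conversion step, the Python sketch below takes an RFC 3339 string and rewrites it into a single canonical, sortable form. It is not taken from any registry implementation. The function name `normalize_created`, the choice of 7 fractional digits (mirroring the yyyy-MM-ddTHH:mm:ss.fffffffZ format mentioned earlier in the thread), truncation of extra precision, and acceptance of a space separator are all assumptions made for the example. Leap seconds (`:60`) are not handled and raise `ValueError`.

```python
import re
from datetime import datetime, timedelta

# Accepts upper- or lower-case T/Z, a space separator, arbitrary
# fractional precision, and numeric UTC offsets.
_RFC3339 = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})[Tt ](\d{2}):(\d{2}):(\d{2})"
    r"(?:\.(\d+))?([Zz]|[+-]\d{2}:\d{2})"
)


def normalize_created(value: str) -> str:
    """Convert an RFC 3339 timestamp to yyyy-MM-ddTHH:mm:ss.fffffffZ (UTC).

    Hypothetical helper for illustration only.
    """
    m = _RFC3339.fullmatch(value)
    if m is None:
        raise ValueError(f"not an RFC 3339 date-time: {value!r}")
    year, month, day, hour, minute, second = (int(g) for g in m.groups()[:6])

    # Fix the fraction at 7 digits: truncate extra precision, pad with zeros.
    fraction = (m.group(7) or "")[:7].ljust(7, "0")

    offset = m.group(8)
    if offset in ("Z", "z"):
        delta = timedelta(0)
    else:
        sign = 1 if offset[0] == "+" else -1
        delta = sign * timedelta(hours=int(offset[1:3]), minutes=int(offset[4:6]))

    # Offsets are whole minutes, so the fractional part is unaffected.
    utc = datetime(year, month, day, hour, minute, second) - delta
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + fraction + "Z"


raw = [
    "2022-07-12T02:00:00Z",
    "2022-07-12t00:22:00.5z",
    "2022-07-12T03:00:00+02:00",
]
index_keys = sorted(normalize_created(v) for v in raw)
print(index_keys)
```

With this approach the annotation can stay RFC 3339 on the wire, while each registry sorts on the normalized key it stores in its index.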