oasis-tcs / openc2-jadn

OASIS OpenC2 TC: Specifying a vocabulary to describe the meaning of structured data, to provide hints for user interfaces working with structured data, and to make assertions about what a valid instance must look like. https://github.com/oasis-tcs/openc2-jadn

String Formats, Patterns and Min/Maxv #69

Closed maroberts82 closed 3 months ago

maroberts82 commented 11 months ago

minv / maxv make sense for a plain String type, where they specify the minimum and maximum string length. When a Format or Pattern option is also applied to a String type, however, their meaning becomes ambiguous.

Use Case: A user selects String, then selects the Format Date, then enters a minv of 10 and finally enters a date pattern (YYYY-mm-dd).
Issue/question: Does minv 10 remain the minimum character count or is this the min date allowed?
Issue/question: If the user does not select a pattern, what date format is assumed?

Example: JADN ["String-Name", "String", ["/date", "{1", "}8"], ""]

XSD

<xsd:simpleType>
   <xsd:restriction base="xsd:date">  
      <xsd:minInclusive value="1979-01-01"/>
      <xsd:maxInclusive value="1979-01-08"/>
   </xsd:restriction>
</xsd:simpleType>

davaya commented 11 months ago

The minv/maxv options apply to the underlying type - for String and Binary types they are the minimum and maximum number of characters/bytes in the string / byte-string. For Integer types they are the minimum and maximum values. (For Number types a separate minf/maxf option is currently defined because the values are different. But that's a hack I'd like to get rid of and use minv/maxv for both Integer and Number types.)

For a datetime serialized as Integer milliseconds the way OpenC2 does, the min/max values would look like 1698685605000, which can be converted to a string in many formats, some of which are:
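As a rough illustration of such conversions, the Python sketch below renders the millisecond value mentioned above in a few string forms. The helper name `ms_to_utc` and the three formats chosen are assumptions made for this example; they are not the list from the original comment and nothing here is defined by JADN or OpenC2.

```python
from datetime import datetime, timezone


def ms_to_utc(ms: int) -> datetime:
    """Interpret an integer as milliseconds since the POSIX epoch, in UTC."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


moment = ms_to_utc(1698685605000)

# Three illustrative renderings of the same instant (format choices are assumptions).
as_rfc3339 = moment.strftime("%Y-%m-%dT%H:%M:%SZ")
as_date_only = moment.strftime("%Y-%m-%d")
as_us_style = moment.strftime("%m/%d/%Y %H:%M")

print(as_rfc3339, as_date_only, as_us_style, sep="\n")
```

The integer is the only value that gets validated or compared; each string is just one possible presentation of it.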

Issue/question: Does minv 10 remain the minimum character count or is this the min date allowed?

If the type is String, all of minv/maxv, pattern and /format are applied, which for many combinations results in nothing matching. In some cases like a pattern \d+, a maxv would limit the number of digits. But best practice is to never use more than one constraint option unless you know what you are doing. For String /formats, all JSON Schema semantic validation keywords are "supported" (assuming the application uses a JSON Schema validator). But as reading the spec indicates, semantic validation with /format is not a simple task. We should add the XSD date format keywords, but that won't make format processing any simpler.
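A minimal sketch of "all constraints are applied" may make the consequence concrete. It assumes that /date means an RFC 3339 full-date (YYYY-MM-DD) and that a pattern must match the whole string; `check_string` and `is_full_date` are hypothetical helpers written for this example, not part of any JADN library.

```python
import re
from datetime import datetime


def is_full_date(value: str) -> bool:
    """Assumed meaning of /date: an RFC 3339 full-date, YYYY-MM-DD."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")  # rejects e.g. month 13
    except ValueError:
        return False
    return True


def check_string(value, minv=None, maxv=None, pattern=None, fmt=None):
    """Apply every supplied constraint; the value is valid only if all pass."""
    if minv is not None and len(value) < minv:
        return False
    if maxv is not None and len(value) > maxv:
        return False
    if pattern is not None and not re.fullmatch(pattern, value):
        return False
    if fmt == "date" and not is_full_date(value):
        return False
    return True


# The example from the issue: /date with minv 1 and maxv 8.
print(check_string("1979-01-01", minv=1, maxv=8, fmt="date"))  # too long for maxv
print(check_string("19790101", minv=1, maxv=8, fmt="date"))    # not a full-date
# A pattern of \d+ with maxv simply limits the number of digits.
print(check_string("12345", maxv=8, pattern=r"\d+"))
```

Under these assumptions a full-date is always 10 characters long, so a maxv of 8 combined with /date rejects every string.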

Issue/question: If the user does not select a pattern, what date format is assumed?

For String with no pattern, anything (e.g., "abcdefg") matches since there's nothing to indicate that it's a date. For Integer with no /format, it's just any integer within the minv/maxv range if those options are present. There would need to be ten different /format options defined to support converting integers to date strings in ten different formats, which is what RFC 3339 was intended to help solve. Neither JADN nor OpenC2 currently defines any Integer /date format; it's up to the application to decide how to display a POSIX date.

maroberts82 commented 11 months ago

Thank you @davaya for your reply and feedback. If a JADN user goes down the path of an Integer type for a date and then uses the minv and maxv options to set the allowable range, this aligns well.

However, if a JADN user uses a String type with a date format, is there a way to set the allowable date range, given that minv and maxv are used as character counts for String types?

What are your thoughts on introducing date and date-time types to JADN, basically pulling them up from the string format options? Or is this too high level?

Or perhaps a mindate maxdate option when the date format is used.

davaya commented 11 months ago

@maroberts82 If it were a program (like an airline reservation web site with departure and return dates) they would always convert strings to integers for processing, and then back to strings for web display. So attempting to do any computing on date strings sounds like a non-starter to me.

If there are any libraries (ask Kaitlyn about React :-) that do date comparisons or sorting without converting to integers, I'd be shocked. But I'm willing to learn.
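The convert-then-compare approach described above can be sketched roughly as follows. The sketch assumes date strings in YYYY-MM-DD form interpreted as midnight UTC; `date_to_ms` and `in_date_range` are hypothetical names used only for illustration.

```python
from datetime import datetime, timezone


def date_to_ms(text: str) -> int:
    """Parse a YYYY-MM-DD string as midnight UTC and return epoch milliseconds."""
    parsed = datetime.strptime(text, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp()) * 1000


def in_date_range(text: str, earliest: str, latest: str) -> bool:
    """Compare dates as integers rather than as strings."""
    return date_to_ms(earliest) <= date_to_ms(text) <= date_to_ms(latest)


# Bounds taken from the XSD example earlier in this issue.
print(in_date_range("1979-01-05", "1979-01-01", "1979-01-08"))
print(in_date_range("1979-01-09", "1979-01-01", "1979-01-08"))
```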

maroberts82 commented 11 months ago

@davaya you are spot on, dates are typically converted to strings or millis on the backend.

I guess if a user wants a date range restriction, they will need to use an Integer type with a date format, and then set minv / maxv in milliseconds since the epoch (January 1, 1970).

And if they use a String type with a date format, then minv / maxv cannot be used for a date range?
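For the Integer route, a sketch of what such a definition and its range check might look like is given below. The type name `Date-Range` is hypothetical, the bounds are the XSD example dates (1979-01-01 and 1979-01-08) taken as midnight UTC in milliseconds, and the `{` / `}` option prefixes follow the JADN example earlier in this issue; the helper functions are assumptions, not a real JADN API.

```python
# Hypothetical type definition in the same shape as the example in this issue.
DATE_RANGE = ["Date-Range", "Integer", ["{283996800000", "}284601600000"], ""]


def bounds(type_def):
    """Extract minv ('{') and maxv ('}') from a type definition's option list."""
    minv = maxv = None
    for opt in type_def[2]:
        if opt.startswith("{"):
            minv = int(opt[1:])
        elif opt.startswith("}"):
            maxv = int(opt[1:])
    return minv, maxv


def is_valid(value: int, type_def) -> bool:
    """For an Integer type, minv/maxv bound the value itself."""
    minv, maxv = bounds(type_def)
    return (minv is None or value >= minv) and (maxv is None or value <= maxv)


print(is_valid(284342400000, DATE_RANGE))   # 1979-01-05T00:00:00Z, inside the range
print(is_valid(1698685605000, DATE_RANGE))  # a 2023 timestamp, outside the range
```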

davaya commented 11 months ago

In our Metaschema review, DW pointed out that there are two epochs, with and without a known timezone (e.g., gmtime and localtime). So as we're developing JADN formats we'll need to be able to specify both the units (seconds, milliseconds, microseconds) and GMT/local epoch.
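One way to picture the two parameters is the sketch below, which takes a unit and a GMT/local flag when interpreting an integer. It is only an assumption about how such a format might behave: `to_datetime`, the unit labels, and the choice to model a "local" epoch as a timezone-naive value are all invented for this example and are not defined by JADN.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical unit labels mapped to microseconds per unit.
MICROS_PER_UNIT = {"s": 1_000_000, "ms": 1_000, "us": 1}


def to_datetime(value: int, unit: str = "ms", utc: bool = True) -> datetime:
    """Interpret an integer count of `unit` since 1970-01-01T00:00:00.

    utc=True  -> epoch has a known timezone (GMT); result is timezone-aware.
    utc=False -> epoch is in an unspecified local zone; result is naive.
    """
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc if utc else None)
    return epoch + timedelta(microseconds=value * MICROS_PER_UNIT[unit])


print(to_datetime(1698685605000, "ms", utc=True))
print(to_datetime(1698685605, "s", utc=False))
```

The same wall-clock fields come out in both cases; the difference is whether the result identifies a single instant or depends on which local zone the reader assumes.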

davaya commented 3 months ago

No changes needed.