datatype requests

djhaynes commented 8 years ago

At the 3/9/16 SACM Virtual Interim Meeting, there was a request for the following features with respect to datatypes in SACM.

Make URI a first class datatype
Make Array and Map primitive datatypes (based on experience with CBOR)

If anyone else has datatype requests for SACM, feel free to add them to this tracker.

mcokus commented 7 years ago

Concerning the URI part of the issue, we may need to discuss the scope. URIs cover more than web addresses (see RFC 3986 - https://datatracker.ietf.org/doc/rfc3986/). Does the datatype need to address all URIs in general or just web addresses? For example, the following are all URIs:

ftp://ftp.is.co.za/rfc/rfc1808.txt
http://www.ietf.org/rfc/rfc2396.txt
ldap://[2001:db8::7]/c=GB?objectClass?one
mailto:John.Doe@example.com
tel:+1-816-555-1212
urn:oasis:names:specification:docbook:dtd:xml:4.1.2
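
To make the scope question concrete, here is a minimal Python sketch that sorts the example URIs above by scheme, using the standard library's `urllib.parse.urlsplit`. The names `WEB_SCHEMES` and `classify` are hypothetical, and treating only `http` and `https` as "web addresses" is an assumption for illustration, not something decided in this thread.

```python
from urllib.parse import urlsplit

# The example URIs quoted in the comment above.
EXAMPLES = [
    "ftp://ftp.is.co.za/rfc/rfc1808.txt",
    "http://www.ietf.org/rfc/rfc2396.txt",
    "ldap://[2001:db8::7]/c=GB?objectClass?one",
    "mailto:John.Doe@example.com",
    "tel:+1-816-555-1212",
    "urn:oasis:names:specification:docbook:dtd:xml:4.1.2",
]

# Assumption: a "web address" is a URI whose scheme is http or https.
WEB_SCHEMES = {"http", "https"}


def classify(uri: str) -> str:
    """Return 'web' for http(s) URIs and 'other' for every other scheme."""
    scheme = urlsplit(uri).scheme
    return "web" if scheme in WEB_SCHEMES else "other"


for uri in EXAMPLES:
    # Print the scheme, the coarse classification, and the URI itself.
    print(f"{urlsplit(uri).scheme:7} {classify(uri):6} {uri}")
```

Under that assumption only one of the six examples counts as a web address, which suggests that a URI datatype restricted to web addresses would exclude most of what RFC 3986 covers.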

mcokus commented 7 years ago

Concerning the Array/Map part of the issue, it's not clear what "based on experience with CBOR" means in the context of an information model, since CBOR is a specific data format. I'm thinking the intention is to describe the datatypes in a way that makes implementation in CBOR easier. If that's the case, then we could pattern Map and Array on JSON, since CBOR can be used to define compact representations of JSON structures.

henkbirkholz commented 7 years ago

"based on experience with CBOR": While CBOR is an encoding/data format, it can be described using the language CDDL; maybe that was implied here? CDDL can already represent both JSON and CBOR structures natively (even though JSON uses a totally different serialization). This is for two primary reasons:

1) CBOR is comparable to JSON because it has a superset of JSON's abilities; it just serializes to a binary format.
2) More importantly, they therefore share a huge portion of the underlying information model. That is why CDDL can represent them both: the language already abstracts (the CBOR-specific prelude excluded, of course) from the actual encoding, which is what an information model usually does.

In any case, the characteristics of arrays and maps are the same in both encodings. If those are meant, I agree that the semantics can probably be derived from the corresponding definitions, as Mike already suggested:

https://tools.ietf.org/html/rfc4627#section-2.2
https://tools.ietf.org/html/rfc4627#section-2.3
(and Major Type 4 & Major Type 5 in https://tools.ietf.org/html/rfc7049)
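
As a rough illustration of how the same array and map structures carry over from JSON to CBOR, the following Python sketch hand-encodes a tiny subset of CBOR: small unsigned integers (major type 0), short text strings (major type 3), arrays (major type 4), and maps (major type 5). The `encode_head` and `encode` functions are hypothetical helpers written for this example, not part of any library, and they deliberately reject anything whose length or value is 24 or more.

```python
import json


def encode_head(major: int, n: int) -> bytes:
    """Build the initial byte: 3-bit major type plus a 5-bit argument (0..23)."""
    if not 0 <= n < 24:
        raise ValueError("this sketch only handles arguments below 24")
    return bytes([(major << 5) | n])


def encode(item) -> bytes:
    """Encode a small subset of JSON-like values as CBOR bytes."""
    if isinstance(item, bool):
        raise TypeError("booleans are outside this sketch")
    if isinstance(item, int):
        # Major type 0: unsigned integer.
        return encode_head(0, item)
    if isinstance(item, str):
        # Major type 3: UTF-8 text string, length in the head.
        data = item.encode("utf-8")
        return encode_head(3, len(data)) + data
    if isinstance(item, list):
        # Major type 4: array, element count in the head.
        return encode_head(4, len(item)) + b"".join(encode(x) for x in item)
    if isinstance(item, dict):
        # Major type 5: map, pair count in the head, then key/value pairs.
        out = encode_head(5, len(item))
        for key, value in item.items():
            out += encode(key) + encode(value)
        return out
    raise TypeError(f"unsupported type: {type(item).__name__}")


value = {"a": 1, "b": [2, 3]}
print(json.dumps(value))      # JSON text form of the structure
print(encode(value).hex())    # a26161016162820203
print(encode([1, 2, 3]).hex())  # 83010203
```

The structure (one map containing one array) is identical in both outputs; only the serialization differs, which is the point made above about a shared information model.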

jimsch commented 7 years ago

I think that this could probably have been written just as easily as "based on experience with JSON". I just read it as we should have these types available because we have found them useful. They are available (I think) as primitives (ordered vs unordered lists?) so I think that it has been satisfied.

henkbirkholz commented 7 years ago

Well, as the definition says: a map is an unordered set of AVPs that allows exactly one instance of each attribute, and so provides unambiguous semantics. An array is an ordered set of values that allows sequences of values of the same type to be combined, but it therefore also allows for ambiguous semantics.

I doubt that the ordered and unordered lists defined by the IM draft are either of those. I am also not sure if we need them, though. In general, it does not hurt to enable the use of JSON, I'd say.

jimsch commented 7 years ago

An unordered list is defined as: here is a set of name and value pairs that can occur in any order in the list. To me this is semantically the same as a MAP.

An ordered list is defined as: here is a set of name and value pairs that can occur in a fixed order in the list. This is basically the same thing as what an array looks like, assuming that you have the ability to deal with elements in the middle that are absent. This could be done by the use of tagging or nil points in the array to say that the field is not there. That seems to me to be the semantic equivalent of an array, but I can see some people might not agree with that.
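
The equivalence argued above can be sketched in Python as follows. This is only an illustration under assumptions: the attribute names in `FIELDS` and the helpers `to_ordered` and `from_ordered` are hypothetical, and `None` stands in for the nil marker that says a field is absent.

```python
# Hypothetical fixed order of attribute names for the ordered list.
FIELDS = ("name", "version", "vendor")


def to_ordered(record: dict) -> list:
    """Positional (array-like) form: absent fields become None placeholders."""
    return [record.get(field) for field in FIELDS]


def from_ordered(values: list) -> dict:
    """Map-like form: drop the None placeholders and restore the names."""
    return {field: v for field, v in zip(FIELDS, values) if v is not None}


# An unordered list of name/value pairs; "version" is absent.
record = {"vendor": "ExampleCorp", "name": "openssl"}

ordered = to_ordered(record)
print(ordered)                # positional form with a gap in the middle
print(from_ordered(ordered))  # back to name/value pairs
```

One limitation of this sketch is that an absent field and a field explicitly set to `None` become indistinguishable, which may be part of why not everyone would accept the equivalence.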

sacm commented 7 years ago

Since this is in regard to an IM, the most generic concept is a collection. Lists, maps, trees, and other data structures are all types of collections.

A collection is a group (not necessarily a set!) of zero or more data items that have some shared semantics, and which need to be operated on together.

A list represents a countable number of values. Lists are typically ordered, but if you want to define ordered and unordered lists, then the key difference is that the former defines a function that controls the occurrence (i.e., order) of each member of the list, whereas the latter either has no such function, or the function is random (I dislike having a random function). Lists typically do NOT have distinct values - both ordered and unordered lists can have duplicates.
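
A small Python sketch of the distinction drawn above, assuming list members are hashable: ordered lists compare position by position, unordered lists compare as multisets (so duplicates still count), and neither behaves like a set. The function names `same_ordered` and `same_unordered` are hypothetical.

```python
from collections import Counter


def same_ordered(a, b) -> bool:
    """Ordered lists are equal only if members match position by position."""
    return list(a) == list(b)


def same_unordered(a, b) -> bool:
    """Unordered lists ignore position but still count duplicates."""
    return Counter(a) == Counter(b)


a = ["tcp", "udp", "tcp"]
b = ["tcp", "tcp", "udp"]
c = ["tcp", "udp"]

print(same_ordered(a, b))    # False: same members, different order
print(same_unordered(a, b))  # True: same members with the same multiplicity
print(same_unordered(a, c))  # False: the duplicate "tcp" matters
print(set(a) == set(c))      # True: a set would lose the duplicate
```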

Lists are typically finite. Streams are infinite lists.

An array is a collection of elements, where each element is selected by one or more keys (indices). Hence, an array is NOT the same as a list.

A dynamic array is an array whose size can change at runtime.

A map (a.k.a. dictionary) is a collection of {key, value} pairs, such that each key appears no more than once in the collection. Hence, it is NOT the same as an array - their behaviors are different, and arrays can be multi-dimensional.

From an IM point-of-view, I would suggest that we use abstract data types to define these more precisely, though this may be too formal for the WG. Basically, an Abstract Data Type is a mathematical model of a data structure where its semantics (i.e., behavior) is defined in terms of possible values and operations on data of this type from the point-of-view of a user of the data. This is contrasted with data structures, which are defined from the point-of-view of the implementor.
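
As a hedged illustration of the abstract data type versus data structure distinction, the Python sketch below states a map only in terms of its operations and one behavioral law (each key appears at most once), then gives two different implementations that both satisfy it. All names here (`MapADT`, `PairListMap`, `DictMap`, `check_map_laws`) are hypothetical and are not taken from the IM draft.

```python
from abc import ABC, abstractmethod


class MapADT(ABC):
    """User's view: the operations a map offers, with no storage details."""

    @abstractmethod
    def put(self, key, value) -> None: ...

    @abstractmethod
    def get(self, key): ...

    @abstractmethod
    def __len__(self) -> int: ...


class PairListMap(MapADT):
    """Implementor's view 1: a plain list of (key, value) pairs."""

    def __init__(self):
        self._pairs = []

    def put(self, key, value) -> None:
        for i, (k, _) in enumerate(self._pairs):
            if k == key:
                self._pairs[i] = (key, value)  # replace, never duplicate a key
                return
        self._pairs.append((key, value))

    def get(self, key):
        for k, v in self._pairs:
            if k == key:
                return v
        raise KeyError(key)

    def __len__(self) -> int:
        return len(self._pairs)


class DictMap(MapADT):
    """Implementor's view 2: a hash table (Python's built-in dict)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value) -> None:
        self._data[key] = value

    def get(self, key):
        return self._data[key]

    def __len__(self) -> int:
        return len(self._data)


def check_map_laws(m: MapADT) -> bool:
    """Law: putting the same key twice leaves one entry holding the last value."""
    m.put("a", 1)
    m.put("a", 2)
    return len(m) == 1 and m.get("a") == 2


print(check_map_laws(PairListMap()), check_map_laws(DictMap()))
```

Both implementations pass the same check even though their internals differ, which is the sense in which an information model could define Map, Array, and List by behavior rather than by encoding.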

HTH, John

jimsch commented 7 years ago

Can you generate a pull request so we can see how this would differ from what is currently in the document?

sacmwg / draft-ietf-sacm-information-model

datatype requests #37