I submitted a couple of issues on the Python STIX2 implementation repo [0][1] which were the result of my misunderstanding the point of default values, specifically when applied to optional values. I had been under the impression that default values were an implementation convenience rather than something belonging to the STIX2 data model itself, and was suggesting that it should be easier to "unset" them when constructing new versions of objects using the Python implementation.
I was being foiled by a custom SCO definition in which I had defined an optional timestamp property that defaulted to the current time. I was unable to make that property go away for the purpose of naive object comparison (those pesky timestamps cause spurious diffs in my use cases) without serialising the object out of STIX into a native mapping object.
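To make the problem concrete, here is a minimal sketch in plain Python. It deliberately does not use the stix2 API; `make_observable`, `fake_now`, `without`, the `x-example-observable` type and the `measured_at` property are all made up for illustration, and the fake clock only exists so the example is repeatable.

```python
import itertools
from datetime import datetime, timedelta, timezone

_ticks = itertools.count()


def fake_now():
    """Hypothetical stand-in clock: advances one second on every call."""
    return datetime(2024, 1, 1, tzinfo=timezone.utc) + timedelta(seconds=next(_ticks))


def make_observable(value, measured_at=None):
    """Build a made-up custom observable as a plain mapping.

    `measured_at` is optional but defaults to "now", so it is always
    present on the resulting object.
    """
    return {
        "type": "x-example-observable",
        "value": value,
        "measured_at": (measured_at or fake_now()).isoformat(),
    }


def without(obj, *names):
    """Copy of a mapping with the named properties dropped."""
    return {key: val for key, val in obj.items() if key not in names}


a = make_observable("10.0.0.1")
b = make_observable("10.0.0.1")

print(a == b)  # False: only the defaulted timestamps differ
print(without(a, "measured_at") == without(b, "measured_at"))  # True
```

The second comparison is the "serialise to a native mapping and drop the property" workaround described above.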
After a bit of discussion, @clenk pointed out the snippet which was added to section 3.6, regarding representations of STIX objects, as a result of #150. This made it clearer to me that default values aren't just an implementation convenience but something more deeply embedded in the STIX data model. I also realised that part of my issue was that I was leaking information about a measuring system into the SCOs, information which wasn't meaningful to most of my observables of this type, and the spurious diffs were a symptom of that problem. I've since adjusted how I define my custom objects to be less pathological and life is good again.
The point of this issue is to see whether it's worth adding some more content to the spec to describe that. It seems to me, after jumping around a bit in the CS03 document, that default values are alluded to and used but not well described. I thought out loud a little in my most recent comment on [0], and in my perfect world I'd like to see some or all of the following added to the spec:
- A subsection in section 3 specifying default values as an explicit part of the data model
  - It could also define presence vs absence of a property, to clarify whether a property with a default is, by definition, always present
  - It should roll up/reference the line added to section 3.6 from #150
- Subsection 2.9 on identifier generation could be updated
  - To describe how required, optional with default, and optional without default properties SHOULD be treated for deterministic ID generation (that entire bit is a SHOULD so...)
  - Maybe also to clarify that absent properties are not to be included in the canonical JSON used as the UUIDv5 name?
    - Current wording: 'The value of the name portion SHOULD be the list of "ID Contributing Properties" (property-name and property value pairs) as defined on each SCO object'
    - The end of that sentence could be reworded along the lines of 'which are present on each SCO object or have default values', capturing both points by excluding absent properties without defaults
- Some extra dot points in 12.1 conformance to capture how producers and consumers should/must treat default values
  - The subsection already specifies interop requirements for required properties
  - Does defining that "if a consumer supports parsing optional properties with defaults, they must be presented as the property value if the property is absent in the representation" make any sense here, or is it better to leave that up to implementations? It's kind of an API thing, and could be solved with implementation-specific documentation.
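The 2.9 suggestion can be sketched roughly as follows. This is only my reading of the proposed rewording, not what the spec or the Python implementation does today. The helper names (`contributing_values`, `deterministic_id`), the `x-example` type and its properties are hypothetical. The `json.dumps` call is a rough stand-in for proper canonical JSON (RFC 8785), and the UUIDv4 fallback is an assumption about how an object with no usable contributing properties should be handled.

```python
import json
import uuid

# Namespace that STIX 2.1 section 2.9 gives for deterministic SCO identifiers.
SCO_NAMESPACE = uuid.UUID("00abedb4-aa42-466c-9c01-fed23315a9b7")

_ABSENT = object()  # sentinel: property is neither set nor defaulted


def contributing_values(obj, id_contributing, defaults):
    """Collect ID contributing properties that are present or have a default.

    Absent properties without a default are left out entirely.
    """
    result = {}
    for name in id_contributing:
        value = obj.get(name, defaults.get(name, _ABSENT))
        if value is not _ABSENT:
            result[name] = value
    return result


def deterministic_id(sco_type, obj, id_contributing, defaults=None):
    """Build a `type--uuid` identifier from the contributing values."""
    values = contributing_values(obj, id_contributing, defaults or {})
    if not values:
        # Assumption: nothing to hash, so fall back to a random UUIDv4.
        return f"{sco_type}--{uuid.uuid4()}"
    # Approximation of canonical JSON: sorted keys, no insignificant whitespace.
    name = json.dumps(values, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return f"{sco_type}--{uuid.uuid5(SCO_NAMESPACE, name)}"


# Hypothetical type: `value` is required, `is_active` is optional with a
# default, `label` is optional with no default.
props = ["value", "is_active", "label"]
defaults = {"is_active": True}

implicit = deterministic_id("x-example", {"value": "a"}, props, defaults)
explicit = deterministic_id("x-example", {"value": "a", "is_active": True}, props, defaults)
print(implicit == explicit)  # True: the default contributes even when omitted
```

Under this reading, a representation that omits a defaulted property and one that spells it out get the same identifier, while an absent property without a default contributes nothing to the name.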
Hopefully this isn't total nonsense :)
===== 8< =====
[0] oasis-open/cti-python-stix2#507
[1] oasis-open/cti-python-stix2#508