Closed jimsch closed 8 years ago
Hi Jim, I am not finding a reference to Provenance in DM-008, so I am not sure which requirement this refers to.
Thanks, Nancy
The reference in item 1 should have been DM-011 not DM-008. Sorry about that.
Ah, yes....I agree! I can remove this one....
Thanks, Nancy.
These are two separate items and both need to be retained; clearly, we do need to clarify them. The requirement for DM-008 is to be able to identify data from a specific provider - for example, to allow a SACM consumer to request all data from Provider A, or data only from Provider A among a set of providers. The requirement in DM-011 is to be able to identify the original source of the data - e.g. endpoint self-reported data vs. data collected from an endpoint by some third party vs. externally-observed data.
Hi Lisa, I like your clarifications and can keep both with your examples for better clarity. But I think I will rename DM-011 from "Provenance" to "Data Source" as I am not sure that we can really provide a full chain of custody (my definition of provenance) of a piece of data as the chain may be very large, but rather to have the provider specify per your example whether it is the originator (self-reported) vs. collected from some other source w/o forcing full chain of custody.
OK?
Nancy.
We still need text to do the clarifications from Lisa.
And I agree that description is source and not provenance.
Thanks, Kathleen
On Thu, May 14, 2015 at 1:17 AM, Jim Schaad notifications@github.com wrote:
We still need text to do the clarifications from Lisa.
Best regards, Kathleen
Hi, I did make changes in -05. What used to be DM-008 was renumbered, and I updated it as:
DM-006 Provider identification: The interfaces and actions in the data model MUST include the ability to identify data from a specific provider. For example, a SACM consumer should be able to request all data to come from a specific provider (e.g. Provider A) as there can be a larger set of providers.
The DM-011 Provenance requirement was renumbered and renamed, and the updated text is now: DM-009 Data source: The data model MUST include the ability for providers to identify the data origin. For example, a provider endpoint could share self-reported data vs. data collected from a different SACM endpoint or by some externally-observed data.
Does that help, or do you need further clarifications?
Thanks, nancy
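As an illustration of the DM-006 text above, provider identification can be sketched as a consumer-side filter. This is a minimal sketch, not anything defined by the draft: the `Record` type, its field names, and the provider identifiers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    provider_id: str  # hypothetical field: the SACM provider that supplied the data
    target: str       # hypothetical field: the endpoint the data describes
    value: str


def from_providers(records, wanted):
    """Return only the records supplied by one of the wanted providers."""
    wanted = set(wanted)
    return [r for r in records if r.provider_id in wanted]


records = [
    Record("provider-a", "host-1", "os=linux"),
    Record("provider-b", "host-1", "os=linux"),
    Record("provider-a", "host-2", "os=bsd"),
]

# A consumer asks for data from Provider A only, out of a larger set of providers.
only_a = from_providers(records, ["provider-a"])
```

The filter is only possible because every record carries its provider identity, which is the capability DM-006 asks the data model to include.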
Suggestion in DM-006- s/(e.g. Provider A)/(e.g. a specific collector or evaluator)/
In DM-011 - Is the data being shared by a provider endpoint, or is the data being shared about the provider endpoint?
For example, an endpoint could share self-reported data, a data collector could share information about the endpoint or externally configured data could be shared about the endpoint.
Hi Jim,
DM-011 is about origination time, so I'm not sure what you mean by your comment. I am thinking it's not the right DM requirement you're referring to?
Thanks, Nancy
For consistency I am still referring to it by the original number. So it is now DM-009.
Thanks. So, the text does refer to the SACM component as being the provider of information; would it help if it was updated to state "For example, a provider endpoint could share self-reported data (e.g. in the case of it also being the target endpoint) vs. data collected from a different SACM endpoint or by some externally-observed data"
To answer your question, the provider could be doing either; but the intent is for the data to be shared by a provider.
The proposed text for DM-009 (was DM-011) is now: "The data model MUST include the ability for providers to identify the data origin. For example, a provider endpoint could share self-reported data (e.g. in the case of it also being the target endpoint) vs. data collected from a different SACM endpoint or by some externally-observed data".
Does the proposed text imply and differentiate the following three (potential) scenarios?
1.) "a provider endpoint could share self-reported data (e.g. in the case of it also being the target endpoint)": I.e. there is a SACM component located on the Target Endpoint that collects data locally and provides that data to other SACM components.
2.) "data collected from a different SACM endpoint": I.e. there is a SACM component located on an endpoint different than the Target Endpoint that collects data about the Target Endpoint via (and this is where maybe clarification is needed) native interfaces/remote API over the network, and provides that data to other SACM components.
3.) "data collected from a different SACM endpoint or by some externally-observed data": I.e. there is a SACM component located on an endpoint different than the Target Endpoint that can observe network behavior about the Target Endpoint, and provides that data to other SACM components.
If so, the scope of "data origin" exceeds "which Endpoint is it coming from". As it is phrased now, the scope of the concept "data origin" seems to include something like "acquisition method" or even "method data is created with". This should be clarified or at the very least be addressed and elaborated in drafts that build on this requirement.
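The point about scope can be made concrete with a small sketch in which "data origin" carries both the reporting endpoint and an acquisition method. The type names, the three enum values, and the `observed_on_network` flag are assumptions made for illustration; none of them come from the draft.

```python
from dataclasses import dataclass
from enum import Enum


class Acquisition(Enum):
    # Hypothetical labels for the three scenarios discussed above.
    SELF_REPORTED = "self-reported"
    REMOTE_COLLECTED = "remote-collected"
    EXTERNALLY_OBSERVED = "externally-observed"


@dataclass(frozen=True)
class DataOrigin:
    reporting_endpoint: str   # where the SACM component that produced the data runs
    target_endpoint: str      # the endpoint the data is about
    acquisition: Acquisition  # how the data was obtained


def classify(reporting_endpoint, target_endpoint, observed_on_network):
    """Map a report onto one of the three scenarios."""
    if reporting_endpoint == target_endpoint:
        method = Acquisition.SELF_REPORTED
    elif observed_on_network:
        method = Acquisition.EXTERNALLY_OBSERVED
    else:
        method = Acquisition.REMOTE_COLLECTED
    return DataOrigin(reporting_endpoint, target_endpoint, method)
```

Scenarios 2 and 3 share the same pair of endpoints and differ only in the acquisition field, which is why "which endpoint is it coming from" alone does not separate them.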
In any case, there seems to be a strong relationship between this proposed text and Issue https://github.com/sacmwg/draft-ietf-sacm-terminology/issues/11 ("categories of Endpoint Attributes that depend on their provenance/origin") - which is good!
In the context of this draft, terms such as "self-reported" or "provider endpoint" could be deictic and are probably not completely self-explanatory. An option to remedy this without making the text in the requirements draft more complicated could be to include them in the terminology draft (update -07) and then map them to more precise definitions that use standardized (and agreed upon) terms.
I am not overly happy with the "For example..." text. Part of the problem is the way that 'endpoint' is used as a term within SACM. Clearer text to me would be:
For example, a provider needs to differentiate between data that is self-reported vs data that is collected by a different SACM endpoint by tagging the identity of the SACM endpoint that reported the data. The origin of the data may change the way that data is versioned on the provider, as the data reported by different entities can be different.
I find the use of 'endpoint', while technically correct, in conjunction with provider to be confusing. The reason is that I generally think of endpoints as what data is being collected about rather than being the collection of everything that is on the system (and perhaps even better than that). Note that 'provider' is in the terminology draft.
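One way to read the suggested example text is that a provider keeps a separate version history per reporting entity. The following is a rough sketch under that assumption; the class, its method names, and the sample values are hypothetical and not part of any SACM draft.

```python
from collections import defaultdict


class OriginTaggedStore:
    """Keeps one version history per (target, attribute, reporter) triple."""

    def __init__(self):
        self._history = defaultdict(list)

    def report(self, target, attribute, reporter, value):
        """Record a value tagged with its reporter; return that reporter's version count."""
        versions = self._history[(target, attribute, reporter)]
        # Only a changed value creates a new version for this reporter.
        if not versions or versions[-1] != value:
            versions.append(value)
        return len(versions)

    def latest_by_reporter(self, target, attribute):
        """Return the newest value from each reporter for one attribute of one target."""
        return {
            rep: vals[-1]
            for (t, a, rep), vals in self._history.items()
            if t == target and a == attribute
        }


store = OriginTaggedStore()
store.report("host-1", "os", "host-1", "linux 4.1")       # self-reported
store.report("host-1", "os", "collector-1", "linux 4.0")  # collected by another SACM endpoint
store.report("host-1", "os", "host-1", "linux 4.2")       # second self-reported version
```

Because the reporter is part of the key, conflicting values from different entities sit side by side instead of overwriting each other.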
Discussed at 6/29 virtual interim, done in -07
Is there a reason that the "For Example" text was deleted rather than modified? s/(provider/(i.e. the provider/
Not deleted, modified. DM-006 still has an example, it just doesn't say "for example" in front of it.
wfm
Version -04