Define a Provenance/Chain of Custody Information Module

jimsch commented 9 years ago

There needs to be a common information module, which other modules can include, that deals with the issues of provenance and chain of custody.

This module should address the issues raised in the requirements draft by DM-008, DM-011, DM-014, DM-015, and DM-016.

djhaynes commented 7 years ago

Hi Jim,

Do you think the SACM Statement Metadata, Content-Element Metadata, and Event elements support this? This is with the understanding that we still need to pick the mandatory-to-implement elements that go in these constructs, but that an implementer can extend them to support additional elements.

Thanks,

Danny

jimsch commented 7 years ago

Probably, but insufficient information at present.

djhaynes commented 7 years ago

What information in particular are you looking for? Is it the mandatory-to-implement elements that will be part of the SACM Statement Metadata and Content-Element Metadata elements?

jimsch commented 7 years ago

I am not sure how those statements are put together, what elements are needed to properly convey the needed provenance, or how they are composed when a second party uses elements to build a new element (are they composed?). So: 1) what the templates really look like, 2) what elements are used for provenance, 3) what cryptographic elements are required (if any), and 4) potentially some processing rules for good/badness (not sure on that), as this is part of chain of custody. Some of this might get odd, since it would imply either a canonical data model for streaming or some hand waving.

adammontville commented 7 years ago

At this point, how important are provenance and chain of custody to our efforts?

david-waltermire commented 7 years ago

My feeling, and I believe this has been echoed by the WG in the past, is that we need to either: 1) provide extension points for addressing provenance and chain of custody, allowing these aspects to be deferred; or 2) adopt an existing solution that addresses these issues. The key is not to get bogged down in addressing provenance and chain of custody. Both paths allow us to focus on our actual problem domain without spending a bunch of time on these important but non-core aspects.

From: adammontville. Sent: Sunday, November 06, 2016 9:19 AM. Subject: Re: [sacmwg/draft-ietf-sacm-information-model] Define a Provenance/Chain of Custody Information Module (#8)

djhaynes commented 7 years ago

Dave, I think you are correct. However, I would just add that I think the WG does want to provide the information that enables provenance/chain-of-custody extensions/solutions.

jimsch commented 7 years ago

I am not sure that I would be worried about the concept of chain-of-custody. I don't know that it makes a great deal of difference to say that this attribute went from point A to point B to point C. The exception would be if one was wondering why something did or did not show up in a reasonable time.

I think that provenance is of much greater importance. Consider the reporting of a binding between an IP address and a DNS name. The following people could report it: The DHCP server that did the assignment, the machine to which the address was reported or somebody who queried the DNS name at some location. While the first two need to match, the last one does not need to and should not be treated with the same level of correctness depending on where that entity is relative to the subnet that contains the machine with the DNS name (think going through a NAT). The fact that different people are reporting the same information needs to be clearly attached to the data.
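As a rough sketch of the point above, each reported binding could carry a record of who reported it and in what capacity, so that a consumer only expects agreement among reporters that should agree. The `ReporterRole` values, the `BindingReport` class, and the `authoritative_conflicts` helper are hypothetical names used for illustration; they are not SACM information elements.

```python
from dataclasses import dataclass
from enum import Enum


class ReporterRole(Enum):
    # Hypothetical roles taken from the example above; not SACM-defined values.
    DHCP_SERVER = "dhcp-server"    # did the address assignment
    ENDPOINT = "endpoint"          # the machine holding the address
    DNS_OBSERVER = "dns-observer"  # queried the DNS name from some location


@dataclass(frozen=True)
class BindingReport:
    dns_name: str
    ip_address: str
    reporter: str       # identifier of whoever reported the binding
    role: ReporterRole  # provenance: in what capacity the reporter knows it


# Roles whose reports are expected to match each other.
AUTHORITATIVE = {ReporterRole.DHCP_SERVER, ReporterRole.ENDPOINT}


def authoritative_conflicts(reports):
    """Return DNS names for which authoritative reporters disagree on the IP.

    Observer reports are ignored here: a different answer from an observer
    (for example, one on the far side of a NAT) is not necessarily an error.
    """
    seen = {}
    for r in reports:
        if r.role in AUTHORITATIVE:
            seen.setdefault(r.dns_name, set()).add(r.ip_address)
    return sorted(name for name, ips in seen.items() if len(ips) > 1)


reports = [
    BindingReport("host.example.com", "10.0.0.5", "dhcp1", ReporterRole.DHCP_SERVER),
    BindingReport("host.example.com", "10.0.0.5", "host", ReporterRole.ENDPOINT),
    # An observer outside the NAT sees a different address for the same name.
    BindingReport("host.example.com", "203.0.113.7", "probe", ReporterRole.DNS_OBSERVER),
]
print(authoritative_conflicts(reports))
```

The observer's differing answer is kept alongside the others rather than discarded; it is simply not held to the same level of correctness.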

sacm commented 7 years ago

Hi,

Thanks Jim. The NAT example is a perfect illustration of the fact that even DNS-to-IP binding is based on one's network/subnet perspective in the real world and not the invariant that non-administrators often assume.

Cheers,

Ira


(In reply to Jim Schaad's comment of Mon, Nov 7, 2016 at 4:39 PM.)

djhaynes commented 7 years ago

Can we close out this issue since it sounds like we are in agreement that we do not need to define an explicit mechanism for provenance and chain-of-custody and because we have a mechanism to add any metadata IEs that we may need to the IM?

adammontville commented 7 years ago

+1

jimsch commented 7 years ago

If you want to close it, that is fine. However, what I asked for was for the elements to be defined, which apparently is not going to be done.

djhaynes commented 7 years ago

Hi Jim,

Looking at your original post, you mention a few requirements (assuming requirements -04 given the date of the post). I think the current IM captures at least some of the IEs for these requirements (if not completely).

• DM-008: ...MUST include the ability to identify data from a specific provider.
  o dataSource (https://tools.ietf.org/html/draft-ietf-sacm-information-model-08#section-7.32)
  o targetEndpoint (https://tools.ietf.org/html/draft-ietf-sacm-information-model-08#section-7.5)
  o dataOrigin (https://tools.ietf.org/html/draft-ietf-sacm-information-model-08#section-7.31)
  o statementType (https://tools.ietf.org/html/draft-ietf-sacm-information-model-08#section-7.31)
  o contentType (https://tools.ietf.org/html/draft-ietf-sacm-information-model-08#section-7.126). Currently, it says “type”. I think this needs to be updated to "contentType".

• DM-011: ...MUST include the ability for providers to identify the data origin and provide a method for provenance information to be captured and communicated.
  o dataOrigin (https://tools.ietf.org/html/draft-ietf-sacm-information-model-08#section-7.31)
  o Regarding more generic provenance constructs, implementers are free to define new IEs that capture their representation of what provenance means.

• DM-014: ...SHOULD allow the provider to include the information's origination time.
  o creationTimestamp (https://tools.ietf.org/html/draft-ietf-sacm-information-model-08#section-7.120)

• DM-015: ...SHOULD allow the provider to include attributes defining how the data was generated (e.g. self-reported, reported by aggregator, scan result, etc.).
  o collectionTaskType (https://tools.ietf.org/html/draft-ietf-sacm-information-model-08#section-7.27)

• DM-016: ...SHOULD allow the provider to include attributes defining the location of the data source.
  o dataSource (https://tools.ietf.org/html/draft-ietf-sacm-information-model-08#section-7.32)
  o targetEndpoint (https://tools.ietf.org/html/draft-ietf-sacm-information-model-08#section-7.5)
  o locationName (https://tools.ietf.org/html/draft-ietf-sacm-information-model-08#section-7.52)

Do these IEs align with your interpretations of the requirements?

Lastly, additional IEs can be included, as necessary, to satisfy the needs of the WG. If there are major gaps, please propose them and they can be incorporated into the IM.
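To make the mapping above a little more concrete, here is a minimal sketch of how those IEs might travel together as one metadata record, with room for implementer-defined additions. The `ProvenanceMetadata` class, its field types, the `extensions` dictionary, and the `missing_recommended` helper are assumptions made for illustration; the IM does not define this structure.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List, Optional


@dataclass
class ProvenanceMetadata:
    # Field names mirror the IE names listed above; the types are assumptions.
    data_source: str                                # dataSource (DM-008, DM-016)
    data_origin: str                                # dataOrigin (DM-008, DM-011)
    creation_timestamp: Optional[datetime] = None   # creationTimestamp (DM-014)
    collection_task_type: Optional[str] = None      # collectionTaskType (DM-015)
    location_name: Optional[str] = None             # locationName (DM-016)
    # Hypothetical slot for implementer-defined IEs, such as a richer
    # provenance representation.
    extensions: Dict[str, Any] = field(default_factory=dict)

    def missing_recommended(self) -> List[str]:
        """Return the IE names tied to SHOULD-level requirements that are unset."""
        recommended = {
            "creationTimestamp": self.creation_timestamp,
            "collectionTaskType": self.collection_task_type,
            "locationName": self.location_name,
        }
        return [name for name, value in recommended.items() if value is None]


# Example values are made up for illustration.
meta = ProvenanceMetadata(
    data_source="collector-01",
    data_origin="endpoint-42",
    collection_task_type="self-reported",
)
print(meta.missing_recommended())
```

The two fields without defaults correspond to the MUST-level requirements (DM-008, DM-011), while the optional ones correspond to the SHOULD-level requirements.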

Thanks,

Danny

jimsch commented 7 years ago

From that, it looks like everything except the crypto aspects is present. However, I am not the person to say what does and does not need to be present, or what does and does not need to be documented as a common IM understanding of how this information is set up.
