CycloneDX / specification

OWASP CycloneDX is a full-stack Bill of Materials (BOM) standard that provides advanced supply chain capabilities for cyber risk reduction. SBOM, SaaSBOM, HBOM, AI/ML-BOM, CBOM, OBOM, MBOM, VDR, and VEX
https://cyclonedx.org/
Apache License 2.0

RFC: known unknowns #48

Closed coderpatros closed 3 years ago

coderpatros commented 3 years ago

Some possible scenarios:

Feel free to add more

coderpatros commented 3 years ago

Maybe something like this to indicate that we know there are unknown dependencies in acme-app?

        <dependency ref="acme-app" complete="false">
            <dependency ref="pkg:maven/org.acme/web-framework@1.0.0"/>
            <dependency ref="pkg:maven/org.acme/persistence@3.1.0"/>
        </dependency>

coderpatros commented 3 years ago

The "I have no idea" scenario would be...

        <dependency ref="acme-app" complete="false"/>

And no dependencies...

        <dependency ref="acme-app" />

Or should we go the other way, if complete is omitted it is assumed to be incomplete? That's probably a safer stance.

stevespringett commented 3 years ago

I think there are a minimum of three states we want to capture: COMPLETE, INCOMPLETE, and UNKNOWN.

We might want a fourth state called NOT_SPECIFIED.

If we go with the first three, then the spec should likely default to UNKNOWN. If we include the fourth state, then the spec should likely default to NOT_SPECIFIED.
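As a rough sketch of the four-state variant described above, a consumer could model the states as an enum and fall back to NOT_SPECIFIED when nothing is stated. The names `Completeness` and `parse_completeness` are hypothetical and only illustrate the defaulting rule; they are not part of any CycloneDX schema or library.

```python
from enum import Enum
from typing import Optional


class Completeness(Enum):
    """Hypothetical model of the four states floated in this thread."""
    COMPLETE = "COMPLETE"
    INCOMPLETE = "INCOMPLETE"
    UNKNOWN = "UNKNOWN"
    NOT_SPECIFIED = "NOT_SPECIFIED"


def parse_completeness(raw: Optional[str]) -> Completeness:
    """Map an optional attribute value to a state.

    A missing value falls back to NOT_SPECIFIED, the default suggested
    above if the fourth state is adopted. Unrecognised values raise
    ValueError instead of being guessed at.
    """
    if raw is None:
        return Completeness.NOT_SPECIFIED
    return Completeness(raw.strip().upper())
```

With only the first three states, the same function would return `Completeness.UNKNOWN` for a missing value instead.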

The following is already required in order to state a component has no dependencies. That's already in the spec.

<dependency ref="acme-app" />

The capability this ticket describes overlaps a bit with #35, but I think it's extremely important to capture this information in the core spec.

I think there is another use case we need to consider for this ticket however, and that's where a component includes another component (assembly). Again, we would default to UNKNOWN or NOT_SPECIFIED depending on what states we adopt.

Examples of the assembly use case are:

<component>
  ...
  <!-- defaults to UNKNOWN or NOT_SPECIFIED -->
</component>

The following is asserting that the component does not include other components

<component>
  ...
  <components completeness="COMPLETE"/>
</component>

The following is asserting that the component may include other components other than the ones specified

<component>
  ...
  <components completeness="UNKNOWN">
    <component>
      ...
    </component>
  </components>
</component>

I'm not a huge fan of the word complete or completeness, and introducing assertion vocabulary will potentially conflict with some of the functionality we'll introduce in #35.

Looking for suggestions on alternatives. If there are none, that's fine too. Just voicing personal preference.

Perhaps @christophergates has an opinion on this ticket?

coderpatros commented 3 years ago

Do we need something at the component level too?

As in, we've manually created this SBOM for an old device/application. And we think we have component x v1.2, but we're not entirely sure of the version/supplier/license/etc.

I like having a NOT_SPECIFIED default in addition to UNKNOWN. There's a subtle difference between the two, especially when you are describing an opaque component and really have no idea what's in it.

Maybe PARTIAL and ALL as alternatives?

stevespringett commented 3 years ago

As in, we've manually created this SBOM for an old device/application. And we think we have component x v1.2, but we're not entirely sure of the version/supplier/license/etc.

Yes, I think we need that too. But it's not completeness; this case would be accuracy or a similar word. Is there a way to roll these two very similar concepts together?

stevespringett commented 3 years ago

One idea is to use assert to describe both.

If assert appears on a plural array (e.g. components, licenses, services), then use the enum for completeness.

If assert appears on an object (component, license, service, etc), then use the enum for accuracy.

I don't think it would be possible to assert an individual field without breaking backward compatibility, or relying on pre-defined formatted strings, which is what SPDX does. We need to avoid that approach at all costs as validation becomes much more difficult. Besides, I think #35 would provide better value anyway instead of mere field-level assertions.

We could also use individual assert enums such as:

Question... What is the best way to describe ranges of accuracy?

Some ideas:

stevespringett commented 3 years ago

Because we're getting into the accuracy conversation, I'm wondering if fully supporting known-unknowns (both for completeness and accuracy) should be in the core spec or if there should be an assertion extension?

If we decide to keep it in the core spec, how do we support it while remaining true to the guiding principles?

stevespringett commented 3 years ago

BTW, I found https://github.com/spdx/spdx-spec/issues/137 and SPDX is only looking at the known-unknowns from a very narrow lens. We can do better.

coderpatros commented 3 years ago

Maybe we should break this out to a separate issue. I think it’s sufficiently different to say we have component X and we can’t be sure about what’s in it vs we think we have component X, etc.

christophergates commented 3 years ago

Yeah I agree. This is all about communicating more information to the consumer who can then decide on how much risk they want to tolerate.

Christopher Gates


Director of Product Security

www.velentium.com

(805)750-0171

520 Courtney Way Suite 110

Lafayette CO. 80026

(GMT-7)

Our new book is now shipping:

Medical Device Cybersecurity for Engineers and Manufacturers

U.S. https://us.artechhouse.com/Medical-Device-Cybersecurity-A-Guide-for-Engineers-and-Manufacturers-P2128.aspx | Worldwide https://uk.artechhouse.com/Medical-Device-Cybersecurity-A-Guide-for-Engineers-and-Manufacturers-P2073.aspx

Amazon https://www.amazon.com/Medical-Device-Cybersecurity-Engineers-Manufacturers/dp/1630818151/ref=sr_1_1?dchild=1&keywords=Axel+Wirth&qid=1592335625&sr=8-1 & Digital https://us.artechhouse.com/Medical-Device-Cybersecurity-for-Engineers-and-Manufacturers-P2174.aspx

Security Book Of The Year! https://engineering.tapad.com/the-best-information-security-books-of-2020-e7430444fbd4

“If everyone is thinking alike, then somebody isn't thinking.” -George S. Patton

"Facts are stubborn things." -John Adams, 1770


stevespringett commented 3 years ago

The Core team met and made a decision that this ticket will be explicitly for support of known-unknowns with regard to relationships.

Known-unknowns in terms of what the attributes of a component are (name, version, supplier, etc) will be handled through the audit extension which will support evidence. This approach aligns with the CycloneDX guiding principles.

Now that the scope of the ticket is strictly about relationships, there are still a few outstanding questions...

The enums proposed in https://github.com/CycloneDX/specification/issues/48#issuecomment-798980798 are still applicable. They are:

Do we want to address:

If we want to address these things, then one proposal would be to support two variations of INCOMPLETE:

stevespringett commented 3 years ago

There's also a corner case to this... A BOM that has UNKNOWN completeness and also is only including third-party or is redacting. So in a few corner cases, two or more enum values could potentially be used.

stevespringett commented 3 years ago

Question: If completeness is specified on an array of components, services, or dependencies, completeness can either:

A) Apply to the current branch of the tree only

B) Cascade to all child branches

Or, CASCADE could be an additional property with a boolean value.

My opinion would be to have it apply only to the current branch (option A) unless CASCADE is true. Thoughts?

coderpatros commented 3 years ago

I vote for A over B. I think in a lot of cases people will know that the top level is complete but can’t be sure about anything past that.

And adding an optional CASCADE property I think is a premature optimisation.

stevespringett commented 3 years ago

Option A it is. I did not implement cascade, and the implementation documentation clearly states that completeness does not cascade.
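To make the non-cascading rule concrete, here is a minimal sketch. The `Node` type, the `completeness_of` helper, and the `"unknown"` default for nodes with no assertion are all assumptions made for illustration; none of them come from the spec.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical component node; not a CycloneDX type."""
    ref: str
    completeness: str = "unknown"  # assumed default when nothing is asserted
    children: List["Node"] = field(default_factory=list)


def completeness_of(root: Node, ref: str) -> str:
    """Return the completeness asserted on the node with the given ref.

    Option A: a parent's assertion is never inherited, so a child with
    no assertion of its own reports the default, not the parent's value.
    """
    stack = [root]
    while stack:
        node = stack.pop()
        if node.ref == ref:
            return node.completeness
        stack.extend(node.children)
    raise KeyError(ref)


# The top level is asserted complete; nothing is asserted about the child.
app = Node(
    "acme-app",
    "complete",
    [Node("pkg:maven/org.acme/web-framework@1.0.0")],
)
```

Here `completeness_of(app, "acme-app")` yields the parent's own assertion, while the lookup for the web-framework child yields the default, because the parent's value is not propagated.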

The proposed implementation, which is consistent across XML and JSON, adds a new bom node called compositions.

Compositions describe constituent parts (including components, services, and dependency relationships) and their completeness.

<compositions>
    <composition>
        <aggregate>complete</aggregate>
        <assemblies>
            <assembly ref="pkg:maven/partner/shaded-library@1.0"/>
        </assemblies>
        <dependencies>
            <dependency ref="acme-application-1.0"/>
        </dependencies>
    </composition>
    <composition>
        <aggregate>unknown</aggregate>
        <assemblies>
            <assembly ref="pkg:maven/acme/library@3.0"/>
        </assemblies>
    </composition>
</compositions>

"compositions": [
    {
        "aggregate": "complete",
        "assemblies": [
            "pkg:maven/partner/shaded-library@1.0"
        ],
        "dependencies": [
            "acme-application-1.0"
        ]
    },
    {
        "aggregate": "unknown",
        "assemblies": [
            "pkg:maven/acme/library@3.0"
        ]
    }
]
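
A consumer could read the JSON form above roughly as follows. This is a sketch only: `index_aggregates` is a hypothetical helper, and the fragment is wrapped in an enclosing object so that it parses as standalone JSON.

```python
import json
from typing import Dict

# The JSON example from this comment, wrapped in braces so it parses alone.
BOM_FRAGMENT = """
{
  "compositions": [
    {
      "aggregate": "complete",
      "assemblies": ["pkg:maven/partner/shaded-library@1.0"],
      "dependencies": ["acme-application-1.0"]
    },
    {
      "aggregate": "unknown",
      "assemblies": ["pkg:maven/acme/library@3.0"]
    }
  ]
}
"""


def index_aggregates(bom: dict, kind: str) -> Dict[str, str]:
    """Map each ref listed under `kind` ("assemblies" or "dependencies")
    to the aggregate value of the composition that lists it."""
    index: Dict[str, str] = {}
    for composition in bom.get("compositions", []):
        for ref in composition.get(kind, []):
            index[ref] = composition["aggregate"]
    return index


bom = json.loads(BOM_FRAGMENT)
assemblies = index_aggregates(bom, "assemblies")
dependencies = index_aggregates(bom, "dependencies")
```

A ref that appears in neither index has no completeness asserted for it. How a consumer treats that case is left open here.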

Possible aggregate values are:

coderpatros commented 3 years ago

I was thinking this would have been added as an attribute to components and dependencies elements.

They're both arrays in JSON which is problematic. Is this what you meant about challenges keeping the approach consistent between XML and JSON formats?

stevespringett commented 3 years ago

I was thinking this would have been added as an attribute to components and dependencies elements.

That's what I tried to do. It did not work out so well. Everything centers around limitations of JSON (the format) and JSON Schema.

In the components array - this is an array of objects and JSON allows having multiple different types of objects in the array. When validation errors occur, it's difficult to determine why something failed validation if the schema specifies multiple object types, but it is possible. What is not possible is to say that I can have 0..n components followed by 0..1 composition (or whatever we call the object). This is not supported by JSON Schema draft 07 (the latest stable version), but is supported by newer drafts via 'maxContains'. Unfortunately, the newer JSON Schema versions have very limited tooling support. There's a high probability that, if we adopt a newer JSON Schema version, a large percentage of adopters will not be able to use it.

For the dependencies array - this is an array of strings, so it's not possible to include objects in the array like I could with the component array. We would have to change dependencies to support the existing array of strings OR be an array of objects, and both would be valid. This would impact all of the implementations as they would need to support both - but only for JSON.
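
To illustrate the burden this would place on JSON implementations, here is a sketch of the normalisation each consumer would need if the array accepted both strings and objects. The object shape, with the reference stored under a `"ref"` key, is an assumption made for this example; the thread does not define one.

```python
from typing import Any, Dict, List


def normalize_dependencies(entries: List[Any]) -> List[Dict[str, Any]]:
    """Coerce a mixed array into a list of objects with a "ref" key.

    Hypothetical: assumes the object form would carry the reference
    under "ref"; anything else is rejected rather than guessed at.
    """
    normalized = []
    for entry in entries:
        if isinstance(entry, str):
            # Existing form: a bare reference string.
            normalized.append({"ref": entry})
        elif isinstance(entry, dict) and "ref" in entry:
            # Assumed new form: an object that may carry extra fields.
            normalized.append(dict(entry))
        else:
            raise ValueError(f"unsupported dependency entry: {entry!r}")
    return normalized
```

Every JSON parser, serializer, and validator would need equivalent branching, while the XML side would not.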

I'm continuously surprised by the lack of maturity of JSON and JSON Schema given how old they are. These limitations are what led me to make compositions a bom level node rather than having composition information in the components or dependencies themselves. I could not find a way to incorporate that information in a consistent and usable way.

christophergates commented 3 years ago

my nice late response... yes I concur with A


coderpatros commented 3 years ago

Continuing a point raised in the IWG. Should we include INCOMPLETE_REDACTED?

What value does it provide to someone consuming a BOM over one of the other incomplete options?

Once it's in it will be hard to remove. But if we leave it out it will be easy to add.

I'm just on the fence.

stevespringett commented 3 years ago

Same. On the fence. I think this is another legitimate use case that has the potential to be misused.

christophergates commented 3 years ago

What value does it provide to someone consuming a BOM over one of the other incomplete options?

Exactly, this does not convey any new information to the consumer except that they are never going to give you any more information about it, which is rather snarky!


stevespringett commented 3 years ago

Sounds like this is a 'no' vote from Chris.

stevespringett commented 3 years ago

Did anyone else in the IWG vote 'no' to this? I cannot recall...

christophergates commented 3 years ago

yes... it is a "no" vote 😃


coderpatros commented 3 years ago

@JNHQ was against it.

But I don't think there is any compelling support for it. So I vote to drop it too.

stevespringett commented 3 years ago

Done. Redacted has been removed from the 1.3 spec.