CycloneDX / cyclonedx-rust-cargo

Creates CycloneDX Software Bill of Materials (SBOM) from Rust (Cargo) projects
https://cyclonedx.org/
Apache License 2.0
96 stars 43 forks source link

XML deserialization of empty tags is incorrect #738

Closed jcreekmore closed 1 month ago

jcreekmore commented 2 months ago

As a working example, I am going to reference the following area of code, but I believe it is possible to run into this in any area that the library is parsing an Option<String> field: https://github.com/CycloneDX/cyclonedx-rust-cargo/blob/2911287b2520a7ddab1782b48c35112279b1be17/cyclonedx-bom/src/specs/common/component.rs#L513-L517

Setting context, I was attempting to swap out an ad-hoc CycloneDX parser with cyclonedx-bom. One of the tests for my ad-hoc parser was to ensure that the keycloak BOM found here loads: https://github.com/CycloneDX/bom-examples/blob/c0436d86cd60693f01d19fe1aacfd01e70e17036/SBOM/keycloak-10.0.2/bom.xml. Now, it obviously doesn't because it is a v1.2 spec and cyclonedx-bom only supports v1.3-v1.5. But, looking through the spec and looking at the BOM, it seemed like it should load as a v1.3, so I changed the version locally. That seemed a reasonable test given that a diff between https://github.com/CycloneDX/bom-examples/blob/c0436d86cd60693f01d19fe1aacfd01e70e17036/SBOM/laravel-7.12.0/bom.1.2.xml and https://github.com/CycloneDX/bom-examples/blob/c0436d86cd60693f01d19fe1aacfd01e70e17036/SBOM/laravel-7.12.0/bom.1.3.xml only added some properties and changed the spec version.

However, when I attempted to load the modified keycloak BOM, I got the following error:

    Got unexpected XML element when reading {http://cyclonedx.org/schema/bom/1.3}description: Got unexpected element EndElement({http://cyclonedx.org/schema/bom/1.3}description)

After digging into the issue, what I discovered was that the keycloak BOM has several empty description tags in it:

<description />

A quick modification of the laravel-7.12.0/bom.1.3.xml file to modify one of the description tags the same way caused the same error to occur. I also attempted to just modify one of the description like the following and received the same error:

<description></description>

The v1.3 (XML) and v1.2 (XML) specs both define the description field as <bom:description> xs:normalizedString </bom:description> [0..1] and an xs:normalizedString can be empty.

Again, there are multiple places just in the Component::read_xml_element where this could be an issue. My guess is that, since the pom.xml file that is used to generate the CycloneDX Maven BOMs can have empty strings for the description, that is why it is getting forwarded on to the BOM that way instead of stripping out the description tag completely.

I could make an argument for handling it either by making an empty string become a None for the Option<String> OR by literally making it an Some(String::new()). I think that the latter is closer to the behavior that would happen with the JSON implementation of the spec, since an equivalent there would be:

"description": ""

and that would be parsed as Some(String::new()) by serde.

Shnatsel commented 1 month ago

Thank you for reporting this!

Treating this as Some(String::new()) for consistency with JSON sounds good to me. I'm not familiar with XML parsing myself, but I'd be happy to merge a PR with this change.

justahero commented 1 month ago

Hello,

I asked on the CycloneDX Slack to clarify how to handle the description tag. It's not totally clear to me if <description /> is a valid entry or if it needs to be either absent or contain text. I'm uncertain, because the XML sample files in the specification repository don't contain any empty <description /> tag. We used these sample files as test data for this repository, but they may not cover all cases.

justahero commented 1 month ago

Update: The answer on Slack was the <specification /> tag is valid, it should result in an empty string. The current XML parse logic is therefore wrong, at least for these types of fields.