SoftwareDesignLab / SBOM-in-a-Box

SBOM-in-a-Box is a unified platform to promote the production, consumption, and utilization of Software Bills of Materials.
MIT License

SPDX Deserializer Parses CDX XML #126

Closed dlg1206 closed 11 months ago

dlg1206 commented 1 year ago

Overview

The SPDX deserializer parses a CDX XML file, and most likely a CDX JSON file as well, instead of rejecting it. There need to be preventive measures to stop this misuse.

Test file: CDXMavenPlugin_build_cdx.xml

Resources

Acceptance Criteria

ian1dunn commented 1 year ago

Does this use the SerializerFactory to construct an automatically assigned deserializer? The method SerializerFactory.createDeserializer() should handle this, although it may also be causing the problem since we don't have a 100% accurate way to detect schema & format based on file contents alone.

Excerpt from createDeserializer()

/**
 * Create a Deserializer for an SBOM file by auto-detecting its schema and format using its contents.
 *
 * @param fileContents The contents of the SBOM file to deserialize.
 * @return A Deserializer to deserialize the SBOM file.
 * @throws IllegalArgumentException If a schema and/or format cannot be determined.
 */
public static Deserializer createDeserializer(String fileContents) throws IllegalArgumentException {
    // TODO figure out a 100% correct way of determining file schema and format, this was my quick and dirty soln

    // Defaults to CDX14JSONDeserializer
    Schema schema = CDX14;
    Format format = JSON;

    if (fileContents.contains("SPDX")) schema = SPDX23;
    else if (fileContents.contains("rootComponent")) schema = SVIP; // Field unique to SVIP SBOM

    if (fileContents.contains("DocumentName:")) format = TAGVALUE;

    // TODO what if we still have an incorrect deserializer?
    return schema.getDeserializer(format);
}
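
One possible direction for the TODOs above, as a rough sketch only: decide the format first from the first non-whitespace character, then look for a schema marker that only makes sense in that format. This avoids matching a loose "SPDX" substring, which also shows up in CDX files (for example in license IDs). `SbomSniffer`, `Detection`, and the `XML` format constant are hypothetical names, not part of SBOM-in-a-Box; the markers assumed are `"bomFormat"` for CDX JSON, the `cyclonedx.org/schema/bom` namespace for CDX XML, `"spdxVersion"` for SPDX JSON, and an `SPDXVersion:` line for SPDX tag-value. `"rootComponent"` is carried over from the excerpt. The sketch also maps any version to `CDX14` / `SPDX23` for simplicity.

```java
import java.util.regex.Pattern;

/** Hypothetical helper (not part of SBOM-in-a-Box): detects format first, then schema. */
final class SbomSniffer {
    enum Schema { CDX14, SPDX23, SVIP }
    enum Format { JSON, XML, TAGVALUE }

    /** Simple immutable pair of the detected schema and format. */
    static final class Detection {
        final Schema schema;
        final Format format;
        Detection(Schema schema, Format format) { this.schema = schema; this.format = format; }
    }

    // SPDX tag-value files carry a line such as "SPDXVersion: SPDX-2.3".
    private static final Pattern SPDX_TAG_LINE = Pattern.compile("(?m)^\\s*SPDXVersion\\s*:");

    static Detection detect(String fileContents) {
        String s = fileContents.trim();

        // Step 1: the first non-whitespace character decides the format.
        if (s.startsWith("{")) {
            // Step 2 (JSON): match quoted key names rather than loose words.
            if (s.contains("\"rootComponent\"")) return new Detection(Schema.SVIP, Format.JSON);
            if (s.contains("\"bomFormat\"")) return new Detection(Schema.CDX14, Format.JSON);
            if (s.contains("\"spdxVersion\"")) return new Detection(Schema.SPDX23, Format.JSON);
        } else if (s.startsWith("<")) {
            // Step 2 (XML): CycloneDX XML declares a cyclonedx.org namespace on <bom>.
            if (s.contains("cyclonedx.org/schema/bom")) return new Detection(Schema.CDX14, Format.XML);
        } else if (SPDX_TAG_LINE.matcher(s).find()) {
            // Neither JSON nor XML, but it has an SPDXVersion line: treat as tag-value.
            return new Detection(Schema.SPDX23, Format.TAGVALUE);
        }

        // Fail loudly instead of silently defaulting to some deserializer.
        throw new IllegalArgumentException("Could not determine SBOM schema and format");
    }
}
```

With this shape, a CDX XML file that happens to mention "SPDX" is still reported as CDX XML, and anything unrecognized raises `IllegalArgumentException` rather than falling back to a default deserializer. It is still a heuristic, not a 100% accurate detector.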

dlg1206 commented 1 year ago

This fixed the issue, but I will leave it open for now until we can figure out a more stable solution.

dlg1206 commented 12 months ago

Note: Using the file extension to pick the deserializer was tried previously but discarded, since we can't assume the file extension matches the content. See: https://github.com/SoftwareDesignLab/plugfest-tooling/issues/90
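
To illustrate why the extension can't be trusted on its own, here is a minimal sketch that treats the contents as the source of truth and only uses the extension to flag a disagreement (for logging or a warning, say). `ExtensionCheck` and both of its methods are hypothetical names for illustration, not existing SBOM-in-a-Box code, and the first-character check is an assumed heuristic rather than a full parser.

```java
import java.util.Locale;

/** Hypothetical helper (not part of SBOM-in-a-Box): flags files whose extension disagrees with their contents. */
final class ExtensionCheck {
    /** Format implied by the first non-whitespace character of the contents. */
    static String formatFromContents(String fileContents) {
        String s = fileContents.trim();
        if (s.startsWith("{")) return "json";
        if (s.startsWith("<")) return "xml";
        return "other"; // e.g. SPDX tag-value, or something unrecognized
    }

    /** Returns true when the extension is json or xml and contradicts the contents. */
    static boolean extensionMismatch(String fileName, String fileContents) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) return false; // no extension, nothing to contradict
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        if (!ext.equals("json") && !ext.equals("xml")) return false; // extension makes no claim we can check
        return !ext.equals(formatFromContents(fileContents));
    }
}
```

For example, `ExtensionCheck.extensionMismatch("bom.json", "<bom/>")` reports a mismatch, which is exactly the case where choosing a deserializer by extension would go wrong.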