NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

SBOM is missing PURL's for some jars #4197

Closed ryanmkurtz closed 9 months ago

ryanmkurtz commented 2 years ago

The bom.json file (attached below) that Ghidra generates as part of its 10.2-DEV build is missing PURLs for some jars. This is because those jars do not have an embedded pom.xml file from which to extract a group string. Alternative techniques were tried, including parsing the jars' MANIFEST.MF files, but this yielded unreliable results due to non-standard or optional property names.
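As an illustration of the embedded-POM lookup described above, here is a minimal sketch (not Ghidra's actual build code) that reads Maven coordinates from a jar's `META-INF/maven/<group>/<artifact>/pom.properties` entry, which Maven-built jars conventionally carry next to the embedded pom.xml. The function name `maven_coordinates_from_jar` is hypothetical, and the sketch assumes a simple `key=value` properties file with no escapes or line continuations.

```python
import zipfile
from typing import Optional


def maven_coordinates_from_jar(jar_path: str) -> Optional[dict]:
    """Return groupId/artifactId/version from an embedded pom.properties, or None.

    Hypothetical helper: assumes plain key=value lines without escapes.
    """
    wanted = ("groupId", "artifactId", "version")
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            # Maven conventionally stores this under META-INF/maven/<group>/<artifact>/
            if not (name.startswith("META-INF/maven/") and name.endswith("/pom.properties")):
                continue
            props = {}
            text = jar.read(name).decode("utf-8", errors="replace")
            for line in text.splitlines():
                line = line.strip()
                # Skip blanks, comments, and anything that is not key=value
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                props[key.strip()] = value.strip()
            if all(k in props for k in wanted):
                return {k: props[k] for k in wanted}
    # No usable pom.properties: the jar falls into the "missing PURL" case
    return None
```

A jar for which this returns `None` is the kind of jar this ticket is about: there is nothing embedded to derive the group from.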

Some of these group names can be retrieved via Gradle API calls, but only if the jars were declared as external dependencies from Maven Central. Gradle doesn't have that info for jars such as dexlib, which gets downloaded via the fetchDependencies.gradle script.

Without a proper PURL, tools like Dependency-Track cannot properly report vulnerabilities and other issues with external dependencies.

Attachment: bom.json.zip

ryanmkurtz commented 2 years ago

Any ideas @pombredanne?

pombredanne commented 2 years ago

@ryanmkurtz Let me check this in detail. Are these JARs otherwise public and published on Maven Central or some other public Maven repo?

pombredanne commented 2 years ago

Checking dexlib-1.4.0.jar, for instance: is this something that no longer exists anywhere?

It does not look the same as https://repo1.maven.org/maven2/org/smali/dexlib2/ ... The closest would be https://github.com/JesusFreke/smali/tree/v1.4.0/dexlib ?

Based on this I would give it a Maven group of "org.jf" and name "dexlib", e.g., pkg:maven/org.jf/dexlib ... though it could have been an "org.smali" group, and I wonder where you got it from. It does not seem to be in any public release at https://bitbucket.org/JesusFreke/smali/downloads/

That said, it is named "org.smali" in your build at https://github.com/NationalSecurityAgency/ghidra/blob/0241b2b97ebc1356850186796153b0e5f509f96e/Ghidra/Features/FileFormats/build.gradle#L38 so it would be pkg:maven/org.smali/dexlib@1.4.0 IMHO

But since it cannot be found anywhere, you would be best served by publishing the jar somewhere (with sources) and adding a download_url=... qualifier to your purl
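To make the suggestion concrete, here is a minimal sketch of assembling a Maven purl with an optional `download_url` qualifier. The helper name `maven_purl` is hypothetical, the `example.org` URL in the usage note is a placeholder rather than a real hosting location, and the percent-encoding is simplified compared with the full purl specification.

```python
from urllib.parse import quote


def maven_purl(group, name, version, qualifiers=None):
    """Build a pkg:maven purl string (simplified sketch, not a full purl implementation)."""
    purl = f"pkg:maven/{quote(group, safe='')}/{quote(name, safe='')}@{quote(version, safe='')}"
    if qualifiers:
        # Qualifiers are emitted sorted by key; ':' and '/' are left unescaped
        # in values, as in the purl spec's own download_url examples.
        pairs = [f"{key}={quote(value, safe=':/')}" for key, value in sorted(qualifiers.items())]
        purl += "?" + "&".join(pairs)
    return purl
```

For example, `maven_purl("org.smali", "dexlib", "1.4.0")` yields `pkg:maven/org.smali/dexlib@1.4.0`, and passing `{"download_url": "https://example.org/jars/dexlib-1.4.0.jar"}` (placeholder URL) appends `?download_url=https://example.org/jars/dexlib-1.4.0.jar`.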

ryanmkurtz commented 2 years ago

We have a fetchDependencies.gradle script that gets all of the non-maven jars. We get the dex jars from https://github.com/pxb1988/dex2jar/releases/download/2.0/dex-tools-2.0.zip.

pombredanne commented 2 years ago

> We get the dex jars from https://github.com/pxb1988/dex2jar/releases/download/2.0/dex-tools-2.0.zip.

This zip does not contain dexlib-1.4.0.jar anywhere I look. The only places I can find it are the source repo mentioned above and Ghidra binaries: https://github.com/search?l=&p=1&q=filename%3Adexlib-1.4.0.jar+fork%3Atrue&ref=advsearch&type=Code

pombredanne commented 2 years ago

One possible explanation is that you fetched dexlib from jcenter, which went dark last year, and you still have a locally cached copy on your build system.

ryanmkurtz commented 2 years ago

Ah yes, I was thinking of the wrong jar. We get it from maven: https://mvnrepository.com/artifact/org.smali/dexlib/1.4.0

ryanmkurtz commented 2 years ago

Sorry, that was a bad example jar for this ticket.

ryanmkurtz commented 2 years ago

The general issue though is how to form a PURL for jars that are embedded in a zip file, that don't contain an embedded POM file.

pombredanne commented 2 years ago

> Ah yes, I was thinking of the wrong jar. We get it from maven: https://mvnrepository.com/artifact/org.smali/dexlib/1.4.0

This is no longer on any public repo listed there or anywhere else I can find. It was last on https://dl.bintray.com/gost/smali/org/smali/dexlib/1.4.0/ and on jcenter, which went dark last year. See some other discussions: https://github.com/shazam/fork/issues/93 and https://github.com/wala/WALA/issues/125

> The general issue though is how to form a PURL for jars that are embedded in a zip file, that don't contain an embedded POM file.

The presence of a POM file is not what I think matters most. Including POMs in JARs is at best a recent convention in the Java world, and since the pom.properties and parent POMs are commonly missing, it is often a weak data source. Rather, the point is to craft a purl that can reliably be used to get the code and that serves as a decent identifier for referencing possible vulnerabilities, and for software composition and SBOMs in general.
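One hedged way to get a stable identifier for jars that ship inside a zip and carry no POM is to hash them. The sketch below computes a SHA-1 digest for every `.jar` entry in an archive and formats a `pkg:generic` purl with a `checksum` qualifier. Both function names are hypothetical, the choice of the `generic` type is an assumption rather than something settled in this thread, and the sketch assumes the name and version need no percent-encoding.

```python
import hashlib
import zipfile


def jar_checksums_in_zip(zip_path):
    """Map each .jar entry inside a zip archive to its SHA-1 hex digest."""
    checksums = {}
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir() or not info.filename.endswith(".jar"):
                continue
            digest = hashlib.sha1()
            with archive.open(info) as jar_file:
                # Read in chunks so large jars are not loaded into memory at once
                for chunk in iter(lambda: jar_file.read(65536), b""):
                    digest.update(chunk)
            checksums[info.filename] = digest.hexdigest()
    return checksums


def generic_purl(name, version, sha1):
    """Format a pkg:generic purl with a checksum qualifier (illustrative only)."""
    return f"pkg:generic/{name}@{version}?checksum=sha1:{sha1}"
```

The digest can also be compared against known published artifacts to recover real Maven coordinates where they exist. A purl of type `generic` is weaker for vulnerability matching than a `maven` one, so this is a fallback rather than a replacement.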

For this case you have at least one JAR that may have been published at some point in the past and seems to have vanished from the face of the earth in its prebuilt binary form... only left in source form.

I would go either:

pombredanne commented 9 months ago

@ryanmkurtz Can you elaborate on what the resolution was? I could not find anything obvious in the commit stream.

ryanmkurtz commented 9 months ago

@pombredanne Oops, I meant to close it as "not planned", so there is nothing in the commit stream for it. I closed it because I don't have a good solution for it. That is, I don't have an automated way to generate the missing purls. Rather than keep it open forever, I decided to close it as not planned.