oss-review-toolkit / ort

A suite of tools to automate software compliance checks.
https://oss-review-toolkit.org
Apache License 2.0

Reporter: spdx files missing data #6749

Closed qequ closed 1 year ago

qequ commented 1 year ago

While working on a repo, I noticed that a dependency listed in spdx has no license (concluded, declared) nor homepage.

[Screenshot from 2023-03-27 17-45-02]

Checking the same dependency in the Web-App shows that the data is listed correctly there.

[Screenshot from 2023-03-27 17-40-21]

CC: @dgutson

dgutson commented 1 year ago

@sschuberth @fviernau spdx.homepage shouldn't be equivalent to web-app.ProcessedRepository?

sschuberth commented 1 year ago

spdx.homepage shouldn't be equivalent to web-app.ProcessedRepository?

No, it shouldn't. Please see the SPDX specs. A homepage URL is a homepage URL, and a repository URL is a repository URL. However, lacking a dedicated homepage, some projects declare their repository URL as the homepage URL in their package metadata. But that is not an interpretation or mapping that ORT performs. We take the metadata as-is here.

dgutson commented 1 year ago

Then what about VCS location or some other VCS related field? Look, @qequ and I stayed until very late in the night manually fixing dozens of entries that looked like inconsistencies. I understand that some metadata may differ or be inconsistent, but I think that ORT could cover more of these cases. In most of the cases the information was in the analysis results yaml and the webapp, and was not in the spdx, so I just copy pasted it.

dgutson commented 1 year ago

Because of the enormous effort we put in, and since we still sponsor ORT internally, would you mind analyzing this project to see what could have happened?

sschuberth commented 1 year ago

Then what about VCS location or some other VCS related field?

What exactly do you mean? Where the URL to a VCS location is in the SPDX output? It's basically like this: ORT has more metadata about a package than we can squeeze into a single SPDX package. That's why a single ORT package is usually split into three (!) SPDX packages: the binary package, the corresponding VCS package ("-vcs" suffix in the id), and the corresponding source artifact package ("-source-artifact" in the id). The downloadLocation of the binary package points to the binary artifact, the downloadLocation of the corresponding VCS package points to the VCS repository, and the downloadLocation of the source package points to the source artifact.
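As a rough illustration of the split described above, here is a minimal Python sketch. It is not ORT's actual reporter code. The class names, field names, the `SPDXRef-Package-` id prefix, and the `NOASSERTION` fallback for empty URLs are all assumptions made for this example. Only the three-way split, the "-vcs" and "-source-artifact" suffixes, and the meaning of each downloadLocation come from the comment above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OrtPackage:
    """Hypothetical, simplified stand-in for the metadata ORT holds per package."""
    id: str
    binary_artifact_url: str
    source_artifact_url: str
    vcs_url: str


@dataclass
class SpdxPackage:
    """Hypothetical, simplified SPDX package with only the fields discussed here."""
    spdx_id: str
    download_location: str


def _location(url: str) -> str:
    # Assumption for this sketch: an empty URL is reported as NOASSERTION.
    return url if url else "NOASSERTION"


def to_spdx_packages(pkg: OrtPackage) -> List[SpdxPackage]:
    """Split one ORT-style package into three SPDX-style packages."""
    base = "SPDXRef-Package-" + pkg.id  # id prefix is an assumption
    return [
        # Binary package: downloadLocation points to the binary artifact.
        SpdxPackage(base, _location(pkg.binary_artifact_url)),
        # VCS package: downloadLocation points to the VCS repository.
        SpdxPackage(base + "-vcs", _location(pkg.vcs_url)),
        # Source package: downloadLocation points to the source artifact.
        SpdxPackage(base + "-source-artifact", _location(pkg.source_artifact_url)),
    ]


if __name__ == "__main__":
    example = OrtPackage(
        id="example-lib-1.0",
        binary_artifact_url="https://example.org/lib-1.0.jar",
        source_artifact_url="https://example.org/lib-1.0-sources.jar",
        vcs_url="https://example.org/lib.git",
    )
    for spdx in to_spdx_packages(example):
        print(spdx.spdx_id, spdx.download_location)
```

When reading an SPDX report, this means the repository URL is expected on the package whose id ends in "-vcs", not on the binary package.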

In most of the cases the information was in the analysis results yaml and the webapp, and was not in the spdx, so I just copy pasted it.

While I still believe that in most cases this is due to misunderstandings of the SPDX spec (which is horrible, BTW), I don't understand why you don't instead adjust the SPDX reporter to your needs, and eventually contribute your changes back. That should scale much better.

sschuberth commented 1 year ago

Because of the enormous effort we put in, and since we still sponsor ORT internally, would you mind analyzing this project to see what could have happened?

While I appreciate your use of ORT, I hope you realize that over the past weeks we as a maintainer team were quickly jumping at almost every issue you filed, and helped you guys where we could without really knowing much about you or your use-cases, or getting anything back.

So if you now ask us to do your day-job work, maybe you should consider sponsoring ORT not internally, but externally 😉

sschuberth commented 1 year ago

After a short "debate" with @fviernau I decided to create https://github.com/oss-review-toolkit/ort/pull/6760, which might also help to avoid some confusion on your side, @dgutson.

dgutson commented 1 year ago

@sschuberth @fviernau thanks, we appreciate your quick response times and eagerness to help. On our side we currently provide tons of hours of testing effort (manual and automated scripts). We will switch to maintenance mode soon; it's just that right now we needed to come up with the SPDX asap. Thanks again for your support. @arieltorti, for example, stayed 24 hours nonstop identifying failing patterns and grouping them (and developing external workarounds).

mnonnenmacher commented 1 year ago

@dgutson As you are working a lot with ORT currently you might be interested in joining the ORT Community Meeting.

sschuberth commented 1 year ago

a dependency listed in spdx has no license (concluded, declared) nor homepage

As the homepage bit has been clarified and the declared license is not cleared anymore, I believe we're good to close this.