google / osv-scanner

Vulnerability scanner written in Go which uses the data provided by https://osv.dev
https://google.github.io/osv-scanner/
Apache License 2.0

Setting/overwriting license for a dependency #814

Open ststroppel opened 4 months ago

ststroppel commented 4 months ago

Hi, I've been exploring the new experimental license scanning feature and find it quite impressive. However, I believe a crucial piece of functionality is missing: the ability to specify the license for a dependency.

During my testing of this feature in our npm repository, I encountered several dependencies where the license information could not be detected due to various reasons:

Therefore, I propose the addition of a feature that allows users to manually set or overwrite the detected license of a dependency, optionally for a specific version or a version range.

oliverchang commented 4 months ago

CC @josieang

arkodg commented 2 months ago

We're also interested in adopting the license scanning feature in Envoy Gateway, and this issue is currently blocking us from doing so. More in https://github.com/envoyproxy/gateway/issues/2917

oliverchang commented 2 months ago

Thanks for the feedback all!

What would be the most convenient way for you to set this information?

ststroppel commented 2 months ago

Thanks @oliverchang for coming back to this topic. For me, definitely a config file would be the most convenient way to set it. There are several advantages for this approach:

oliverchang commented 2 months ago

@josieang (who added the license scanning feature initially) would you have some time to help with implementing this? It would also be a great opportunity to look at moving the feature out of experimental.

josieang commented 2 months ago

Thanks everyone for feedback on the license scanning feature! Sorry for missing this thread. I'll have time to work on this in early May.

You may have already seen it, but we have a "How are licenses determined?" section at https://docs.deps.dev/faq which explains briefly our license provenance.

@ststroppel Just curious:

> Additionally, we could specify some version range as the declared license of a package could change with major releases

Why be cautious about license changes only when major version bumps happen? Why not track license changes for any version bump?

ststroppel commented 2 months ago

@josieang

> Why be cautious about license changes only when major version bumps happen? Why not track license changes for any version bump?

How do you want to track the license changes? I was rather thinking about a quite dumb overwrite of a component's license. E.g., if the package is dual-licensed or the resolved license is UNKNOWN, I would select the appropriate license. In any case, the license of a package could change. I would say this happens mostly in major releases, but this is up to the user of the tool to decide, e.g. by optionally providing a version range. Setting the license for every minor update of a package would be quite tedious work, so I personally would accept the risk of not detecting a license change made as part of a patch release.
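To make the "dumb overwrite" idea concrete, here is a minimal Go sketch. It is not osv-scanner's actual configuration schema or code: the `LicenseOverride` type, the `applyOverride` function, and the package names are all hypothetical, and the optional version is matched exactly rather than as a range.

```go
package main

import "fmt"

// LicenseOverride is a hypothetical config entry, not osv-scanner's real schema.
type LicenseOverride struct {
	Package string // package name the override applies to
	Version string // optional; empty means "all versions"
	License string // license to report instead of the detected one
}

// applyOverride returns the license of the first matching override,
// or the detected license unchanged when no override matches.
func applyOverride(overrides []LicenseOverride, pkg, version, detected string) string {
	for _, o := range overrides {
		if o.Package != pkg {
			continue
		}
		if o.Version == "" || o.Version == version {
			return o.License
		}
	}
	return detected
}

func main() {
	overrides := []LicenseOverride{
		{Package: "example-lib", License: "MIT"},
		{Package: "example-dual", Version: "2.0.0", License: "Apache-2.0"},
	}

	// UNKNOWN is replaced for every version of example-lib.
	fmt.Println(applyOverride(overrides, "example-lib", "1.3.0", "UNKNOWN"))
	// The dual-licensed package is resolved only for the pinned version.
	fmt.Println(applyOverride(overrides, "example-dual", "2.0.0", "MIT OR Apache-2.0"))
	fmt.Println(applyOverride(overrides, "example-dual", "3.0.0", "MIT OR Apache-2.0"))
}
```

With this shape, an override without a version keeps applying after upgrades, which is the trade-off described above: less maintenance, at the risk of missing a later license change.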

But I'm open to better ideas.

shahar-h commented 2 months ago

@ststroppel A license can also change in a minor update, e.g. Terraform. In theory it can also change in a patch update, although I don't think that happens in practice. See related discussion: https://github.com/semver/semver/issues/322. Anyway, I think that users should be able to configure a license override for either major, minor, or patch versions of a given dependency.

josieang commented 2 months ago

Hm, that's an interesting point. Giving users the option to override on major, minor, or patch makes sense to me.

From experience at deps.dev, many ecosystems don't fully follow semver, and parsing versions properly can be pretty ecosystem-specific.

Guided remediation has likely already had to figure out how to parse versions; I will chat with @michaelkedar when he gets back from leave. But as a draft PR I'll do something pretty simple, like splitting versions on the "." character.
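As a rough illustration of what such a split-based match could look like (a sketch only, not the draft PR's code), the hypothetical `matchesVersionPrefix` below treats a pattern of `1` as "any 1.x.y", `1.4` as "any 1.4.y", and `1.4.2` as an exact match. It compares whole dot-separated segments as strings and deliberately ignores ecosystem-specific rules, so pre-release or non-semver versions may not match as expected.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesVersionPrefix reports whether version starts with the dot-separated
// segments of pattern. An empty pattern matches every version.
// This is a naive sketch: segments are compared as plain strings.
func matchesVersionPrefix(pattern, version string) bool {
	if pattern == "" {
		return true
	}
	p := strings.Split(pattern, ".")
	v := strings.Split(version, ".")
	if len(p) > len(v) {
		return false
	}
	for i := range p {
		if p[i] != v[i] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(matchesVersionPrefix("1", "1.9.3"))     // major-level match
	fmt.Println(matchesVersionPrefix("1.4", "1.4.7"))   // minor-level match
	fmt.Println(matchesVersionPrefix("1.4", "1.40.0"))  // whole segments, so no match
	fmt.Println(matchesVersionPrefix("1.4.2", "1.4.3")) // patch-level mismatch
}
```

Comparing whole segments rather than using a raw string prefix avoids `1.4` accidentally matching `1.40.0`.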

G-Rath commented 2 months ago

@josieang I might be able to help with that too if you like - I had to understand how versions work in each ecosystem as part of creating semantic.

That package will probably already have the info you need in order to do the comparisons, it just might not expose them (yet)

oliverchang commented 1 month ago

@shahar-h #949 seems related to this -- Could the same configuration mechanism be used to handle edge cases like multiple-license vs choice-of-license?

shahar-h commented 1 month ago

> @shahar-h #949 seems related to this -- Could the same configuration mechanism be used to handle edge cases like multiple-license vs choice-of-license?

I guess that you should add an override configuration that chooses the desired license in these cases.
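One way to picture an override that chooses the desired license is sketched below in Go. This is an assumption-laden illustration, not osv-scanner code: `selectLicense` is a hypothetical helper that accepts the configured choice only when it is one of the detected licenses, so a stale or mistyped override is surfaced instead of silently applied.

```go
package main

import "fmt"

// selectLicense returns the configured choice when it is one of the detected
// licenses. Otherwise it returns false so the caller can flag the package.
func selectLicense(detected []string, chosen string) (string, bool) {
	for _, l := range detected {
		if l == chosen {
			return chosen, true
		}
	}
	return "", false
}

func main() {
	detected := []string{"MIT", "Apache-2.0"}

	if l, ok := selectLicense(detected, "Apache-2.0"); ok {
		fmt.Println("using", l)
	}
	if _, ok := selectLicense(detected, "GPL-3.0-only"); !ok {
		fmt.Println("configured license is not among the detected ones")
	}
}
```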