oss-review-toolkit / ort

A suite of tools to automate software compliance checks.
https://oss-review-toolkit.org
Apache License 2.0

package-curations: Allow adding arbitrary tags to packages #8725

Open fviernau opened 3 months ago

fviernau commented 3 months ago

Users may need custom data per package so that they can make use of it in customizable places, for example in:

  1. Policy rules
  2. Plain text templates
  3. Passing per-package data to a specific reporter or advisor without adding a dedicated property

Proposal

  1. Add the property val tags: List<String> or alternatively val tags: Map<String, String> to Package
  2. Make the property settable via package curations.
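
To make the proposal more concrete, here is a minimal sketch of the Map<String, String> variant. The classes below are simplified, hypothetical stand-ins and not ORT's actual Package and PackageCurationData, which have many more properties; the applyTo helper and the merge rule (curated values win on a key clash) are assumptions for illustration only.

```kotlin
// Hypothetical, simplified stand-ins for ORT's model classes.
data class Package(
    val id: String,
    val tags: Map<String, String> = emptyMap()
)

data class PackageCurationData(
    val comment: String? = null,
    val tags: Map<String, String> = emptyMap()
)

// Assumed merge rule: tags from the curation are added to the package's tags,
// and the curated value wins if a key exists on both sides. The values are
// opaque strings that are only passed through, never interpreted.
fun PackageCurationData.applyTo(pkg: Package): Package =
    pkg.copy(tags = pkg.tags + tags)
```

With this shape, a curation carrying a tag such as "source-url" would make that value available wherever the curated package is accessible, e.g. in rules or reporter templates.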

Example use cases

  1. A user hosts source code for their dependencies on their own infrastructure. Links to the source code shall be inserted into a NOTICE file generated using a custom template for the plain text template reporter. The curation can specify the link, and the template can consume and insert it.
  2. A user wants to inject further custom text per package into the NOTICE file
  3. In the context of developing a custom advisor plugin, the advisor queries seem to need manual fixes in some cases. These can be injected with this mechanism.

fviernau commented 3 months ago

@oss-review-toolkit/core-devs what do you think about this?

sschuberth commented 3 months ago

custom data per package

Could you provide some concrete examples, and how that data would be used e.g. in rules or reporters?

fviernau commented 3 months ago

Could you provide some concrete examples, and how that data would be used e.g. in rules or reporters?

I've added some to the issue description.

tsteenbe commented 3 months ago

I like the gist of the idea, but I would call it labels rather than tags, so that it is consistent with the labels (-l) with which one can tag an ORT result. I also think we should write some concrete example user stories so we can more easily get feedback from the larger ORT community.

mnonnenmacher commented 3 months ago

I like the gist of the idea, but I would call it labels rather than tags, so that it is consistent with the labels (-l) with which one can tag an ORT result. I also think we should write some concrete example user stories so we can more easily get feedback from the larger ORT community.

Key value pairs seem to better align with the example use cases and in this case I agree that labels would be the more consistent naming.

@oss-review-toolkit/core-devs what do you think about this?

I think that is an interesting idea, but I'd like to understand if there is an actual requirement for that feature, and if it cannot be solved with the existing functionality.

fviernau commented 3 months ago

I think that is an interesting idea, but I'd like to understand if there is an actual requirement for that feature, and if it cannot be solved with the existing functionality.

My actual requirements are examples 1, 2, and maybe 3 in the description. So yes, there is.

sschuberth commented 3 months ago

Some comprehension questions regarding the example use cases:

A user hosts source code for their dependencies on their own infrastructure. Links to the source code shall be inserted into a NOTICE file

Why are the links to the source code not part of the regular package metadata, which can already be accessed by the reporter?

A user wants to inject further custom text per package into the NOTICE file

I assume that text is not static, otherwise it would be trivial. But is that text really so complex / package-dependent that it cannot be templatized?

the advisor queries seem to need manual fixes in some cases.

Can you share some details about the nature of those fixes? In what way are they package-dependent?

fviernau commented 3 months ago

Why are the links to the source code not part of the regular package metadata, which can already be accessed by the reporter?

I suspect a misunderstanding. Suppose you use an open source package X in your proprietary software, and its license obliges you to make the source code of package X available. You decide to link the source code in your NOTICE file, and furthermore to link not to the upstream project but to the source code hosted on your own infrastructure. That link to your own infrastructure simply does not belong to package X, which is why it is not part of its metadata.

Can you share some details about the nature of those fixes? In what way are they package-dependent?

Retrieving vulnerabilities involves matching version strings. This is similar to our Git tag matching. It can fail in case matching is ambiguous. In that case it needs to be manually fixed by specifying what to use.
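
As a rough illustration of such a manual fix, the sketch below resolves the version string to query for and lets a per-package label override the automatic match. Everything here is hypothetical: the function, the matching heuristic, and the label key "advisor-version" are made up for this example and are not part of ORT or of any advisor.

```kotlin
// Hypothetical sketch: choose which of the advisory service's version strings
// corresponds to a package version.
fun resolveAdvisorVersion(
    packageVersion: String,
    candidates: List<String>,
    labels: Map<String, String>
): String? {
    // A manual override provided via a curation label takes precedence.
    labels["advisor-version"]?.let { return it }

    // Assumed heuristic: accept exact matches and common prefixed forms
    // such as "v1.2.0" or "release-1.2.0".
    val matches = candidates.filter {
        it == packageVersion || it.endsWith("-$packageVersion") || it.endsWith("v$packageVersion")
    }

    // Return null unless exactly one candidate matches, so that ambiguous
    // cases are reported instead of guessed.
    return matches.singleOrNull()
}
```

If both "v1.2.0" and "release-1.2.0" are known candidates for version 1.2.0, the automatic match returns null, and a label on that package can state which one to use.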

I assume that text is not static, otherwise it would be trivial.

What would be a trivial solution for static text?

fviernau commented 3 months ago

In general, this feature would add almost zero complexity to ORT because ORT would not interpret the values, similar to the OrtResult labels. On the other hand, it provides a lot of flexibility and can be useful. Is this not sufficient reason to add it?

Let's say you develop some kind of ORT plugin which you don't contribute upstream. That plugin requires some additional package attribute. So you can just use that mechanism without introducing the attribute to ORT. Basically, it's just one additional option we would have in the future.

sschuberth commented 3 months ago

I assume that text is not static, otherwise it would be trivial.

What would be a trivial solution for static text?

You could just add that static text to a custom reporter template itself.

Is this not sufficient reason to add it?

I don't believe so. If we could already fulfill all requirements with existing means, even adding a little complexity could be too much, unless it offers significant convenience over the existing means.

So that's all I'm after: Ensuring that what you want to achieve cannot already be achieved in ways you might not have thought of.

fviernau commented 3 months ago

I don't believe so. If we could already fulfill all requirements with existing means, even adding a little complexity could be too much, unless it offers significant convenience over the existing means.

It does not add complexity at all, because the fields are not interpreted by ORT. Anyhow, it basically allows for using a single consistent mechanism (package curations) to associate arbitrary attributes with identifiers. Otherwise one has to re-implement injecting a Map<Identifier, ?> instance into the places where the data is needed, or hard-code such maps, e.g. inside the template. So it is a new way of injecting things that does not require adding any further logic to ORT, e.g. a new parameter.
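
The difference between the two approaches can be sketched as follows. This is only an illustration under assumptions: Identifier and Package are simplified stand-ins for ORT's classes, the labels property is the proposed one and does not exist yet, the label key "source-url" is made up, and the NOTICE line is built in Kotlin here although the real plain text template reporter uses templates.

```kotlin
// Hypothetical, simplified stand-ins for ORT's model classes.
data class Identifier(val type: String, val namespace: String, val name: String, val version: String)
data class Package(val id: Identifier, val labels: Map<String, String> = emptyMap())

fun formatLine(pkg: Package, link: String?): String {
    val coordinates = with(pkg.id) { "$type:$namespace:$name:$version" }
    return if (link == null) coordinates else "$coordinates (source: $link)"
}

// Without labels: each consumer needs its own extra parameter (or a
// hard-coded map) to receive the per-package data.
fun noticeLineWithInjectedMap(pkg: Package, sourceLinks: Map<Identifier, String>): String =
    formatLine(pkg, sourceLinks[pkg.id])

// With labels: the data travels with the package, so no additional
// parameter or lookup structure is required.
fun noticeLine(pkg: Package): String =
    formatLine(pkg, pkg.labels["source-url"])
```

Both functions produce the same line for the same link; the second one just needs nothing beyond the package itself.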