fsfe / reuse-tool

reuse is a tool for compliance with the REUSE recommendations.
https://reuse.software

Idea: REUSE materialize to annotate code based on reuse.toml #921

Open nicorikken opened 8 months ago

nicorikken commented 8 months ago

Idea: An option to take the copyright and license information from reuse.toml and apply it to an existing codebase.

You would run:

$ reuse materialize

And all files would be annotated with headers or .license files with the information provided in the reuse.toml files.

Reason

There are some use-cases where annotating a codebase directly is not ideal because the code needs to be kept up to date or because it would result in many .license files cluttering the codebase. The .reuse/dep5 file and future reuse.toml cover these use-cases by providing a designated location to place annotations. Still, this method is less explicit than annotating all files and should be considered a last resort.

Use-cases

Annotate for release

This Materialize option would enable projects to explicitly annotate a codebase when packaging for release. This way they can distribute code in a way that is REUSE-able in the best way possible.

Annotate declaratively

If implemented, it would also allow users to use a reuse.toml file to declaratively annotate codebases that have more complex copyright and license information: first describe the copyright and license information in the reuse.toml file and then annotate files accordingly using Materialize.

Use as preprocessor for license scanner

Not all software license scanners properly detect copyright and license information of codebases considered REUSE compliant, because they lack support for the .license files, let alone the .reuse/dep5 or reuse.toml file. With this feature those tools could use the Materialize option as a preprocessor for REUSE-compliant projects. Note those tools would still have to support the .license files.

What if we don't implement it?

Codebases can still be REUSE-compliant. Users can manually reuse annotate files according to the reuse.toml. They can even create a patch set to repeatedly apply on codebases that shouldn't be touched.

Risks of feature creep

When users start using the reuse.toml file as a base to annotate, they might want to have more control mechanisms in place, like how the files should be annotated (comment style, modify files or add .license file).

mxmehl commented 8 months ago

I like the idea, but I wonder about the input. Wouldn't it make more sense to use an SBOM on the input side? That could broaden the use-cases: an organisation could export an SBOM for a project and run the materialize command to turn it into a REUSE-compliant codebase.

silverhook commented 8 months ago

I see pros and cons of relying on a (full) SBOM for this.

pro:

contra:

(playing devil’s advocate a bit, don’t be mad)

Ultimately, I think the SBOM as input idea is also good, but perhaps in addition to the reuse.toml input idea.

mxmehl commented 8 months ago

You have a point there, definitely.

I do wonder, also with regard to the risk of feature creep, whether standalone scripts (for ingesting reuse.toml and SBOMs) would be the better way. The reuse.toml script would probably depend on the output of reuse lint --json, while the SBOM "importers" would probably need to ask for preferences on the license field(s).

Another reason why I somehow dislike the idea of putting the command into the reuse core is that it promotes a non-recommended practice (putting everything in reuse.toml). That said, I appreciate the general idea even if I currently wouldn't need it in my daily life.

silverhook commented 8 months ago

How about importing from .ABOUT files?