coreinfrastructure / best-practices-badge

🏆Open Source Security Foundation (OpenSSF) Best Practices Badge (formerly Core Infrastructure Initiative (CII) Best Practices Badge)
https://www.bestpractices.dev
MIT License

gold per-file requirements hurt sustainability and deter contributors, why "MUST"? #2046

Open ljharb opened 1 year ago

ljharb commented 1 year ago

Having boilerplate/frontmatter in every file is an annoyance to contributors. It's an additional CI check that needs to be run, it's additional friction for people creating new files (but especially newcomers), and it does nothing to increase the security of the project itself. It certainly would make auditing easier for folks who depend on projects that have copy-pasted files from my project - but that's not related to my project's security.

At the very least, I think these should be downgraded to "SHOULD" - but my personal preference would be to invert them and say that one should actively NOT do this. Duplicating information that's already available in the right place at the root of the repo is just noise.
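For readers unfamiliar with what such a CI check involves, here is a minimal sketch in Python. It assumes the per-file statements take the form of an `SPDX-License-Identifier:` tag plus a line containing "Copyright" (or `SPDX-FileCopyrightText:`) near the top of each file. The function names, the 20-line window, and the suffix list are hypothetical choices for illustration, not anything the badge program prescribes.

```python
import re
from pathlib import Path

# Assumed header patterns; the thread does not fix an exact format.
LICENSE_RE = re.compile(r"SPDX-License-Identifier:\s*\S+")
COPYRIGHT_RE = re.compile(r"Copyright\b|SPDX-FileCopyrightText:", re.IGNORECASE)


def missing_headers(text: str, head_lines: int = 20) -> list[str]:
    """Return which statements ('license', 'copyright') are absent
    from the first `head_lines` lines of a file's text."""
    head = "\n".join(text.splitlines()[:head_lines])
    missing = []
    if not LICENSE_RE.search(head):
        missing.append("license")
    if not COPYRIGHT_RE.search(head):
        missing.append("copyright")
    return missing


def check_tree(root: Path, suffixes=(".py", ".js", ".java")) -> dict[str, list[str]]:
    """Map each source file under `root` (hypothetical suffix list)
    to the statements it is missing; compliant files are omitted."""
    problems = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="replace")
            gaps = missing_headers(text)
            if gaps:
                problems[str(path.relative_to(root))] = gaps
    return problems
```

A CI job could call `check_tree(Path("."))` and fail the build when the returned mapping is non-empty, which is the extra step and contributor friction described above.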

david-a-wheeler commented 1 year ago

I believe you mean the gold criteria [copyright_per_file] and [license_per_file].

This is a proposed change (reduction) in requirements, so we need to hear from others. If anyone has comments for or against this proposed change, please say so in this issue!

For simplicity, here is more information about each criterion.

Criterion copyright_per_file:

Criterion license_per_file:

kwwall commented 1 year ago

I would be in favor of changing this from MUST to SHOULD, but perhaps we could compromise somewhat and leave it as 'MUST' if a particular source file was "sourced" from another project as you mention in the rationale (or add an appropriate one if not present for that file). I think that would especially be important when the license type of that particular "borrowed" source file is different than the license type that you are releasing your project under. For instance, a project releasing under LGPL 2.1 license but pulling in a Java class that was licensed under (say) Apache 2 license. During all the code reviews that I've done in the past 10 years, I certainly have seen teams "borrow" something like the source code for org.apache.commons.lang3.StringUtils and pull that directly into their repo rather than including a dependency for Apache Commons Lang 3.

The disadvantage of putting a copyright notice in every source file is that when the company or organization name changes, updating it everywhere becomes tedious. I'm somewhat facing that now, because for OWASP ESAPI, most (if not all) of our Java source files have a copyright notice for "Open Web Application Security Project" and then OWASP had to go and change their name to "Open Worldwide Application Security Project". More of an annoyance, but it still illustrates the point.

ljharb commented 1 year ago

That makes total sense for vendored code, it's just exceedingly rare to ever do that in the ecosystems I participate in.

bagder commented 1 year ago

I always thought of this requirement as a way to prove to everyone that the provenance is in order. We know the copyright and license status for every individual file. And I think a gold project should live up to that.

But complying with REUSE does allow for also providing the same info out of file when necessary and as long as that info exists, I think it's fine. That is how we in curl comply with this.
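As a rough illustration of "out of file" information, the sketch below accepts either an in-file `SPDX-License-Identifier:` tag or a sidecar file named `<filename>.license` next to the file, which is one of the mechanisms the REUSE specification describes. It is an assumption-laden sketch: the function name is hypothetical, other REUSE mechanisms are not covered, and it does not represent how curl actually configures its compliance.

```python
from pathlib import Path
from typing import Optional

TAG = "SPDX-License-Identifier:"


def license_source(path: Path) -> Optional[str]:
    """Return 'in-file' if the file itself carries the SPDX tag,
    'sidecar' if an adjacent '<name>.license' file carries it,
    or None if neither does. (Hypothetical helper for illustration.)"""
    if TAG in path.read_text(encoding="utf-8", errors="replace"):
        return "in-file"
    sidecar = path.with_name(path.name + ".license")
    if sidecar.is_file() and TAG in sidecar.read_text(encoding="utf-8", errors="replace"):
        return "sidecar"
    return None
```

A sidecar is mainly useful for files that cannot carry a comment header, such as images or other binary assets.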

TonyLHansen commented 1 year ago

I've seen too many files copied piecemeal from one project to another. IMHO, NOT having the info there is asking for trouble.

I'd even go so far as to suggest making it a SHOULD for SILVER, but leaving it a MUST for GOLD.

ljharb commented 1 year ago

Trouble for who? The point of this program is to make the project itself more secure.

tniessen commented 1 year ago

Being required to add a copyright notice to every single source file seems odd to me. As far as I am aware, copyright notices have little to no effect in most jurisdictions, at least in recent decades. The contents of a single source file might not even meet the threshold of originality in many jurisdictions.

Of course, if projects wish to use copyright notices in every single file as a deterrent against infringement, that's perfectly fine, but I don't see why it would be mandated for security reasons here.

If this recommendation is supposed to ensure that the origin and license of each file is clear, I am pretty sure someone could come up with a criterion that causes less friction.

kwwall commented 1 year ago

If we're going to require Copyright notice on every source file, can we at least make an exclusion for configuration files? That gets confusing because they often get heavily edited by library users.

david-a-wheeler commented 4 months ago

@kwwall - The requirements are only for "source files". Usually configuration files aren't source files, so typical configuration files are already excluded from this requirement.

kestewart commented 4 months ago

With multiple files being copied from project to project as a common development pattern, retaining the license information is key to easing analysis. And if there are problems with the license of a file's contents when it is combined with other files, only the copyright holder can change the license, so keeping this original metadata with the file helps de-risk issues with using the software.

As vulnerabilities occur at the file level, having this information handy helps with notifying the copyright holder, who may have used the contents of this file in other locations as well.

Agree with David that configuration files and other generated evidence, do not typically have copyright asserted on the contents.

ljharb commented 4 months ago

@kestewart In which ecosystems is this a common and "non-discouraged" pattern?

david-a-wheeler commented 4 months ago

> Agree with David that configuration files and other generated evidence, do not typically have copyright asserted on the contents.

A minor nit: The gold badge currently only requires per-file license statements for source files. Configuration files are often not generated evidence, but since they also aren't normally source files, there's no gold requirement for per-file license statements in typical configuration files (unless they're also source files). I am NOT a lawyer, but my understanding is that it's often pointless to try to claim copyright on configuration files. Copyright law only covers expression. This makes claiming copyright over configuration files often dubious (depending on the circumstance). See this discussion: https://groups.drupal.org/node/17555. No one here is suggesting that license statements be required in every configuration file, but I thought it'd be worth clarifying that there are good reasons for that. You can do it if you want to of course :-).

kwwall commented 4 months ago

Configuration file vs source file can get a bit fuzzy at times when it concerns the whole "infrastructure as code" paradigm, but as long as the gold badge standard doesn't get too draconian and leaves those choices to the development teams, I think it will work out fine.