Perl-Toolchain-Gang / Software-License

perl representation of common software licenses
18 stars 40 forks source link

Trim very many much superfluous white-space. Wow. #30

Closed kentfredric closed 3 years ago

kentfredric commented 10 years ago

Closes the specific case I noticed in #29 . I figured if there was one instance of this, there might be more, so went and ripped it out everywhere I found it.

I traced the git history of the files in question back through several names and didn't see any documented reason to keep the whitespace present.

dolmen commented 9 years ago

I think that whitespace should be preserved as in the canonical version of each license file. Because it would be helpful to be able to compare just the checksum of a LICENSE file (SHA1, MD5...) to a known database to quickly identify the license of a project.

For example,

perl -MSoftware::License::LGPL_3_0 -E 'print Software::License::LGPL_3_0->license'

should give the exact same output as https://www.gnu.org/licenses/lgpl.txt

I just checked LGPL_3_0 and this property is not preserved by this patch: there is a few spaces difference on the first line.

We should also have some author/release tests that would check this property is preserved by comparing to a canonical online file (just in case some whitespaces in the canonical file are fixed).

Cc: @karenetheridge

dagolden commented 9 years ago

There's no guarantee that the upstream license won't change. If we want a checksum, that should live in Software::License.

Personally, I don't see a lot of value in this. On Apr 3, 2015 8:27 AM, "Olivier Mengué" notifications@github.com wrote:

I think that whitespace should be preserved as in the canonical version of each license file. Because it would be helpful to be able to compare just the checksum of a LICENSE file (SHA1, MD5...) to a known database to quickly identify the license of a project.

For example,

perl -MSoftware::License::LGPL_3_0 -E 'print Software::License::LGPL_3_0->license'

should give the exact same output as https://www.gnu.org/licenses/lgpl.txt

I just checked LGPL_3_0 and this property is not preserved: there is a few spaces difference on the first line.

We should also have some author/release tests that would check this property is preserved by comparing to a canonical online file (just in case some whitespaces in the canonical file are fixed).

Cc: @karenetheridge https://github.com/karenetheridge

— Reply to this email directly or view it on GitHub https://github.com/Perl-Toolchain-Gang/Software-License/pull/30#issuecomment-89273387 .

ribasushi commented 9 years ago

I agree that there is little value in attempting to do checksums. Moreover the entire "file per license" approach is not a universal way of doing things, e.g.: https://metacpan.org/source/RIBASUSHI/DBIx-Class-0.082820/LICENSE

kentfredric commented 9 years ago

I would also suggest that a lot of people may be copy-pasting licenses outside CPAN, and losing whitespace there also ( some people even use whitespace strippers as part of their git commit hooks, or part of the editor they may have pasted it into ), so the utility of the checksums being portable seems rather minimal.

I'd probably argue any legal proceeding involving a license would probably entail presenting the published document, printed on paper, before said courts, making the presence or absence of said white-space entirely irrelevant.

And even then, even if the whitespace was presented using some special presentation, what legal concern is oriented around the presence of whitespace in a legal text being significant enough to render the text invalid, when the message of the text itself in the words is unchanged.

I'd argue that even re-flowing and re-wrapping the text is unlikely to adjust its legal interpretation.

( And re-wrapping and reflowing may actually have occurred in some sources using GPL licenses )

Also, some HTML based UA rendering such a license may be likely to:

For example, viewing https://api.metacpan.org/source/RIBASUSHI/DBIx-Class-0.082820/LICENSE with your window constrained changes the format of the license as far as the viewer is concerned.

But as to whether my beliefs are supported by GPL authors and LEGAL people, is of course, a different question.

Have filed off a request.

Hah. Seems GNU use RT ... :D

ghost commented 9 years ago

I don't think anyone's arguing that whitespace, especially "superfluous" whitespace, has any legal consequences at all.

Checksumming's not the only way to detect changes to the language in a license, and it could just as easily be done whitespace-blind, but I agree there's really no reason at all to strip this particular whitespace off: it's not causing any trouble and it causes our file to differ (in a small way, admittedly) from the source file for no real benefit.

Maybe try to make the fix upstream, but I'm not sure whether they'll be any more receptive.

karenetheridge commented 9 years ago

there's really no reason at all to strip this particular whitespace off

Maybe not for you, but since multiple people have made this type of PR, clearly there are reasons for them.

The reason why I want to strip it is to avoid triggering whitespace checks in my distribution (not a test, but a manual grep).

kentfredric commented 9 years ago

it's not causing any trouble

It does when you apply a "strip all the whitespace before I commit this set of code" ( because I have a habit of accidentally adding whitespace to various files ), and happen to strip the whitespace from your LICENSE file in the process, creating commit diff[1].... and then the next time you run "build", having that file recreated again with the whitespace back, showing up as a diff ...

Basically end up playing ping-pong with git/license generator if I'm not focusing.

1: And by that I mean, once upon a time I did this, and I stopped doing this because of the LICENSE file fighting me at every step. So now I just code stuff and hope I notice the diff before I commit it.

miyagawa commented 9 years ago

Well if this change is merged and released, next time I build every single one of my 100 CPAN modules, I'll need to commit the LICENSE file diff to make the git status clean, and I'd swear every time I see it.

If you ever commit an auto-generated file, then leave the file as is. Otherwise do not commit.

Or maybe write a dzil plugin that strips whitespaces from LICENSE file before committing them, however silly that might sound.

karenetheridge commented 9 years ago

Well if this change is merged and released, next time I build every single one of my 100 CPAN modules, I'll need to commit the LICENSE file diff to make the git status clean, and I'd swear every time I see it.

This would happen to you whenever the license text changes for other reasons as well (e.g. the GPL address changed last year, which we've seen propagating through the diffs of every distribution released since then).

@miyagawa I would suggest copying the LICENSE into your repository not after every build, but simply after each release, and then commit the result -- I do this in my pluginbundle with [CopyFileFromRelease] and [Git::Commit]. (FWIW I find it very annoying building in distributions where files change after every build, because there's always some slight churn in cpanfile, Makefile, META.* etc with prerequisites.)

miyagawa commented 9 years ago

This would happen to you whenever the license text changes for other reasons as well (e.g. the GPL address changed last year, which we've seen propagating through the diffs of every distribution released since then).

Yes, that's a legitimate change, so I can actually review the updated license. This whitespace change is not.

ghost commented 9 years ago

It does when you apply a "strip all the whitespace before I commit this set of code" ( because I have a habit of accidentally adding whitespace to various files )

I used to do that kind of stuff too - but I got bit a few times, and then I realized that immediately before committing is not the right time to have automatic stuff happen to my files. Now I have my editor strip/normalize whitespace appropriately (upon save) for the specific type of file, and/or the specific project (which may have different settings depending on who founded the project), and I'm much happier.

kentfredric commented 9 years ago

and/or the specific project

That is possibly a key element. I had vim set up to strip EOLs on save on Perl files. But I disabled that at some stage for some reason.

( Probably due to having to touch other peoples repo's and fighting with "great, now I edited your code, I now have to make a git commit/patch without all the whitespace changes" )

kentfredric commented 9 years ago

Response from FSF License team:

Yes, it is perfectly fine to modify the whitespace from FSF licenses. Similarly, it is permissible change the line wrapping and to visually reformat the license to a format appropriate to the medium. The text of the license must remain intact and obfuscated.

The license is meant for humans to read and does not need to by identical to the original bit-by-bit or character-by-character. What is crucial is that the license remain readily recognizable and clearly legible.

I hope this is of help.

dolmen commented 9 years ago

So, what about merging? I say :+1:

kenahoo commented 9 years ago

I'm still not convinced it's a good change - how do we actually keep track of when an upstream license changes? Maybe that would factor in. But I still think people should figure out how to keep their dev tools from screwing with what are essentially data files.

miyagawa commented 9 years ago

Consider me -0 on this one. Not strongly disagreed, but not in favor of it either.

shlomif commented 9 years ago

For the record (and I should note that I'm not a core contributor to this package), I heartily support merging this pull-request. Death to trailing whitespace! :+1:

kenahoo commented 9 years ago

I'd still vote against it. Those are data files, not code files.

karenetheridge commented 9 years ago

I'd still vote against it. Those are data files, not code files.

So? The data is plaintext, and text can have superfluous content removed.

kentfredric commented 8 years ago

If in the event we can't manage to agree on whether or not we strip this trailing whitespace, if there's enough interest, I'll attempt a patch that instead adds whitespace stripping support at the emitter level instead of in every licenses source.

At least that way it defers the "does lots of stuff change" decision to whatever is consuming Software::License, not Software::License itself.

But I'd rather the files be just trimmed, because its simpler.

karenetheridge commented 8 years ago

But I'd rather the files be just trimmed, because its simpler.

me too!!

(assuming we can go ahead..) If there's a documented process for updating the licence source files, then "strip whitespace before committing" should be added to that. Even better if this was in a script so it Just Happened Automatically.

kenahoo commented 8 years ago

I'm going to bow out of the conversation. I still don't think it's a good idea but I don't think there's anything new being said.

Leont commented 8 years ago

I'm -0 on this too. It seems rather silly to me.

demerphq commented 8 years ago

On 4 Apr 2015 02:30, "Kent Fredric" notifications@github.com wrote:

it's not causing any trouble

It does when you apply a "strip all the whitespace before I commit this set of code" ( because I have a habit of accidentally adding whitespace to various files ),

I have script I hacked together for dealing with this. It uses git blame to find out which lines have been modified then does whitespace fixes to only those lines.

I will put it up somewhere if you think it would be useful.

Yves

rjbs commented 8 years ago

I vote no. It will cause a bunch of downstream noise, and the value seems to be "I am automatically trimming files and that makes these change." Solution: don't do that to these files.

karenetheridge commented 8 years ago

Here's a few compromise options:

neilb commented 8 years ago

I agree with @karenetheridge here:

  1. close this ticket
  2. as each licence requires some other change, trim the whitespace then
  3. if someone wants to open issues on the source documents, fine, but I suspect they'll reply as for point 2 above. Ie they'll not do a change just to trim the whitespace, and I suspect not at all.