r-lib / desc

Manipulate DESCRIPTION files
https://desc.r-lib.org/
Other
121 stars 23 forks source link

Whitespace noise #14

Closed lionel- closed 7 years ago

lionel- commented 8 years ago

Hello,

The write() method adds unwanted whitespace, which creates commit noise.

description-whitespace

This happens to all fields whose first element is on a new line.

gaborcsardi commented 8 years ago

Thanks, yes, this is bad.

gaborcsardi commented 8 years ago

Actually, I remember this. It is a bug in base::write.dcf It adds the space even if you tell it to keep the formatting. I was waiting to see if they want to fix it, but I guess I can work around it.

gaborcsardi commented 8 years ago

Also, strictly speaking, the space must be there, from Writing R extensions:

Fields start with an ASCII name immediately followed by a colon: the value starts after the colon and a space.

But read.dcf is fine without the space, so I guess if you already have it without the space, I can leave it like that.

krlmlr commented 8 years ago

Are there any packages on CRAN without space between colon and newline? If yes, I guess we can remove that space for now, and postpone worrying about that space until CRAN starts caring about it.

gaborcsardi commented 8 years ago

@krlmlr I think what happens is that R CMD build rewrites DESCRIPTION, and adds the space, so CRAN never actually sees this. R CMD build (read.dcf, really) itself is fine with the version without the space.

For example this: https://github.com/hadley/ggplot2/blob/59c503b8e1cacf1f9264d1e233b7a305916905d6/DESCRIPTION is reformatted as this: https://github.com/cran/ggplot2/blob/f5b44da50369016c92e3b9e31f85b4e1b30f739b/DESCRIPTION

I think the correct solution would be to add / keep the spaces, because that's what the format requires.

Trailing spaces in git commits are annoying, but you don't change these fields in DESCRIPTION often.

krlmlr commented 8 years ago

I can't find anything about the requirement for a space after a colon in read.dcf() docs nor in https://www.debian.org/doc/debian-policy/ch-controlfields.html . Instead, I find (emphasis mine):

Horizontal whitespace (spaces and tabs) may occur immediately before or after the value and is ignored there; it is conventional to put a single space after the colon.

gaborcsardi commented 8 years ago

Yes, but 'Writing R extensions` has it, see above.

krlmlr commented 8 years ago

This might be a simple omission, perhaps authors didn't consider this use case. In fact, read.dcf() also works if the value starts right after the colon. How about requesting clarification and then asking for s/a space/optional spaces or tabs/ at the appropriate part of R-exts?

gaborcsardi commented 8 years ago

I think it is pretty clear:

the value starts after the colon and a space

I know, it accepts it without a space, but that just means that the parser is forgiving. The definition of the format is clear imo.

Anyway, I really do not mind, so if devtools removes the spaces, I can remove them, too. No big deal, really.

gaborcsardi commented 8 years ago

I'll skip this for now, as it is not trivial to keep the state without the space, and the standard seems to require the space there.

hadley commented 7 years ago

One reason to reconsider this is that RStudio will strip trailing horizontal whitespace :/

hadley commented 7 years ago

Oh but it looks like I can get DESCRIPTION put on the list of files where trailing horizontal whitespace is significant.

kevinushey commented 7 years ago

It's possible that R-exts meant to say any space character; e.g. newlines or tabs following the colon would all be considered as acceptable. (I'm guessing the regex parser just checks for :[:space:] or something to that effect)

kevinushey commented 7 years ago

Regardless, I think I agree that RStudio should avoid stripping trailing whitespace here just to avoid noisy diffs.

lionel- commented 7 years ago

Apparently Debian folks think it's conventional to have this trailing whitespace. In any case it would be nice if all tools (desc, devtools) did the same thing. Could devtools depend on desc @hadley?

gaborcsardi commented 7 years ago

In principle I could not care less about that space and whether it is there or not. :)

But read.dcf ignores it if it is there, and write.dcf just puts it there, always. AFAIR.

So after parsing with read.dcf I don't know if it was there or not. I would need to have my own parser, or some pre-processing tool to see if there was a space there or not.

@kevinushey the stripping is especially annoying because write.dcf writes it back, every time. :)

So if RStudio can avoid the stripping, then how about the following?

gaborcsardi commented 7 years ago

I'll write the pre-processing code, it is actually quite easy,

I wonder what the hell I was thinking here. And also why I didn't write the code 24 days ago...... :)

lionel- commented 7 years ago

Thanks!

gaborcsardi commented 7 years ago

@hadley I put a new version of desc on CRAN yesterday, so roxygen 6 will be gentle on DESCRIPTION.

@kevinushey it would be great if RStudio could keep the trailing whitespace in DESCRIPTION. Thanks!