Consider making LF (over CRLF) line endings the default for consistency with other operating systems

xPaw commented 3 years ago

Environment

Item	Value
OS, Version / Build	10.0.19042.0 (any applies)
Processor Architecture	Any
Processor Type & Model	___
Memory	___
Storage Type, free / capacity (e.g. C: SSD 128GB / 512GB)	___
Relevant apps installed	Visual Studio, Windows Terminal, PowerShell, etc.

Description

Historically, Windows is the odd one out by using CRLF over LF. For developers this is especially annoying, for example with the autocrlf option in Git.

Even Notepad was updated to correctly handle different line endings.

See https://git-scm.com/book/en/v2/Customizing-Git-Git-Configuration#_core_autocrlf

With Microsoft pushing for a developer ecosystem on Windows, I think it might be a good idea to revisit these artifacts of history and perhaps update the defaults, or at least provide some kind of option for developers to set.

https://twitter.com/thexpaw/status/1361263999468384257

Steps to reproduce

Save a file in Visual Studio or Notepad; it will use CRLF line endings.

Expected behavior

New files saved should use LF for line endings.
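Until a system-wide default exists, the choice has to be made per application. As a minimal sketch of what that looks like in one language (the helper names `write_lf` and `read_raw` and the file name are hypothetical, not anything from Windows or Visual Studio), Python's `open()` accepts `newline="\n"`, which turns off the platform's newline translation on write:

```python
import os
import tempfile


def write_lf(path, lines):
    """Write each line terminated by LF, whatever the platform default is."""
    # newline="\n" stops Python from translating "\n" into os.linesep
    # (which is "\r\n" on Windows).
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for line in lines:
            f.write(line + "\n")


def read_raw(path):
    """Return the file's bytes so the actual line endings are visible."""
    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    # Hypothetical file name in a throwaway temp directory.
    path = os.path.join(tempfile.mkdtemp(), "example.txt")
    write_lf(path, ["first", "second"])
    print(read_raw(path))  # b'first\nsecond\n'
```

Without the `newline` argument, the same code would write CRLF on Windows and LF elsewhere.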

Actual behavior

bitcrazed commented 3 years ago

Closing as wontfix for the same reasons articulated in #82, copied below for completeness:

There isn't some magical function in Windows that parses/emits CRLF as a line ending - this behavior is hard-coded into hundreds of millions of apps, tools, scripts, etc. many of which will break if CRLF is replaced with LF, again, likely crashing millions of systems, businesses, and even several nations.

As an aside, CRLF is actually something Microsoft got right. CR moves the cursor all the way to the left, LF moves the cursor down one line. Abbreviating CRLF to LF is a lossy *NIXism.

justanotheranonymoususer commented 3 years ago

this behavior is hard-coded into hundreds of millions of apps, tools, scripts, etc.

You can start by changing yours (Microsoft). It can be opt-in just like long paths or system-wide UTF-8 is opt-in. It's a legit suggestion that I don't think should be closed.

As an aside, CRLF is actually something Microsoft got right. CR moves the cursor all the way to the left, LF moves the cursor down one line. Abbreviating CRLF to LF is a lossy *NIXism.

What year is it? It might have been right when we were babies.

bitcrazed commented 3 years ago

So we change every Windows app that saves text to files to emit LF instead of CRLF and break many THOUSANDS (if not millions) of running systems that expect to see CRLF. This helps how?

orcmid commented 3 years ago

This all goes back to the way terminals once worked (like typewriters) and also to how ASCII displays worked.

It was also the default interpretation of the CR and LF control codes in original ASCII and ISO 646-1973. For ISO 646, there was accommodation of the treatment of LF as NL (newline) by agreement among the senders and recipients. There is also (in my copy of 646-1973 that I am looking at) consideration that the combined use (i.e., NL instead of CR LF) "may be restricted for international transmission on general switched telecommunication networks."

In Unicode, LF is the definitive designation, and it is indicated that it can be used as newline (NL) and also end-of-line (EOL).

So, this is a lot like big-endian, little-endian squabbling and I'm with Rich Turner on this. This is not the sort of thing that is paramount for reconsideration. It is just too late. (Rich: I think there was a time on MS-DOS that CR was emitted as NL in some apps, and that doesn't matter either -- poor what-about-ism and I am not offering it for that.)

The accommodation that repositories make to have file transfer work in the face of different preferences of the client is about as good as it gets. There are many apps that will treat a single LF as the same as CR LF, and will treat '\n' in text strings appropriately on a given client. It is nice that DIFFs can be indifferent to this also.
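As one illustration of an app treating a single LF the same as CR LF, here is a small Python sketch using the standard library's universal-newlines mode. The function name `read_lines_universal` is made up for this example; `newline=None` is the documented setting that maps `\r\n`, `\r` and `\n` to `\n` on read:

```python
import io


def read_lines_universal(raw: bytes):
    """Decode bytes and return the lines, accepting CRLF, CR or LF endings."""
    # newline=None enables universal newlines: every recognised line ending
    # is translated to "\n" before the text reaches the caller.
    stream = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8", newline=None)
    return [line.rstrip("\n") for line in stream]


if __name__ == "__main__":
    crlf = b"one\r\ntwo\r\n"
    lf = b"one\ntwo\n"
    # Both inputs yield the same list of lines.
    print(read_lines_universal(crlf) == read_lines_universal(lf))  # True
```

The reader never sees which convention the file used, which is roughly the accommodation described above.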

Now we can focus on the Unicode BOM and when it is used correctly? (not entirely sarcastic)

For the record, I recall at the Windows NT announcement event, that Alpha was going to be one of the platforms. That did not happen and for good reason. In multibyte binary values, one was little-endian (Intel) and the other was big-endian and it was a deal-breaker.

justanotheranonymoususer commented 3 years ago

and break many THOUSANDS (if not millions) of running systems that expect to see CRLF. This helps how?

I'd love to do that on my computer, and if something breaks, report it to the authors :) I do understand the realities, though.

So, this is a lot like big-endian, little-endian squabbling

I don't quite agree. There's much more hassle. Say you're parsing a file. In addition to being able to parse \n vs \r\n, what happens if you get hello\n\rworld or hello\r\n\rworld? What happens if a part of the file has one line ending and another a different one? etc. etc. And I'm sure that it causes security issues as well, identified or not.
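To make the ambiguity concrete, here is a rough Python sketch that splits those two strings under three different policies. The function names are hypothetical; `str.split` and `str.splitlines` are the standard methods (note that `splitlines` also breaks on a few other characters, such as form feed):

```python
def split_lf_only(text):
    """Unix-style parser: only LF ends a line."""
    return text.split("\n")


def split_crlf_only(text):
    """Strict Windows-style parser: only the CRLF pair ends a line."""
    return text.split("\r\n")


def split_any(text):
    """Lenient parser: CRLF, a lone CR and a lone LF all end a line."""
    return text.splitlines()


if __name__ == "__main__":
    for sample in ("hello\n\rworld", "hello\r\n\rworld"):
        print(repr(sample))
        print("  LF only:  ", split_lf_only(sample))
        print("  CRLF only:", split_crlf_only(sample))
        print("  any:      ", split_any(sample))
```

For `hello\n\rworld`, the LF-only parser leaves a stray `\r` at the start of the second line, the CRLF-only parser sees a single line, and the lenient parser reports an extra empty line in the middle. Three readers, three different answers for the same bytes.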

Endianness is fixed per OS and format, and, although a PITA as well, you know what to expect in advance.

Sadly, stuff such as HTTP and email adopted CRLF as well... It's a lost battle, that's true.
