joncampbell123 / dosbox-x

DOSBox-X fork of the DOSBox project
GNU General Public License v2.0

DOS CON device check: CR LF aka \r \n behavior. #2536

Closed joncampbell123 closed 3 years ago

joncampbell123 commented 3 years ago

Describe the bug: Verify what MS-DOS actually does with this ANSI sequence.

https://twitter.com/bastetfurry/status/1394661643561447425


bastetfurry commented 3 years ago

Still investigating this; it could be that DOSBox has another problem with the console. In the little game I am working on, I issue a cputs("\n");, which works fine in Linux and DOSBox. I'll get back to you in a second, once I have tested that.

bastetfurry commented 3 years ago

Yep, that was the problem. Not the ESC[K.

#ifdef __DOS__
    cputs("\n\r");   /* DOS console: needs both LF and CR */
#elif defined(__LINUX__)
    cputs("\n");     /* Linux terminal: LF alone is enough */
#endif

I am using Open Watcom 2 and its conio library, by the way.

joncampbell123 commented 3 years ago

Ah, yes.

MS-DOS and Windows use CR LF to start a new line. A CR only puts the cursor back at the left edge of the same line, and an LF only moves the cursor down one line. This is also true of the console window in Windows, as far as I know.

You normally want to use \r\n since that is the standard line ending in DOS text files, though \n\r happens to work fine for the console.

The Linux terminal is usually in a mode where a simple newline is sufficient (see the termios functions about that). A \n alone is also the standard line ending in Linux and other non-MS-DOS systems. It's also the reason why older versions of NOTEPAD.EXE will not properly display text files and source code from a Linux system: they expect the CR LF sequence and will not break the line otherwise.

The interesting thing is that for formatted output like printf() the C runtime will automatically convert \n to \r \n for you.

joncampbell123 commented 3 years ago

CR = \r
LF = \n

http://www.asciitable.com/

joncampbell123 commented 3 years ago

From what I've seen in the past there are three common line endings:

CR LF (0x0D 0x0A) = MS-DOS and Windows
LF (0x0A) = Linux, Unix, Mac OS X
CR (0x0D) = Very old Mac OS (e.g. System 6 from the 1980s)

Fortunately, these days you're only going to see CR LF or LF, so don't worry about the Mac OS one. :)
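To make the three styles concrete, here is a minimal sketch that tallies which line endings appear in a buffer. The names `eol_counts` and `count_eols` are made up for this illustration; they are not from DOSBox-X or any C runtime.

```c
#include <stddef.h>

/* Counts of each line-ending style found in a buffer (hypothetical type). */
struct eol_counts {
    size_t crlf;  /* CR LF pairs (0x0D 0x0A) */
    size_t lf;    /* lone LF (0x0A) */
    size_t cr;    /* lone CR (0x0D) */
};

/* Scan buf[0..len) and tally the three line-ending styles. */
static struct eol_counts count_eols(const char *buf, size_t len)
{
    struct eol_counts c = {0, 0, 0};
    size_t i = 0;

    while (i < len) {
        if (buf[i] == '\r') {
            if (i + 1 < len && buf[i + 1] == '\n') {
                c.crlf++;
                i += 2;          /* consume the pair as one line ending */
                continue;
            }
            c.cr++;              /* CR not followed by LF */
        } else if (buf[i] == '\n') {
            c.lf++;              /* LF not preceded by CR */
        }
        i++;
    }
    return c;
}
```

A file where only `crlf` is non-zero is DOS/Windows style, and one where only `lf` is non-zero is Unix style. A mix usually means the file was edited on both kinds of system.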

bastetfurry commented 3 years ago

> The interesting thing is that for formatted output like printf() the C runtime will automatically convert \n to \r \n for you.

So OW2's conio.h, which should be faster for text mode games, won't do the conversion, for one. Evil trap. And DOSBox happily accepting a single \n wasn't helping either here. :sweat_smile:

joncampbell123 commented 3 years ago

> The interesting thing is that for formatted output like printf() the C runtime will automatically convert \n to \r \n for you.
>
> So OW2's conio.h, which should be faster for text mode games, won't do the conversion for one. Evil trap. And Dosbox happily accepting a single \n wasn't helping either here. :sweat_smile:

The faster method usually forfeits the nice conversions and formatting, of course.

joncampbell123 commented 3 years ago

Looking at the Open Watcom source code, cputs() just calls putch(), which uses INT 21h AH=06h (direct console output). So, no \n to \r\n conversion.

http://www.ctyme.com/intr/rb-2558.htm
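Since cputs() does no conversion, one option is to expand the newlines yourself before printing. The following is only a sketch: `expand_lf` is a hypothetical helper, not part of Open Watcom's conio library, and it assumes you want every bare \n turned into \r\n.

```c
#include <stddef.h>

/*
 * Copy src into dst, writing CR LF for every LF that is not already
 * preceded by a CR. Returns the number of bytes written (not counting
 * the terminating NUL), or -1 if dst is too small.
 */
static int expand_lf(const char *src, char *dst, size_t cap)
{
    size_t n = 0;
    char prev = '\0';

    if (cap == 0)
        return -1;

    for (; *src != '\0'; src++) {
        /* A bare LF needs two output bytes; everything else needs one. */
        size_t need = (*src == '\n' && prev != '\r') ? 2 : 1;

        if (n + need + 1 > cap)   /* +1 leaves room for the NUL */
            return -1;
        if (need == 2)
            dst[n++] = '\r';
        dst[n++] = *src;
        prev = *src;
    }
    dst[n] = '\0';
    return (int)n;
}
```

On DOS you would then pass the expanded buffer to cputs(). On Linux you can skip the expansion and print the original string.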

joncampbell123 commented 3 years ago

By the way, for reasons possibly related to CR LF vs LF formatting, and code portability, MS-DOS C runtime libraries have another evil trap.

If you use functions like open/close/read/write/lseek, the C runtime by default will convert LF to CR LF on write() and CR LF to LF on read(). It's an evil trap if you're writing code to handle binary data on MS-DOS, because you'll wonder why the hell the binary data is getting corrupted. You turn that off by passing O_BINARY to the open() function.

That includes Open Watcom C, not just Microsoft C.

I think Microsoft's rationale was that it made compiling Unix code on MS-DOS easier since Unix code was written to expect only \n when parsing text and the \r\n tended to cause problems. So unless you indicate that it's binary data, the C runtime assumes you're processing text and does the conversion.
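As a rough illustration of the O_BINARY point, here is a sketch of a binary-safe write and read. The helper names `write_binary` and `read_binary` are made up for this example. The `#ifndef O_BINARY` fallback is an assumption so the same code compiles on systems such as Linux where the flag does not exist, and which header declares open() varies by compiler, so treat the include logic as a guess to adapt.

```c
#include <fcntl.h>
#include <sys/stat.h>
#if defined(_WIN32) || defined(__DOS__) || defined(__MSDOS__)
#include <io.h>       /* open/read/write/close on many DOS/Windows compilers */
#else
#include <unistd.h>   /* open/read/write/close on POSIX */
#endif

#ifndef O_BINARY
#define O_BINARY 0    /* no text/binary distinction on this system */
#endif

/* Write len bytes to path without any LF -> CR LF translation. */
static int write_binary(const char *path, const void *data, unsigned len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644);
    long n;

    if (fd < 0)
        return -1;
    n = (long)write(fd, data, len);
    close(fd);
    return (n == (long)len) ? 0 : -1;
}

/* Read up to cap bytes from path without any CR LF -> LF translation.
 * Returns the number of bytes read, or -1 on error. */
static long read_binary(const char *path, void *data, unsigned cap)
{
    int fd = open(path, O_RDONLY | O_BINARY);
    long n;

    if (fd < 0)
        return -1;
    n = (long)read(fd, data, cap);
    close(fd);
    return n;
}
```

Without O_BINARY on a DOS C runtime, a payload such as 0x01 0x0A 0x0D 0x0A would come back with a different length, which is exactly the corruption described above.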

joncampbell123 commented 3 years ago

The CON driver uses INT 10h, which in turn processes \r and \n.

\r is supposed to bring the cursor to column 1 (all the way to the left) and \n is supposed to move one line down. It appears the code already does this. In fact, there is a comment there from someone who noticed that treating \n as one command to move down and to the left breaks an old chess game. So DOSBox-X is already emulating \r and \n properly.
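The cursor rules described above can be modeled in a few lines. This is a toy model, not the DOSBox-X CON or INT 10h code: the `cursor` struct and the `con_put`/`con_puts` names are invented for illustration, coordinates are zero-based (so "column 1" on screen is `col == 0` here), and scrolling is left out.

```c
/* Hypothetical text cursor; row and col are zero-based. */
struct cursor {
    int row;
    int col;
};

/* Apply one character to the cursor on a cols x rows screen. */
static void con_put(struct cursor *c, char ch, int cols, int rows)
{
    switch (ch) {
    case '\r':                      /* CR: back to the left edge, same row */
        c->col = 0;
        break;
    case '\n':                      /* LF: down one row, column unchanged */
        if (c->row < rows - 1)
            c->row++;               /* scrolling omitted in this sketch */
        break;
    default:                        /* printable: advance, wrap at the edge */
        c->col++;
        if (c->col >= cols) {
            c->col = 0;
            if (c->row < rows - 1)
                c->row++;
        }
        break;
    }
}

/* Apply every character of a NUL-terminated string. */
static void con_puts(struct cursor *c, const char *s, int cols, int rows)
{
    for (; *s != '\0'; s++)
        con_put(c, *s, cols, rows);
}
```

In this model, printing "abc\n" leaves the cursor one row down but still at column 3, which is the staircase effect you see when the \r is missing.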