ashinn / chibi-scheme

Official chibi-scheme repository

read-line doesn't recognise \r as end of line #873

Open Oxyd opened 1 year ago

Oxyd commented 1 year ago
(import (scheme base) (scheme read) (scheme write) (scheme file))

(write (read-line (open-input-file "foo.txt")))

Then:

% echo "foo\rbar" >foo.txt
% ./tools/chibi-run testcase.scm
"foo\rbar"

R7RS says about read-line, in part: “For the purpose of this procedure, an end of line consists of either a linefeed character, a carriage return character, or a sequence of a carriage return character followed by a linefeed character.” Which means that read-line is required to recognise \r as end of line, which means the correct output is just "foo".

This is on Linux. I suspect the behaviour might be different on different platforms – which it shouldn't be.

ashinn commented 1 year ago

This is not platform dependent. Chibi reads until \n, then discards any trailing \r. I can't recall why we chose to support a lone \r - pre-BSD MacOS doesn't seem worth supporting. I doubt Chibi can even compile on it.

Oxyd commented 1 year ago

I agree that recognising lone \r is a strange choice. Nevertheless, I just tested with Kawa, CHICKEN and Gauche and they all correctly recognise \r as end of line, so Chibi is an outlier here.

APIPLM commented 1 year ago

One question: in the REPL, (read-line (open-input-file "foo.txt")) and (read-line (open-input-string "foo\rbar")) give different output. The first gives "foo\rbar" and the second gives "foo".

APIPLM commented 1 year ago

The content of the foo.txt file is foo\rbar

APIPLM commented 1 year ago

Yes, I get the same result in the CHICKEN REPL. The only difference is with (read-line (open-input-file "foo.txt")) in CHICKEN: the \r is recognised and has to be escaped, so the output is "foo\\rbar".

APIPLM commented 1 year ago

MIT/GNU Scheme is different: (read-line (open-input-file "foo.txt")) and (read-line (open-input-string "foo\rbar")) give the same output, which is ;Value: "foo\rbar"

lassik commented 1 year ago

Adding a survey to https://github.com/schemedoc/surveys/tree/master/surveys would be appreciated.

APIPLM commented 1 year ago

Yes. But it seems there is only one .md file for this survey, and it does not have much content. I will raise an issue there.

ashinn commented 1 year ago

read-line on FILE* backed ports is optimized to use fgets, whereas the pure Scheme version does handle \r as you expect.
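
For illustration, here is a minimal sketch of a character-by-character read-line that follows the R7RS end-of-line rule quoted above (linefeed, carriage return, or carriage return followed by linefeed). The name read-line/r7rs is hypothetical, and this is not Chibi's actual pure Scheme implementation, only one way such a procedure might look in portable R7RS:

```scheme
(import (scheme base) (scheme write))

;; Hypothetical helper: reads one line from PORT, treating \n, \r and
;; \r\n as end of line. Returns an eof-object if no characters remain.
(define (read-line/r7rs port)
  (if (eof-object? (peek-char port))
      (read-char port)
      (let loop ((chars '()))
        (let ((c (read-char port)))
          (cond
           ;; End of input or a linefeed terminates the line.
           ((or (eof-object? c) (char=? c #\newline))
            (list->string (reverse chars)))
           ;; A carriage return terminates the line; if a linefeed
           ;; follows immediately, consume it as part of the same EOL.
           ((char=? c #\return)
            (let ((next (peek-char port)))
              (when (and (char? next) (char=? next #\newline))
                (read-char port)))
            (list->string (reverse chars)))
           (else (loop (cons c chars))))))))

(define p (open-input-string "foo\rbar\r\nbaz\nqux"))
(write (read-line/r7rs p)) (newline) ; "foo"
(write (read-line/r7rs p)) (newline) ; "bar"
(write (read-line/r7rs p)) (newline) ; "baz"
(write (read-line/r7rs p)) (newline) ; "qux"
```

The peek-char after a carriage return is what an fgets-based fast path cannot do, since fgets only stops at \n.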

APIPLM commented 1 year ago

Thanks. It is the commit https://github.com/ashinn/chibi-scheme/commit/6615a746096274e0f6cdf27912563599ed613c49.

But somehow I feel that read-line behaves inconsistently here, I mean when running (read-line (open-input-file "foo.txt")) and (read-line (open-input-string "foo\rbar")) in the REPL. With (open-input-file "foo.txt") the reader reads byte by byte, and with (open-input-string "foo\rbar") the reader reads character by character.

APIPLM commented 1 year ago

One more point: running the lines below in the REPL.

(read (open-input-string "foo\ee")) The output is fooee
(read (open-input-string "foo\\ee")) The output is fooee
(read (open-input-string "foo\\\ee")) The output is fooee
(read (open-input-string "foo\\\\ee")) The output is |foo\ee|

lassik commented 1 year ago

Those are correct (except that |foo\ee| is |foo\\ee|):

> 'fooee
fooee
> 'foo\ee
fooee
> 'foo\ee
fooee
> 'foo\\ee
|foo\\ee|
> (read (open-input-string "foo\\\\ee"))
|foo\\ee|

lassik commented 1 year ago

Note that "\e" is the same as "e" in Chibi, so the \ is not seen by read at all.