Closed MattHeffron closed 2 months ago
I think there is a simple, kludgy, brute-force fix to this.
OPENSTRINGSTREAM converts a thin-string (as in your example) to fat, so that every character occupies 2 bytes. So replacing (\GETFILEPTR..) by (FOLDLO (\GETFILEPTR..) 2) should give the specified value.
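The arithmetic behind that suggestion can be sketched in portable Common Lisp. This is an illustration only, not Medley code: the helper name is hypothetical, and it assumes every character in the stream occupies exactly 2 bytes, as described above.

```lisp
;; Hypothetical helper, not part of Medley: map a byte offset in a
;; "fat" string stream (assumed 2 bytes per character) to a character index.
(defun fat-byte-offset-to-char-index (byte-offset)
  ;; FLOOR also returns a remainder; VALUES keeps only the quotient.
  (values (floor byte-offset 2)))

;; The byte position 14 from the bug report maps to character index 7.
(fat-byte-offset-to-char-index 14) ; => 7
```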
If Common Lisp had said that any of the file-reading functions should also return the character position of the first unread character, that would be much worse (UTF-8, XCCS…).
Separately, the spec for CL:READ-FROM-STRING is a little strange, saying "If the entire string was read, the position returned is either the length of the string or one greater than the length of the string." Which is it?
On Aug 24, 2024, at 10:32 PM, Matt Heffron @.***> wrote:
Describe the bug
The position value returned from CL:READ-FROM-STRING is twice the value it should be. (It is returning the byte position, not the character position.)
To Reproduce
Steps to reproduce the behavior:
1. Run full.sysout
2. In the XCL Exec, enter: (multiple-value-list (read-from-string "ABCDEF X"))
3. The value (ABCDEF 14) is returned and displayed
Expected behavior
In step 3, the value returned should be (ABCDEF 7). Note that the Common Lisp HyperSpec (CLHS) https://www.lispworks.com/documentation/HyperSpec/Body/f_rd_fro.htm page for read-from-string notes:
position---an integer greater than or equal to zero, and less than or equal to one more than the length of the string.
The 14 is not within that range. Also from the CLHS:
The secondary value, position, is the index of the first character in the bounded string that was not read.
Context (please complete the following information):
IL:MAKESYSDATE: 31-Jul-2024 02:24:38
Other info
The second returned value from read-from-string appears to be used in SEDIT-COMMANDS in extract-current-selection. I don't know why this isn't affected by this.
On 25 Aug 2024, at 17:40, rmkaplan @.***> wrote:
Separately, the spec for CL:READ-FROM-STRING is a little strange, saying "If the entire string was read, the position returned is either the length of the string or one greater than the length of the string." Which is it?
I think the spec talks about this: there can (it claims) be cases where you (might) want to simulate an extra character at the end of the string and the thing is then allowed to return the index it would if that character was there.
That strikes me as a ludicrously poor argument: even for an implementation that does this (I presume there was one) then instead of making the implementation make at worst a call to MIN with one of the arguments being the string length, every program has to be careful. That's C-level design.
--tim
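For what it's worth, the kind of care a calling program has to take can be sketched as a small wrapper. The function name is hypothetical, and as a simplification it clamps against the full string length, so it ignores any :end argument that bounds the string.

```lisp
;; Hypothetical wrapper: clamp the position returned by READ-FROM-STRING
;; so that it never exceeds the length of the string.
;; Simplification: uses the whole string's length and ignores :end.
(defun read-from-string-clamped (string &rest args)
  (multiple-value-bind (object position)
      (apply #'read-from-string string args)
    (values object (min position (length string)))))

;; Whether the implementation reports 6 or 7 after reading all of "ABCDEF",
;; the clamped position is 6.
(read-from-string-clamped "ABCDEF")
```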
The second returned value from read-from-string appears to be used in SEDIT-COMMANDS in extract-current-selection. I don't know why this isn't affected by this.
SEdit didn't have an issue with this because read-from-string was using the start value with the same interpretation as byte position, not character position. This is corrected in commit 07e858d of PR #1833.
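To illustrate why a consistent interpretation hides the problem: code that only feeds the returned position back in as :start gets the right object whenever both sides agree on the unit. A minimal sketch of that round trip in portable Common Lisp, using the string from the report (this is not the SEdit code itself):

```lisp
;; Read the first object, then resume reading at the returned position.
(let ((text "ABCDEF X"))
  (multiple-value-bind (first-object position)
      (read-from-string text)
    ;; With character positions, POSITION is 7 and the next read yields X.
    (list first-object
          position
          (read-from-string text t nil :start position))))
;; => (ABCDEF 7 X) in a conforming implementation
```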