s-expressionists / Eclector

A portable Common Lisp reader that is highly customizable, can recover from errors and can return concrete syntax trees
https://s-expressionists.github.io/Eclector/
BSD 2-Clause "Simplified" License

Read-time GENSYM calls cause tests to fail #78

Closed Yehouda closed 1 year ago

Yehouda commented 1 year ago

Some of the tests work by reading the same source file with cl:READ and with eclector.reader:READ and comparing the results. That will fail if the source file contains a read-time evaluation that produces different results in different calls.

The source files do contain such evaluations, because they have read-time calls to GENSYM, for example https://github.com/s-expressionists/Eclector/blob/32b74ad959bd5b2773f27ecc6e62c2b167bf6626/code/base/read-char.lisp#L16, and that causes the comparisons to fail.

One possibility is not to use gensym this way. The other is to make equal* treat uninterned symbols specially, by making the atom-test test treat uninterned symbols specially. The latter is less work, the former more correct. https://github.com/s-expressionists/Eclector/blob/32b74ad959bd5b2773f27ecc6e62c2b167bf6626/test/utilities.lisp#L6
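For illustration only, here is a minimal sketch of the second option, assuming the comparison can be expressed with the standard tree-equal and a custom leaf test. The name atoms-match-p is hypothetical; this is not Eclector's actual equal* or atom-test, and unlike a full test utility it does not handle circular structure.

```lisp
;; Hypothetical leaf test: any two uninterned symbols are considered
;; equivalent, everything else falls back to EQUAL.
(defun atoms-match-p (left right)
  (if (and (symbolp left) (symbolp right)
           (null (symbol-package left))
           (null (symbol-package right)))
      t
      (equal left right)))

;; TREE-EQUAL walks the conses and applies the test to the leaves.
(defun results-equivalent-p (left right)
  (tree-equal left right :test #'atoms-match-p))
```

With this test, (6 #:G0) and (6 #:G1) compare as equivalent, while interned symbols with different names still compare as different.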

scymtym commented 1 year ago

Thank you for the report.

I'm not sure how the read-time-evaluated expressions involving gensym can produce different results. The only test which compares read objects between the host reader and the Eclector reader is read-code/host-equivalence. However, this test binds *gensym-counter* to 0 around the read calls, so I would expect corresponding gensym calls to produce identical (string=) symbol names. Can you explain in more detail what is going wrong?

Yehouda commented 1 year ago

I don't think you can rely on *gensym-counter* being the same, because any code that calls GENSYM will cause it to increment, and you don't know which other code is calling GENSYM and how many times.
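The point can be reproduced portably with a small sketch. The function name gensym-name-after is hypothetical and only stands in for "some other code called GENSYM a few times before our call":

```lisp
;; Return the name GENSYM produces after HIDDEN-CALLS unrelated GENSYM
;; calls, starting from a counter freshly bound to 0.
(defun gensym-name-after (hidden-calls)
  (let ((*gensym-counter* 0))
    (loop repeat hidden-calls do (gensym))
    (symbol-name (gensym))))

;; (gensym-name-after 0) → "G0"
;; (gensym-name-after 1) → "G1"
```

Binding *gensym-counter* fixes only the starting value; every intervening call, visible or not, shifts the name that a later call produces.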

The actual failure is a result of the reading of #= in LispWorks using GENSYM internally, which changes the count. In the example below, when it reaches the #.(gensym), *gensym-counter* has already been incremented by reading #1=, so it reads #:G1, while the Eclector reader still gets 0. I don't think that LispWorks is wrong; there is nothing that tells you who calls GENSYM and where.


CL-USER 127 > (compile (defun call-reader (reader)
                         (with-input-from-string (ff "(#1=6 #.(gensym))")
                           (let ((*gensym-counter* 0))
                             (funcall reader ff)))))
CALL-READER
NIL
NIL

CL-USER 128 > (CALL-READER 'read)
(6 #:G1)

CL-USER 129 > (CALL-READER 'ECLECTOR.reader::read)
(6 #:G0)

scymtym commented 1 year ago

> The actual failure is a result of the reading of #= in Lispworks using GENSYM internally, which changes the count.

This is the information I was missing, thanks.

I'm probably going to replace the read-time uses of gensym.

Yehouda commented 1 year ago

make-symbol is an obvious replacement, or you could write your own version of gensym:

    (defvar *my-gensym-counter* 0)
    (defun my-gensym (string)
      (make-symbol (format nil "~a~d" string (incf *my-gensym-counter*))))
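
As a rough check of the make-symbol option, the sketch below reads the same read-time expression twice with an unrelated GENSYM call in between. The helper names and the "EOF" symbol name are hypothetical, chosen only for illustration, and the code assumes the current package uses COMMON-LISP and that *read-eval* may be bound to true.

```lisp
;; Read one object from STRING and return its symbol name.
(defun read-symbol-name (string)
  (let ((*read-eval* t))
    (symbol-name (read-from-string string))))

;; Read STRING twice, calling GENSYM once in between to simulate
;; unrelated code advancing *GENSYM-COUNTER*.
(defun names-across-reads (string)
  (let ((*gensym-counter* 0))
    (list (read-symbol-name string)
          (progn (gensym) (read-symbol-name string)))))

;; (names-across-reads "#.(make-symbol \"EOF\")") → ("EOF" "EOF")
```

The same call with "#.(gensym \"EOF\")" would generally return two different names, which is the mismatch discussed above.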