cisco / ChezScheme

Chez Scheme
Apache License 2.0
6.95k stars 982 forks source link

Confusing difference between '(a b c) and (list 'a 'b 'c) [question] #674

Closed LambdusCadadrus closed 1 year ago

LambdusCadadrus commented 1 year ago

Hello! I am learning Scheme with TSPL running Chez Scheme 9.5.8 on Linux. All the code below is entered into an otherwise empty file and run with #!/usr/bin/scheme --script

Learning about lists, I ran into confusing behavior.

(let ((x (list 'a 'b 'c)))
    (display (list? x))     ; #t
    (set-cdr! (cddr x) x)
    (display (list? x)))    ; #f

The above works as expected, displaying #t#f. x is bound to a proper list and after set-cdr! it becomes a cyclical list which is not a proper list anymore so list? returns #f.

But now if I run the following:

(let ((x '(a b c)))
    (display (list? x))    ; #t
    (set-cdr! (cddr x) x)
    (display (list? x)))   ; #t - why?

the output is #t#t. I cannot explain the second #t. I would expect the same output as before but now somehow after making x a cyclical list, list? still returns #t.

Below is my somewhat limited attempt to gain more understanding:

(let ((x '(a b c))
      (y (list 'a 'b 'c)))
    (display (equal? x y))    ; #t
    (set-cdr! (cddr x) x)
    (set-cdr! (cddr y) y)
    (display (equal? x x))    ; #t
    (display (equal? y y))    ; #t
    (display (equal? x y)))   ; #f - not sure about this

The output now is #t#t#t#f. The initial lists are equal? as expected because they have the same structure and the same elements. But after making both of them cyclical the same way I would expect them to still be equal? which is not the case. Displaying the modified lists gives the same infinite output for each as I would expect. Same structure, same elements, same behavior.

What is the difference between using '(a b c) and (list 'a 'b 'c) in this context? Is this specific to Chez Scheme, expected in all compliant Scheme implementations, or am I hitting unspecified behavior without realizing it? What approach can I take to "debug" such differences in the future? Maybe a way to inspect memory layout or cons structures? Is there something in the book, R6RS or Chez User's Guide that I missed or did not understand?

Thank you!

LambdusCadadrus commented 1 year ago

There is the book's Section 6.1:

Quoted and self-evaluating constants are immutable. That is, programs should not alter a constant via set-car!, string-set!, etc., and implementations are permitted to raise an exception with condition type &assertion if such an alteration is attempted. If an attempt to alter an immutable object is undetected, the behavior of the program is unspecified.

There is also issue #149

I suppose these should serve as an explanation.

I would like to understand this better however, especially the difference between the two ways of creating a list. How and why are the two lists actually different? Especially when bound to a local let variable? This still doesn't make complete sense. My first intuition is that there must be a difference between:

(set-car! '(a b c) 5)  ; I see an immutable quoted constant.

(let ((x '(a b c)))
    (set-car! x 5))  ; I have doubts that x is an immutable quoted constant.

(define myproc
    (lambda (x)
        (set-car! x 5))) ; How do I know if x is an immutable quoted constant?
(myproc '(a b c))

I might be drifting further away from being Chez-specific and I apologize for that. I would appreciate any insight.

akeep commented 1 year ago

Essentially what is going on here is that Chez Scheme is treating '(a b c) as a constant in the optimizer, since it is specified as constant in the R6RS specification (as you noted from Section 6.1).

Since it is an immutable, compile-time constant, Chez will preform constant folding and constant propagation. You can see this, if you take your example:

(let ((x '(a b c))
      (y (list 'a 'b 'c)))
    (display (equal? x y))    ; #t
    (set-cdr! (cddr x) x)
    (set-cdr! (cddr y) y)
    (display (equal? x x))    ; #t
    (display (equal? y y))    ; #t
    (display (equal? x y)))   ; #f - not sure about this

And run it through Chez Scheme's expand/optimize function, you can see how the '(a b c) has been inlined, equality has been judged on it and, in part because of some of the inlining,

(let ([y (#2%list 'a 'b 'c)])
  (#2%display (#2%equal? '(a b c) y))
  (#2%set-cdr! '(c) '(a b c))
  (#2%set-cdr! (#2%cddr y) y)
  (#2%display #t)
  (#2%display #t)
  (#2%display (#2%equal? '(a b c) y)))
LambdusCadadrus commented 1 year ago

@akeep this is great, thank you very much! Your explanation definitely makes the situation more transparent. expand/optimize looks like something I was looking for here as well.

R6RS 5.10. Storage model seems to also give me another hint:

... An object fetched from a location, by a variable reference or by a procedure such as car, vector-ref, or string-ref, is equivalent in the sense of eqv? (section 11.5) to the object last stored in the location before the fetch. ... An attempt to store a new value into a location referred to by an im- mutable object should raise an exception with condition type &assertion.

This resolves my confusion about the let to a degree. Acquiring an intuition about Scheme's memory model is definitely an adjustment after C.

I am curious now about the design decision to not raise an exception - was there any specific reason to interpret the report's "should" in that direction?

olopierpa commented 1 year ago

On Sat, Dec 17, 2022 at 11:18 PM LambdusCadadrus @.***> wrote:

(define myproc (lambda (x) (set-car! x 5))) ; How do I know if x is an immutable quoted constant?

You should not modify any data structure unless you KNOW that it is modifiable. In this case, you must call this function only on a mutable pair.

(myproc '(a b c))

so this is an error.

I might be drifting further away from being Chez-specific and I apologize for that. I would appreciate any insight.

All this is not Chez specific, but in other implementations you may get different results in the erroneous cases, because in these cases the results are undefined.

akeep commented 1 year ago

Yeah @olopierpa has it exactly right, it is about representation. Both the constant list created as '(a b c) and the regular list created with (list 'a 'b 'c) have the same internal representation. set-car! and set-cdr! cannot, in general, determine if the pair it is operating over was created as a constant or a regular pair.

LambdusCadadrus commented 1 year ago

@akeep @olopierpa thank you again, very insightful.