s-expressionists / Eclector

A portable Common Lisp reader that is highly customizable, can recover from errors and can return concrete syntax trees
https://s-expressionists.github.io/Eclector/
BSD 2-Clause "Simplified" License
109 stars 9 forks source link

Single quote bug in parse-result #69

Open ruricolist opened 4 years ago

ruricolist commented 4 years ago

When using parse-result, the symbol quote does not appear as a child of the expression when using the single-quote reader macro.

Minimal example:

(defclass trivial-parse-result (eclector.parse-result:parse-result-client) ())

(defmethod eclector.parse-result:make-expression-result
    ((client trivial-parse-result) result children source)
  (list :result result :children children))

If you write out quote you get one result:

(eclector.parse-result:read-from-string (make-instance 'trivial-parse-result)
                                        "(quote nil)")
=> ((:RESULT 'NIL :CHILDREN
    ((:RESULT QUOTE :CHILDREN NIL) (:RESULT NIL :CHILDREN NIL))))

But if you use a single quote you get a different result:

(eclector.parse-result:read-from-string (make-instance 'trivial-parse-result)
                                        "'nil")
=> (:RESULT 'NIL :CHILDREN ((:RESULT NIL :CHILDREN NIL)))
scymtym commented 4 years ago

Eclector constructs a concrete syntax tree (CST) - nodes of the tree correspond to parts of the input (which part of the input a particular result corresponds to is indicated by the source parameter of make-expression-result).

For (quote nil), the symbol quote is part of the input and thus represented as a node in the CST.

For 'nil, the symbol quote is added by the macro function for ' and does not correspond to a part of the input. It is therefore not represented as a node of the CST.

Maybe I'm not understanding your report correctly, but at the moment, I don't see the bug.

ruricolist commented 4 years ago

My impression from the documentation was that the parse-tree package is intended to be lower-level than CSTs and to leave the question of what is constructed to the client.

It is at least very surprising that 'nil and (nil) have the same children but (quote nil) does not.

So if I understand it correctly now, "children" is not children in a tree-ish sense, but precisely and only the result of recursive reads?

scymtym commented 4 years ago

My impression from the documentation was that the parse-tree package is intended to be lower-level than CSTs and to leave the question of what is constructed to the client.

Yes, it should be up to the client, though in my understanding CSTs are the lower-level concept (compared to ASTs). The result protocol is intended to not tie Eclector to one particular result representation. concrete-syntax-tree, the library, is just one way to make a client and implement the protocol.

It is at least very surprising that 'nil and (nil) have the same children but (quote nil) does not.

So if I understand it correctly now, "children" is not children in a tree-ish sense, but precisely and only the result of recursive reads?

Currently, the "tree" is formed by recursive read calls, yes. However, custom reader macros and even just e.g. #+ make this less clear cut since results can be rearranged, augmented or discarded. For example, using your client, the result tree in

(eclector.parse-result:read-from-string (make-instance 'trivial-parse-result) "#-(and) 1 2")
(:RESULT 2 :CHILDREN
 ((:RESULT (:AND) :CHILDREN ((:RESULT :AND :CHILDREN NIL)))
  (:RESULT NIL :CHILDREN NIL)))

does not contain anything representing 1 (The nil result is a long-standing bug I plan to address later). Defining

(defmethod eclector.parse-result:make-skipped-input-result
    ((client trivial-parse-result) stream reason source)
  (list :skipped :stream stream :reason reason :source source))

changes the result to

(:RESULT 2 :CHILDREN
 ((:RESULT (:AND) :CHILDREN ((:RESULT :AND :CHILDREN NIL)))
  (:SKIPPED :STREAM #<SB-IMPL::STRING-INPUT-STREAM {1008EB2933}> :REASON
   *READ-SUPPRESS* :SOURCE (8 . 9))
  (:RESULT NIL :CHILDREN NIL)))
11
((:SKIPPED :STREAM #<SB-IMPL::STRING-INPUT-STREAM {1008EB2933}> :REASON
  (:SHARPSIGN-MINUS :AND) :SOURCE (0 . 10)))

i.e. one additional child for the skipped 1 and one "orphan" result for the skipped #-(and 1).

I'm open to (and actually plan to) changing the rules for composing the results into a tree, but I don't see a good way to add additional results that do not correspond to recursive read calls. It would probably be possible to add a method on wrap-in-quote for result-producing clients which adds this particular result, but that seems like a hack and it would make portable user-defined reader macros second-class since they could not do the same.

ruricolist commented 4 years ago

Thank you for the explanation. I no longer think the current behavior is wrong, just surprising, and it might be worth dwelling on at greater length in the documentation.

scymtym commented 4 years ago

I will try to improve the documentation, maybe after settling on a way of composing skipped and "proper" results.