phantomics / april

The APL programming language (a subset thereof) compiling to Common Lisp.
Apache License 2.0

Problem(s) with parentheses and user defined operators #279

Closed. sjl closed this 6 months ago

sjl commented 7 months ago

I'm new to APL and have been going through an APL tutorial with April. I ran into a snag when the tutorial got to the point of defining our own operators, and I'm not sure if it's a bug in April or my own understanding.

First I define an operator and a couple of helper functions:

(april-f "⍝ Apply f if and only if g ⍵ is true.
          ⍝ In lisp terms: (if (g x) (f x) x)
          _if_ ← { (⍺⍺⍣(⍵⍵ ⍵))⍵ }
          Pos ← { ⍵ > 0 }
          Inc ← { 1 + ⍵ }")

That works as expected:

(april-f "Inc _if_ Pos 0") ; => 0
(april-f "Inc _if_ Pos 100") ; => 101
(april-f "Inc _if_ Pos ¯100") ; => ¯100
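For readers more at home in Lisp, the intent of the operator can be sketched in plain Common Lisp. This is only an illustration of the "(if (g x) (f x) x)" reading given in the comment above, not how April compiles the operator; the names make-if, pos and inc are hypothetical.

```lisp
;; Hypothetical plain-CL analogue of the _if_ operator.
;; It mirrors the comment "(if (g x) (f x) x)"; it is not April's implementation.
(defun make-if (f g)
  "Return a function that applies F to X only when (G X) is true."
  (lambda (x)
    (if (funcall g x) (funcall f x) x)))

(defun pos (x) (> x 0))   ; counterpart of Pos ← { ⍵ > 0 }
(defun inc (x) (+ 1 x))   ; counterpart of Inc ← { 1 + ⍵ }

(funcall (make-if #'inc #'pos) 0)    ; => 0
(funcall (make-if #'inc #'pos) 100)  ; => 101
(funcall (make-if #'inc #'pos) -100) ; => -100
```

The APL version gets the same effect by applying ⍺⍺ a number of times equal to the 0-or-1 result of ⍵⍵ ⍵, via the power operator ⍣.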

I can parenthesize the operator without issue:

(april-f "(Inc _if_ Pos) 100") ; => 101

Then I wanted to try using bind to avoid having to define helper functions. That works fine for the left side, and I can parenthesize it for clarity if I want:

(april-f "+∘1 _if_ Pos 100") ; => 101
(april-f "(+∘1) _if_ Pos 100") ; => 101

Using bind on the right like this without parentheses won't work, which makes sense:

(april-f "(+∘1) _if_ >∘0 100")
; =>
#<FUNCTION (LAMBDA (APRIL::OMEGA &OPTIONAL APRIL::ALPHA APRIL::ENVIRONMENT APRIL::BLANK)
             :IN
             APRIL::OPERATE-BESIDE) {10583691EB}>

So let's parenthesize the right:

(april-f "(+∘1) _if_ (>∘0) 100")
; =>
#<FUNCTION (LABELS APRIL::∇SELF :IN APRIL::∇OSELF) {105846156B}>

Hmm, that's not what I expected. Let's parenthesize the entire thing just to be explicit and safe (I'm a Lisp person, I love parens, it's fine):

(april-f "((+∘1) _if_ (>∘0)) 100")
; =>
(L.0)
NIL

That's also not what I expect. I tried to check whether this is just me not understanding something about APL by running these examples on https://tryapl.org and they seem to return what I expect:

      _if_ ← { (⍺⍺⍣(⍵⍵ ⍵))⍵ }
      Pos ← { ⍵ > 0 }
      Inc ← { 1 + ⍵ }

      Inc _if_ Pos 0
0
      Inc _if_ Pos 100
101
      Inc _if_ Pos ¯100
¯100
      (Inc _if_ Pos) 100
101

      +∘1 _if_ Pos 100
101
      (+∘1) _if_ Pos 100
101
      (+∘1) _if_ >∘0 100
  +∘ 1  ∇_if_ > ∘ 0 100
      (+∘1) _if_ (>∘0) 100
101
      ((+∘1) _if_ (>∘0)) 100
101
      ((+∘1) _if_ Pos) 100
101

I also get an error with each (¨), but only when I parenthesize both the right operand and the entire expression at the same time:

; Parens around entire thing are ok:
(april-f "((+∘1) _if_ Pos)¨   5 0 ¯5 9 ¯9") ; => #(6 0 -5 10 -9)

; Parens around the right side with each are fine:
(april-f "(+∘1) _if_ (>∘0)¨   5 0 ¯5 9 ¯9") ; => #(6 0 -5 10 -9)

; But can't do both?
(april-f "((+∘1) _if_ (>∘0))¨ 5 0 ¯5 9 ¯9")
; =>
No function found to the left of lateral operator ¨.
   [Condition of type SIMPLE-ERROR]

And again, all of those work as I expect at tryapl:

      ((+∘1) _if_ Pos)¨   5 0 ¯5 9 ¯9
6 0 ¯5 10 ¯9
      (+∘1) _if_ (>∘0)¨   5 0 ¯5 9 ¯9
6 0 ¯5 10 ¯9
      ((+∘1) _if_ (>∘0))¨ 5 0 ¯5 9 ¯9
6 0 ¯5 10 ¯9

And just as a bonus, I noticed when trying to debug this that including more than two of these problematic calls in a single april-f call gives me an even stranger error:

(april-f "(Inc _if_ (>∘0)) 1
          (Inc _if_ (>∘0)) 1")
; =>
(L.0)
NIL

(april-f "(Inc _if_ (>∘0)) 1
          (Inc _if_ (>∘0)) 1
          (Inc _if_ (>∘0)) 1")
; =>
Execution of a form compiled with errors.
Form:
  (TAGBODY NIL NIL)
Compile-time error:
  The tag NIL appears more than once in a tagbody.
   [Condition of type SB-INT:COMPILED-PROGRAM-ERROR]

Like I said, I'm new to APL, so apologies if any of this is me misunderstanding something and things are actually working as intended.

phantomics commented 7 months ago

Good job, you've found some bugs in the compiler. You can get some better understanding of what's going on by entering:

* (april-f (with (:compile-only)) "(+∘1) _if_ (>∘0) 100")
(IN-APRIL-WORKSPACE COMMON
  (LET ((OUTPUT-STREAM *STANDARD-OUTPUT*))
    (DECLARE (IGNORABLE OUTPUT-STREAM))
    (SYMBOL-MACROLET ((INDEX-ORIGIN ⊑*INDEX-ORIGIN*)
                      (PRINT-PRECISION ⊑*PRINT-PRECISION*)
                      (COMPARISON-TOLERANCE ⊑*COMPARISON-TOLERANCE*)
                      (DIVISION-METHOD ⊑*DIVISION-METHOD*)
                      (RNGS ⊑*RNGS*))
      (A-OUT
       (A-COMP :OP ⊑|_if_|
               (A-COMP ∘ OPERATE-BESIDE (SUB-LEX 1) (SUB-LEX (APL-FN-S +)))
               (A-CALL
                (A-COMP ∘ OPERATE-BESIDE (SUB-LEX 0)
                        (SUB-LEX (APL-FN-S > COMPARISON-TOLERANCE)))
                100))
       :PRINT-PRECISION PRINT-PRECISION :PRINT-TO OUTPUT-STREAM))))

The (:compile-only) option will show you the code April generates. In this case, it's treating the entire (>∘0) 100 expression as the right operand to _if_. This reflects a problem with the parser; it should understand that the right operand is a function because _if_ expresses the operand as ⍵⍵.

For your next example:

(april-f (with (:compile-only)) "((+∘1) _if_ (>∘0)) 100")
(IN-APRIL-WORKSPACE COMMON
  (LET ((OUTPUT-STREAM *STANDARD-OUTPUT*))
    (DECLARE (IGNORABLE OUTPUT-STREAM))
    (SYMBOL-MACROLET ((INDEX-ORIGIN ⊑*INDEX-ORIGIN*)
                      (PRINT-PRECISION ⊑*PRINT-PRECISION*)
                      (COMPARISON-TOLERANCE ⊑*COMPARISON-TOLERANCE*)
                      (DIVISION-METHOD ⊑*DIVISION-METHOD*)
                      (RNGS ⊑*RNGS*))
      (A-OUT NIL :PRINT-PRECISION PRINT-PRECISION :PRINT-TO OUTPUT-STREAM))))

You're running into another compiler bug that causes nothing at all to be output. Clearly something is getting dropped at some iteration of the parser. The (L.0) is April's way of representing nil. April uses CL lists internally to model namespaces, in the form of plists. You can also pass Lisp lists into April, although it can't do much with them, and their printed representation is (L.n) where n is the length of the list. Hence (L.0) is a list of length 0, which is nil.

phantomics commented 7 months ago

Next:

* (april-f (with (:compile-only)) "(+∘1) _if_ (>∘0)¨   5 0 ¯5 9 ¯9") ; => #(6 0 -5 10 -9)
(IN-APRIL-WORKSPACE COMMON
  (LET ((OUTPUT-STREAM *STANDARD-OUTPUT*))
    (DECLARE (IGNORABLE OUTPUT-STREAM))
    (SYMBOL-MACROLET ((INDEX-ORIGIN ⊑*INDEX-ORIGIN*)
                      (PRINT-PRECISION ⊑*PRINT-PRECISION*)
                      (COMPARISON-TOLERANCE ⊑*COMPARISON-TOLERANCE*)
                      (DIVISION-METHOD ⊑*DIVISION-METHOD*)
                      (RNGS ⊑*RNGS*))
      (A-OUT
       (A-CALL
        (A-COMP |¨| OPERATE-EACH
                (SUB-LEX
                 (A-COMP :OP ⊑|_if_|
                         (A-COMP ∘ OPERATE-BESIDE (SUB-LEX 1)
                                 (SUB-LEX (APL-FN-S +)))
                         (A-COMP ∘ OPERATE-BESIDE (SUB-LEX 0)
                                 (SUB-LEX
                                  (APL-FN-S > COMPARISON-TOLERANCE))))))
        (AVEC 5 0 -5 9 -9))
       :PRINT-PRECISION PRINT-PRECISION :PRINT-TO OUTPUT-STREAM))))

This output looks right. However:

* (april-f (with (:compile-only)) "((+∘1) _if_ (>∘0))¨ 5 0 ¯5 9 ¯9")
; Evaluation aborted on #<SIMPLE-ERROR "No function found to the left of lateral operator ~a." {1029B9CA73}>.

Something is going wrong during compilation. We can see what's happening upstream of the compiler by entering:

* (april-f (with (:print-tokens)) "((+∘1) _if_ (>∘0))¨ 5 0 ¯5 9 ¯9")
(-9 9 -5 0 5 (:OP :LATERAL #\DIAERESIS)
 ((0 (:OP :PIVOTAL #\RING_OPERATOR) (:FN #\>)) |_if_|
  (1 (:OP :PIVOTAL #\RING_OPERATOR) (:FN #\+))))
; Evaluation aborted on #<SIMPLE-ERROR "No function found to the left of lateral operator ~a." {102957F593}>.

The (:print-tokens) option shows us the tokens generated by the lexer. They are determined one by one and pushed into a list, then sent to the compiler in reverse order, which is convenient for APL's right-to-left evaluation. The tokens look fine, so it appears that something is going wrong in the iteration of the compiler loop that processes the parenthesized expression. It should expect a function inside the parentheses, but it doesn't, and so it tries to pass a value as the left operand of ¨. It's something involving the passage of metadata across compiler loop iterations, so I'll find out where the missed connection is.
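As a rough illustration of why pushing tokens onto a list yields them in reverse order, here is a toy sketch in plain Common Lisp. It is not April's lexer; toy-lex is a hypothetical name and it treats every non-space character as one token.

```lisp
;; Hypothetical toy lexer: one token per character, spaces skipped.
;; PUSH adds to the front of the list, so the last token read ends up first.
(defun toy-lex (string)
  "Return the non-space characters of STRING, last one first."
  (let ((tokens '()))
    (loop for ch across string
          unless (char= ch #\Space)
            do (push ch tokens))
    tokens))

(toy-lex "a b c") ; => (#\c #\b #\a)
```

A consumer walking this list from the head sees the rightmost token first, which matches the right-to-left order in the token dump above, where ¯9 appears first as -9.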

Finally:

* (april-f (with (:compile-only))
           "(Inc _if_ (>∘0)) 1
            (Inc _if_ (>∘0)) 1
            (Inc _if_ (>∘0)) 1")
(IN-APRIL-WORKSPACE COMMON
  (LET ((OUTPUT-STREAM *STANDARD-OUTPUT*))
    (DECLARE (IGNORABLE OUTPUT-STREAM))
    (SYMBOL-MACROLET ((INDEX-ORIGIN ⊑*INDEX-ORIGIN*)
                      (PRINT-PRECISION ⊑*PRINT-PRECISION*)
                      (COMPARISON-TOLERANCE ⊑*COMPARISON-TOLERANCE*)
                      (DIVISION-METHOD ⊑*DIVISION-METHOD*)
                      (RNGS ⊑*RNGS*))
      (TAGBODY NIL NIL)
      (A-OUT NIL :PRINT-PRECISION PRINT-PRECISION :PRINT-TO OUTPUT-STREAM))))

This is another bug. April generates tagbody forms in the course of implementing branch statements with →. For example:

* (april (with (:compile-only)) "x←1 ⋄ →1+1 ⋄ x×←11 ⋄ 1→⎕ ⋄ x×←3 ⋄ 2→⎕ ⋄ x×←5 ⋄ 3→⎕ ⋄ x×←7")
(IN-APRIL-WORKSPACE COMMON
  (LET ((OUTPUT-STREAM *STANDARD-OUTPUT*))
    (DECLARE (IGNORABLE OUTPUT-STREAM))
    (SYMBOL-MACROLET ((INDEX-ORIGIN ⊑*INDEX-ORIGIN*)
                      (PRINT-PRECISION ⊑*PRINT-PRECISION*)
                      (COMPARISON-TOLERANCE ⊑*COMPARISON-TOLERANCE*)
                      (DIVISION-METHOD ⊑*DIVISION-METHOD*)
                      (RNGS ⊑*RNGS*))
      (TAGBODY
        (A-SET ⊑|x| 1)
        (LET ((#:A10666 (VRENDER (A-CALL (APL-FN-S +) 1 1))))
          (COND ((= #:A10666 3) (GO #:AB10661)) ((= #:A10666 2) (GO #:AB10654))
                ((= #:A10666 1) (GO #:AB10647))))
        (LET ((#:G10643 (A-CALL (APL-FN-S ×) 11 ⊑|x|)))
          (WHEN (BOUNDP '⊑|x|) (SETF (SYMBOL-VALUE '⊑|x|) #:G10643))
          (SETQ ⊑|x| #:G10643))
       #:AB10647
        (LET ((#:G10650 (A-CALL (APL-FN-S ×) 3 ⊑|x|)))
          (WHEN (BOUNDP '⊑|x|) (SETF (SYMBOL-VALUE '⊑|x|) #:G10650))
          (SETQ ⊑|x| #:G10650))
       #:AB10654
        (LET ((#:G10657 (A-CALL (APL-FN-S ×) 5 ⊑|x|)))
          (WHEN (BOUNDP '⊑|x|) (SETF (SYMBOL-VALUE '⊑|x|) #:G10657))
          (SETQ ⊑|x| #:G10657))
       #:AB10661)
      (A-OUT
       (LET ((#:G10664 (A-CALL (APL-FN-S ×) 7 ⊑|x|)))
         (WHEN (BOUNDP '⊑|x|) (SETF (SYMBOL-VALUE '⊑|x|) #:G10664))
         (SETQ ⊑|x| #:G10664))
       :PRINT-PRECISION PRINT-PRECISION))))

But something is clearly going wrong. Time to check that out as well.
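For anyone unfamiliar with tagbody, a minimal stand-alone sketch of the jump mechanism follows. It is written by hand rather than generated by April, and branch-demo, label-1 and label-2 are hypothetical names. Each tag in a tagbody has to be unique, which is why SBCL rejects (TAGBODY NIL NIL) with "The tag NIL appears more than once in a tagbody."

```lisp
;; Hypothetical illustration of TAGBODY/GO; the function and tag names are made up.
(defun branch-demo (target)
  "Start with x = 1, jump to the label selected by TARGET (1 or 2), return the final x."
  (let ((x 1))
    (tagbody
       (cond ((= target 1) (go label-1))
             ((= target 2) (go label-2)))
       (setf x (* x 11))   ; skipped whenever a jump is taken
     label-1
       (setf x (* x 3))
     label-2
       (setf x (* x 5)))
    x))

(branch-demo 1) ; => 15
(branch-demo 2) ; => 5
(branch-demo 0) ; => 165  (no jump, so every statement runs)
```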

Thanks for your detailed account of the problems. Comparing April's output to TryAPL is the right way to proceed in cases like this - as long as you're not examining exclusive April features like the k-style $[ ; ; ] if-statement syntax.

phantomics commented 6 months ago

Update: all these problems should be fixed. They had a common cause: you were using a pivotal operator (an operator taking two operands, like ∘ or @) and the right operand was another pivotal composition that had a value as its right operand: (>∘0). This was affected by a bug stemming from the fact that pivotal operators with a value as right operand are an unusual case in APL compilation. They return a function, but the first thing the compiler encounters is a value, and I didn't account for this in the part of the compiler that resolves functions. That has now been fixed and all the examples you posted should work properly. Give it a try and let me know if you encounter any other issues.
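To see why a composition such as (>∘0) is a function even though its right operand is a value, here is a rough Common Lisp analogue. It only models the binding behaviour and says nothing about how April resolves functions internally; bind-right is a hypothetical helper.

```lisp
;; Hypothetical analogue of binding a value as the right argument, as in (>∘0) or (+∘1).
(defun bind-right (fn value)
  "Return a one-argument function that calls FN with VALUE fixed as its second argument."
  (lambda (x) (funcall fn x value)))

(funcall (bind-right #'> 0) 100) ; => T    (100 > 0)
(funcall (bind-right #'+ 1) 100) ; => 101  (100 + 1)
```

Read right to left, as the compiler does, the first thing seen in (>∘0) is the value 0, yet the whole form has to be treated as a function like the closure above.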

justin2004 commented 6 months ago

love the postmortems on bugs, @phantomics !

sjl commented 6 months ago

Thanks! Most of those have been fixed, but this one is still broken:

(april-f "_if_ ← { (⍺⍺⍣(⍵⍵ ⍵))⍵ }
          (+∘1) _if_ (>∘0) 100")
#<FUNCTION (LABELS APRIL::∇SELF :IN APRIL::∇OSELF) {104E93F65B}>

TryAPL returns what I expect for that:

  _if_ ← { (⍺⍺⍣(⍵⍵ ⍵))⍵ }
  (+∘1) _if_ (>∘0) 100
101

phantomics commented 6 months ago

That one had a different cause; (:compile-only) tells the tale:

* (april (with (:compile-only)) "(+∘1) _if_ (>∘0) 100")
(IN-APRIL-WORKSPACE COMMON
  (LET ((OUTPUT-STREAM *STANDARD-OUTPUT*))
    (DECLARE (IGNORABLE OUTPUT-STREAM))
    (SYMBOL-MACROLET ((INDEX-ORIGIN ⊑*INDEX-ORIGIN*)
                      (PRINT-PRECISION ⊑*PRINT-PRECISION*)
                      (COMPARISON-TOLERANCE ⊑*COMPARISON-TOLERANCE*)
                      (DIVISION-METHOD ⊑*DIVISION-METHOD*)
                      (RNGS ⊑*RNGS*))
      (A-OUT
       (A-COMP :OP ⊑|_if_|
               (A-COMP ∘ OPERATE-BESIDE (SUB-LEX 1) (SUB-LEX (APL-FN-S +)))
               (A-CALL
                (A-COMP ∘ OPERATE-BESIDE (SUB-LEX 0)
                        (SUB-LEX (APL-FN-S > COMPARISON-TOLERANCE)))
                100))
       :PRINT-PRECISION PRINT-PRECISION :PRINT-TO OUTPUT-STREAM))))

The compiler thought that (>∘0) 100 was the right operand; it did not break off the value 100 on the right at the point where the (>∘0) function composition occurs. This case was overlooked in the value constructor and is now fixed and tested.

One more thing: if you're just doing something that will return a scalar, like (april-f "+∘1 _if_ Pos 100"), you may want to use (april) rather than (april-f), because you probably don't have much interest in seeing its APL-formatted representation.