Fix self-hosted test failures

kanaka commented 3 months ago

It's been a while since I've done a comprehensive self-host test run. I've add a input variable to workflow so that self-hosted mode can be added for manually triggered workflows. The CI run is at: https://github.com/kanaka/mal/actions/runs/10309343464/job/28538806985

Some of these implementations might just need a NO_SELF_HOST setting (such as xslt), but a lot of the failures look like bugs rather than timeouts or memory exhaustion.

Once all these have been fixed or NO_SELF_HOST is added, we should make self-hosted test step enabled by default (it only adds a few minutes overall to the full CI workflow).

UPDATE: self-host mode has been made the default for workflows and the remaining broken implementations already had or have been updated with NO_SELF_HOST (erlang, jq, nasm, xslt, vbs).

Here is the the full list of self-hosted failures with very brief statement of the issue (see comments for more details):

[x] awk
[x] c.2
[x] cpp
[x] common-lisp
[x] dart
[ ] erlang - memory leak (e.g. (sumdown 40000))
[ ] io
[x] java
[ ] jq - empty lists inside functions/closures are failing
[x] kotlin
[ ] nasm - Needs more heap. With increased heap second quasiquote call fails
[x] ps
[x] powershell
[x] python.2
[x] sml
[x] vala
[x] vbs (vbscript) - startup/invocation error
[ ] xslt - memory/recursion limit
[x] swift5

kanaka commented 3 months ago

awk fixed by https://github.com/kanaka/mal/commit/e62567fb8f0105f674e865302115780dced3618d and https://github.com/kanaka/mal/commit/d1afe8ddb549dbfea313bf856648b22fb9ef856d

kanaka commented 3 months ago

python.2 fixed in https://github.com/kanaka/mal/commit/0d32585d6b64866b52c4c3a8b1249c7bf7abfb42

kanaka commented 3 months ago

Added common-lisp. Now that the underlying common-lisp issue with hash-maps is fixed, it reveals a new self-hosted failure (step4 issue with how nil is handled).

kanaka commented 3 months ago

More recent CI run (with awk and python.2 fixed), is here: https://github.com/kanaka/mal/actions/runs/10393189805/job/28780104927

Some of the failures have some common patterns:

c.2, cpp, dart, swift5 all fail in step4 with the same type of error. Appears that the result of the list command is being evaluated again when it shouldn't be. For example:
```
$ make DOCKERIZE=1 MAL_IMPL=cpp repl^step4
mal-user> (list 1)
Uncaught exception: "1" is not applicable
```
java and kotlinappear to have a similar symptom:
```
$ make DOCKERIZE=1 MAL_IMPL=kotlin repl^step4
mal-user> (do (prn 101) 7)
101
101
101
101
7
```
And important to note that using Ctrl-D results in multiple exits from the prompt (more levels after running the broken do call).

Other failure summaries:

common-lisp: in self-hosted mode, nil and () are being conflated (which is interesting because in traditional Lisps, those are often represented the same way):
```
make DOCKERIZE=1 MAL_IMPL=common-lisp repl^mal^step4
mal-user> nil
()
```
jq: immediately exits (with 0 exit code) when trying to launch step2

nasm: complains about Error: Run out of memory for Cons objects. Increase heap_cons_limit. Doubling the cons and array memory array results steps 0 - 6 working. step7 breaks like this:


$ make DOCKERIZE=1 MAL_IMPL=nasm repl^mal^step7
mal-user> (quasiquote ())
()
mal-user> (quasiquote ())
Error: list does not begin with a function
Uncaught exception: (#<unknown value>)
in mal EVAL: (quasiquote ())
``

ps has problems starting at step8:

$ make DOCKERIZE=1 MAL_IMPL=ps repl^mal^step8
mal-user> (defmacro! unless (fn* (pred a b) `(if ~pred ~b ~a)))
<(fn* [& args] (EVAL (nth ast 2) (new-env env (nth ast 1) args)))>
mal-user> (unless 7 8 9)
Uncaught exception: typecheck: get at 8638
in mal EVAL: (unless 7 8 9)

powershell also has an application error in step4 but it manifests differently than symptom effecting cpp, c.2, etc:

$ make DOCKERIZE=1 MAL_IMPL=ps repl^mal^step4
mal-user> (list? (list))
Uncaught exception: can only apply functions
in mal EVAL: (list? (list))

sml in mlton mode appears to have reader/parsing errors that aren't specific enough (just "SyntaxError").
vala: macros don't appear to be evaluating their results in self-hosted mode
xslt: fails with Error: Too many nested template or function calls. The stylesheet may be looping. . Unless there is a way to increase the call stack size, this one will probably just need to be marked NO_SELF_HOST

kanaka commented 3 months ago

@asarhaddon @wasamasa @dubek Anybody want to help me track down some fun self-hosted failures?

kanaka commented 3 months ago

c.2, cpp, dart, and swift5 fixed in https://github.com/kanaka/mal/commit/7b23c7ed61fa8e169d29b381b564df96c8220400

kanaka commented 3 months ago

java and kotlin fixed in https://github.com/kanaka/mal/commit/b31180cfd25086a99d6acec7d17fd0124dd32327 (last position of do was being EVAL'd twice).

kanaka commented 3 months ago

common-lisp fixed in https://github.com/kanaka/mal/commit/40aca13680f184f5ca48504a59a6e1c8f9066335 list? was defined to return true for nil. Added a test to catch that too.

kanaka commented 3 months ago

Powershell fixed in https://github.com/kanaka/mal/commit/78000d6d9ef5b624010a8276b64b5c0987bc8e21. However, we probably want to mark powershell as NO_SELF_HOST because the self-hosted tests take 26 minutes to run in GHA. The next longest pole in the tent is rpython which takes less than 10 minutes for everything (most of it is build time).

dubek commented 3 months ago

I looked at debug logs from jq running repl^mal^step2. It starts processing the expressions from step2_eval.mal, and in the middle of the big (def! EVAL ...) expression I see a halt command which tells it to stop. Not sure what's going on yet.

dubek commented 3 months ago

Okay, found it: jq chokes on (); the following patch circumvents the jq bug:

diff --git a/impls/mal/step2_eval.mal b/impls/mal/step2_eval.mal
index 4d40f942..f5242a8c 100644
--- a/impls/mal/step2_eval.mal
+++ b/impls/mal/step2_eval.mal
@@ -24,7 +24,7 @@

       (list? ast)
       (if (empty? ast)
-        ()
+        (list)
         (let* [a0  (first ast)
                f    (EVAL a0 env)
                args (rest ast)]

Need to add a tests for () somewhere, and then fix it for jq. Not tonight...

dubek commented 3 months ago

But then I run into a new issue in mal step3:

mal-user> (+ 1 2)
Uncaught exception: 'env-find-str' not found

more fun with jq awaits...

kanaka commented 3 months ago

ps/postscript fixed in https://github.com/kanaka/mal/commit/a160866767a6e48a5547d1280fadb4a32b4e9125 Missing a "pop" in the macro? function 😧

kanaka commented 3 months ago

I have a fix for the sml mlton self-host issue: https://github.com/kanaka/mal/commit/4719e58be99871c17ec5aa9643d977cf19051a2d

However, I haven't merged it yet because the test I added to catch the issue in non-self-hosted mode revealed a problem in xslt:

$ make DOCKERIZE=1 repl^xslt^step9
user=> (read-string "(+ 1 2")      ;; note the missing paren in the string
(+ 1)
user=> (+ 1 2
Error: EOF while reading list

For some reason, the read-string core function is returning the partial parsed result rather than conveying the parsing error. I suspect the fix isn't too difficult, but it will requiring groking the xslt implementation at least somewhat. @asarhaddon since you've been in the xslt code before, any chance you could look at this?

asarhaddon commented 3 months ago

Do not expect much from me in the xslt field. A few years ago, I have given up merging eval and evalast.

kanaka commented 3 months ago

Vala is fixed in https://github.com/kanaka/mal/commit/39563fdf7d849ae19065d512fc1ed48c3434f6e8. However, there is a rare memory corruption I discovered at: https://github.com/kanaka/mal/issues/663

kanaka commented 3 months ago

sml fixed in https://github.com/kanaka/mal/commit/e9dd8c53e59b2b8346e65581f9374428a61b24a2
xslt non-self-issue issue revealed by new tests is fixed in: https://github.com/kanaka/mal/commit/a2973ab0b805a0e62c39aaa6caea9771b5d92e19

kanaka commented 3 months ago

For xslt, there is apparently a way to set the recursion depth in saxon via -maxDepth:5000.

dubek commented 3 months ago

Another jq update:

dubek@FISHLAPTOP:~/mal/impls/jq (master *) $ ./run
user> ()
()
user> (do ())
()
user> (if true ())
()
user> (def! f (fn* [] ()))
Cannot iterate over null (null)

So () is supported (it's part of our tests) but not inside the body of functions... Weird, still looking.

kanaka commented 3 months ago

I just added a commit that enables self-hosted mode by default and added NO_SELF_HOST=1 to IMPLS.yaml for jq, nasm, and xslt. This ticket is still applicable, the failures just won't show up in default workflow.

kanaka commented 3 months ago

I've added erlang and vbs to the list since those currently have NO_SELF_HOST=1 but the former ought to work and the latter is reported to work (but there is something wrong with the invocation it appears).

erlang fails like this:

$ make DOCKERIZE=1 MAL_IMPL=erlang test^mal^step4
...
TEST: '(def! DO (fn* (a) 7))' -> ['',] -> SUCCESS (result ignored)
TEST: '(DO 3)' -> ['',7]

vbs fails to start (https://github.com/kanaka/mal/actions/runs/10515504756/job/29135713169):

Running: env STEP=step0_repl MAL_IMPL=vbs RAW=1 ../../runtest.py  --deferrable --optional --start-timeout 60 --test-timeout 180 --no-pty --debug-file ../../test-mal-vbs.debug ../tests/step0_repl.mal -- ../mal/run
Did not receive one of following prompt(s): ['[^\\s()<>]+> ']
Got      : 'Input Error: Unknown option "/mnt/d/a/mal/mal/impls/vbs/../vbs/stepA_mal.vbs" specified.\n'

kanaka commented 3 months ago

So () is supported (it's part of our tests) but not inside the body of functions... Weird, still looking.

That is weird. That will be a good one to add to the tests once you figure it out.

kanaka commented 3 months ago

Erlang has a memory leak (even in non-self-hosted mode). Seems like memory is never reclaimed:

$ make repl^erlang^stepA
user=> (def! sumdown (fn* (N) (if (> N 0) (+ N (sumdown  (- N 1))) 0)))
user=> (sumdown 10000)  ;; watch memory grow in top
user=> (sumdown 10000)
user=> (sumdown 40000)  ;; a few of these will probably exhaust memory and kill the process

OldLiu001 commented 3 months ago

@kanaka Thanks for providing log, vbs fails to start issue has been fixed in #666 .

dubek commented 2 months ago

Next chapter on jq:

This patch (whitespace changes ignored) fixes one bug:

diff --git a/impls/jq/utils.jq b/impls/jq/utils.jq
index 7b0876b7..0ad1ec81 100644
--- a/impls/jq/utils.jq
+++ b/impls/jq/utils.jq
@@ -1,6 +1,6 @@
 def _debug(ex):
     . as $top
-    | ex
+    | (ex + "\n")
     | debug
     | $top;

@@ -45,6 +45,9 @@ def find_free_references(keys):
         | if .kind == "symbol" then
             if keys | contains([$dot.value]) then [] else [$dot.value] end
         else if "list" == $dot.kind then
+            if $dot.value|length == 0 then
+                []
+            else
                 # if - scan args
                 # def! - scan body
                 # let* - add keys sequentially, scan body
@@ -69,6 +72,7 @@ def find_free_references(keys):
                   else
                     [ $dot.values[1:][] | _refs ]
                   end
+                end
         else if "vector" == $dot.kind then
             ($dot.value | map(_refs) | reduce .[] as $x ([]; . + $x))
         else if "hashmap" == $dot.kind then

And now running mal step2 no longer exits, but the mal step2 tests fail:

$ make MAL_IMPL=jq test^mal^step2
make -C impls/mal step2_eval.mal
make[1]: Entering directory '/home/dubek/mal/impls/mal'
make[1]: 'step2_eval.mal' is up to date.
make[1]: Leaving directory '/home/dubek/mal/impls/mal'
(call STEP_TEST_FILES,mal,step2): impls/tests/step2_eval.mal
----------------------------------------------
Testing test^mal^step2; step file: impls/mal/step2_eval.mal, test file: tests/step2_eval.mal
Running: env STEP=step2_eval MAL_IMPL=jq RAW=1 ../../runtest.py  --deferrable --optional --start-timeout 60 --test-timeout 120  ../tests/step2_eval.mal -- ../mal/run
Testing evaluation of arithmetic operations
TEST: '(+ 1 2)' -> ['',3] -> FAIL (line 3):
    Expected : '.*\n3'
    Got      : '(+ 1 2)\nUncaught exception: array ([{"kind":"l...) and string ("\\n") cannot be added '
TEST: '(+ 5 (* 2 3))' -> ['',11] -> FAIL (line 6):
    Expected : '.*\n11'
    Got      : '(+ 5 (* 2 3))\nUncaught exception: array ([{"kind":"l...) and string ("\\n") cannot be added '
TEST: '(- (+ 5 (* 2 3)) 3)' -> ['',8] -> FAIL (line 9):
    Expected : '.*\n8'
    Got      : '(- (+ 5 (* 2 3)) 3)\nUncaught exception: array ([{"kind":"l...) and string ("\\n") cannot be added '
...
...

So there's another thing lurking there.

dubek commented 2 months ago

Silly me, my change with the + "\n" at the debug function causes that second problem. Mal step2 tests pass now, I'll keep testing the other steps.

dubek commented 2 months ago

OK first step towards jq is in PR #668.

Then I discovered another issue in jq: if we introduce this test:

;; Test out-of-order declaration
(def! mul2add3 (fn* (n) (let* (p (mul2 n)) (+ p 3))))
(def! mul2 (fn* (n) (* 2 n)))
(mul2add3 5)
;=>13

Then on the last line instead of 13 we get:

JqMAL Exception :: 'mul2' not found

I assume that let* has a "frozen" environment of that point, so later def! mul2 doesn't modify that env. I need to look into it.

But even after changing the order of functions in env.mal to circumvent this issue, jq times out in mal step6. Not sure if it will be able to self-host at all.

kanaka commented 2 months ago

@dubek yeah, I did a little playing around and I think the problem is that jq envs are not actually mutable. They are more like Clojure maps in that a copy is created when a value is added or remove and the new version is the one that is used. This actually is a legitimate way of treating environments but it's not the intended path for mal. I think we've essentially been relying on self-hosted tests to test this mutable property of mal envs. So we probably need a test like this in step4 tests:

;; Testing that env is actually mutable and that lexical scope binds names (not values)
(def! a 12)
(def! fx (fn* () a))
(fx)
;=>12
(def! a 2000)
(fx)
;=>2000

The above will fail in jq. The final test will return 12 instead of 2000.

But I'm also confident that there is a way to solve this in jq because the jq implementation supports atom types which definitely seem to be mutable in jq. It probably won't be easy to convert env to work like atoms (the handling of atom types doesn't appear to be confined to the atom related functions unfortunately). Maybe you can find an easier workaround that doesn't require a full rewrite of env handling. But if not, we'll probably need to understand fully how the jq implementation is doing atom types and use that mechanism for env handling. 🏋️

wasamasa commented 2 months ago

The above comment looks as if it could close #645

kanaka commented 2 months ago

@wasamasa thanks. I realized I had a branch with that and another env test so I've turned that into a PR (#677) and closed #645

alimpfard commented 1 month ago

I just read through this issue (looking at jq in particular), here are some comments:

@dubek yeah, I did a little playing around and I think the problem is that jq envs are not actually mutable. They are more like Clojure maps in that a copy is created when a value is added or remove and the new version is the one that is used. This actually is a legitimate way of treating environments but it's not the intended path for mal.

Yes, jq does not have mutable state at all, jqmal tries to hide this by late-binding free variables in functions, but modifying an already-bound variable is not really possible in general without a lot of pain-

But I'm also confident that there is a way to solve this in jq because the jq implementation supports atom types which definitely seem to be mutable in jq. It probably won't be easy to convert env to work like atoms (the handling of atom types doesn't appear to be confined to the atom related functions unfortunately). Maybe you can find an easier workaround that doesn't require a full rewrite of env handling. But if not, we'll probably need to understand fully how the jq implementation is doing atom types and use that mechanism for env handling. 🏋️

Atoms have unique identities, it's relatively simple to do the same with bindings (stamp the env entry with an identifier and do a little dance where looking up is equivalent to:

def env_get2($name; $current_env; $fallback_env):
    $name | env_get($current_env) as $v | $v.identity | env_get_by_id($fallback_env) // $v.entry

That said, I don't think the guide actually requires this name-binding behaviour? if this is indeed the intended purpose, I don't see much point in the distinction between atoms and variables?

kanaka commented 3 weeks ago

Add io self-hosted failue in step 0: https://github.com/kanaka/mal/actions/runs/11524734437/job/32085546455?pr=699

kanaka / mal

Fix self-hosted test failures #662