kanaka / mal

mal - Make a Lisp
Other
10k stars 2.54k forks source link

Fix self-hosted test failures #662

Open kanaka opened 1 month ago

kanaka commented 1 month ago

It's been a while since I've done a comprehensive self-host test run. I've add a input variable to workflow so that self-hosted mode can be added for manually triggered workflows. The CI run is at: https://github.com/kanaka/mal/actions/runs/10309343464/job/28538806985

Some of these implementations might just need a NO_SELF_HOST setting (such as xslt), but a lot of the failures look like bugs rather than timeouts or memory exhaustion.

Once all these have been fixed or NO_SELF_HOST is added, we should make self-hosted test step enabled by default (it only adds a few minutes overall to the full CI workflow).

UPDATE: self-host mode has been made the default for workflows and the remaining broken implementations already had or have been updated with NO_SELF_HOST (erlang, jq, nasm, xslt, vbs).

Here is the the full list of self-hosted failures with very brief statement of the issue (see comments for more details):

kanaka commented 1 month ago

awk fixed by https://github.com/kanaka/mal/commit/e62567fb8f0105f674e865302115780dced3618d and https://github.com/kanaka/mal/commit/d1afe8ddb549dbfea313bf856648b22fb9ef856d

kanaka commented 1 month ago

python.2 fixed in https://github.com/kanaka/mal/commit/0d32585d6b64866b52c4c3a8b1249c7bf7abfb42

kanaka commented 4 weeks ago

Added common-lisp. Now that the underlying common-lisp issue with hash-maps is fixed, it reveals a new self-hosted failure (step4 issue with how nil is handled).

kanaka commented 4 weeks ago

More recent CI run (with awk and python.2 fixed), is here: https://github.com/kanaka/mal/actions/runs/10393189805/job/28780104927

Some of the failures have some common patterns:

Other failure summaries:

kanaka commented 4 weeks ago

@asarhaddon @wasamasa @dubek Anybody want to help me track down some fun self-hosted failures?

kanaka commented 3 weeks ago

c.2, cpp, dart, and swift5 fixed in https://github.com/kanaka/mal/commit/7b23c7ed61fa8e169d29b381b564df96c8220400

kanaka commented 3 weeks ago

java and kotlin fixed in https://github.com/kanaka/mal/commit/b31180cfd25086a99d6acec7d17fd0124dd32327 (last position of do was being EVAL'd twice).

kanaka commented 3 weeks ago

common-lisp fixed in https://github.com/kanaka/mal/commit/40aca13680f184f5ca48504a59a6e1c8f9066335 list? was defined to return true for nil. Added a test to catch that too.

kanaka commented 3 weeks ago

Powershell fixed in https://github.com/kanaka/mal/commit/78000d6d9ef5b624010a8276b64b5c0987bc8e21. However, we probably want to mark powershell as NO_SELF_HOST because the self-hosted tests take 26 minutes to run in GHA. The next longest pole in the tent is rpython which takes less than 10 minutes for everything (most of it is build time).

dubek commented 3 weeks ago

I looked at debug logs from jq running repl^mal^step2. It starts processing the expressions from step2_eval.mal, and in the middle of the big (def! EVAL ...) expression I see a halt command which tells it to stop. Not sure what's going on yet.

dubek commented 3 weeks ago

Okay, found it: jq chokes on (); the following patch circumvents the jq bug:

diff --git a/impls/mal/step2_eval.mal b/impls/mal/step2_eval.mal
index 4d40f942..f5242a8c 100644
--- a/impls/mal/step2_eval.mal
+++ b/impls/mal/step2_eval.mal
@@ -24,7 +24,7 @@

       (list? ast)
       (if (empty? ast)
-        ()
+        (list)
         (let* [a0  (first ast)
                f    (EVAL a0 env)
                args (rest ast)]

Need to add a tests for () somewhere, and then fix it for jq. Not tonight...

dubek commented 3 weeks ago

But then I run into a new issue in mal step3:

mal-user> (+ 1 2)
Uncaught exception: 'env-find-str' not found

more fun with jq awaits...

kanaka commented 3 weeks ago

ps/postscript fixed in https://github.com/kanaka/mal/commit/a160866767a6e48a5547d1280fadb4a32b4e9125 Missing a "pop" in the macro? function 😧

kanaka commented 3 weeks ago

I have a fix for the sml mlton self-host issue: https://github.com/kanaka/mal/commit/4719e58be99871c17ec5aa9643d977cf19051a2d

However, I haven't merged it yet because the test I added to catch the issue in non-self-hosted mode revealed a problem in xslt:

$ make DOCKERIZE=1 repl^xslt^step9
user=> (read-string "(+ 1 2")      ;; note the missing paren in the string
(+ 1)
user=> (+ 1 2
Error: EOF while reading list

For some reason, the read-string core function is returning the partial parsed result rather than conveying the parsing error. I suspect the fix isn't too difficult, but it will requiring groking the xslt implementation at least somewhat. @asarhaddon since you've been in the xslt code before, any chance you could look at this?

asarhaddon commented 3 weeks ago

Do not expect much from me in the xslt field. A few years ago, I have given up merging eval and evalast.

kanaka commented 3 weeks ago

Vala is fixed in https://github.com/kanaka/mal/commit/39563fdf7d849ae19065d512fc1ed48c3434f6e8. However, there is a rare memory corruption I discovered at: https://github.com/kanaka/mal/issues/663

kanaka commented 3 weeks ago
kanaka commented 3 weeks ago

For xslt, there is apparently a way to set the recursion depth in saxon via -maxDepth:5000.

dubek commented 3 weeks ago

Another jq update:

dubek@FISHLAPTOP:~/mal/impls/jq (master *) $ ./run
user> ()
()
user> (do ())
()
user> (if true ())
()
user> (def! f (fn* [] ()))
Cannot iterate over null (null)

So () is supported (it's part of our tests) but not inside the body of functions... Weird, still looking.

kanaka commented 3 weeks ago

I just added a commit that enables self-hosted mode by default and added NO_SELF_HOST=1 to IMPLS.yaml for jq, nasm, and xslt. This ticket is still applicable, the failures just won't show up in default workflow.

kanaka commented 3 weeks ago

I've added erlang and vbs to the list since those currently have NO_SELF_HOST=1 but the former ought to work and the latter is reported to work (but there is something wrong with the invocation it appears).

kanaka commented 3 weeks ago

So () is supported (it's part of our tests) but not inside the body of functions... Weird, still looking.

That is weird. That will be a good one to add to the tests once you figure it out.

kanaka commented 3 weeks ago

Erlang has a memory leak (even in non-self-hosted mode). Seems like memory is never reclaimed:

$ make repl^erlang^stepA
user=> (def! sumdown (fn* (N) (if (> N 0) (+ N (sumdown  (- N 1))) 0)))
user=> (sumdown 10000)  ;; watch memory grow in top
user=> (sumdown 10000)
user=> (sumdown 40000)  ;; a few of these will probably exhaust memory and kill the process
OldLiu001 commented 2 weeks ago

@kanaka Thanks for providing log, vbs fails to start issue has been fixed in #666 .

dubek commented 2 weeks ago

Next chapter on jq:

This patch (whitespace changes ignored) fixes one bug:

diff --git a/impls/jq/utils.jq b/impls/jq/utils.jq
index 7b0876b7..0ad1ec81 100644
--- a/impls/jq/utils.jq
+++ b/impls/jq/utils.jq
@@ -1,6 +1,6 @@
 def _debug(ex):
     . as $top
-    | ex
+    | (ex + "\n")
     | debug
     | $top;

@@ -45,6 +45,9 @@ def find_free_references(keys):
         | if .kind == "symbol" then
             if keys | contains([$dot.value]) then [] else [$dot.value] end
         else if "list" == $dot.kind then
+            if $dot.value|length == 0 then
+                []
+            else
                 # if - scan args
                 # def! - scan body
                 # let* - add keys sequentially, scan body
@@ -69,6 +72,7 @@ def find_free_references(keys):
                   else
                     [ $dot.values[1:][] | _refs ]
                   end
+                end
         else if "vector" == $dot.kind then
             ($dot.value | map(_refs) | reduce .[] as $x ([]; . + $x))
         else if "hashmap" == $dot.kind then

And now running mal step2 no longer exits, but the mal step2 tests fail:

$ make MAL_IMPL=jq test^mal^step2
make -C impls/mal step2_eval.mal
make[1]: Entering directory '/home/dubek/mal/impls/mal'
make[1]: 'step2_eval.mal' is up to date.
make[1]: Leaving directory '/home/dubek/mal/impls/mal'
(call STEP_TEST_FILES,mal,step2): impls/tests/step2_eval.mal
----------------------------------------------
Testing test^mal^step2; step file: impls/mal/step2_eval.mal, test file: tests/step2_eval.mal
Running: env STEP=step2_eval MAL_IMPL=jq RAW=1 ../../runtest.py  --deferrable --optional --start-timeout 60 --test-timeout 120  ../tests/step2_eval.mal -- ../mal/run
Testing evaluation of arithmetic operations
TEST: '(+ 1 2)' -> ['',3] -> FAIL (line 3):
    Expected : '.*\n3'
    Got      : '(+ 1 2)\nUncaught exception: array ([{"kind":"l...) and string ("\\n") cannot be added '
TEST: '(+ 5 (* 2 3))' -> ['',11] -> FAIL (line 6):
    Expected : '.*\n11'
    Got      : '(+ 5 (* 2 3))\nUncaught exception: array ([{"kind":"l...) and string ("\\n") cannot be added '
TEST: '(- (+ 5 (* 2 3)) 3)' -> ['',8] -> FAIL (line 9):
    Expected : '.*\n8'
    Got      : '(- (+ 5 (* 2 3)) 3)\nUncaught exception: array ([{"kind":"l...) and string ("\\n") cannot be added '
...
...

So there's another thing lurking there.

dubek commented 2 weeks ago

Silly me, my change with the + "\n" at the debug function causes that second problem. Mal step2 tests pass now, I'll keep testing the other steps.

dubek commented 2 weeks ago

OK first step towards jq is in PR #668.

Then I discovered another issue in jq: if we introduce this test:

;; Test out-of-order declaration
(def! mul2add3 (fn* (n) (let* (p (mul2 n)) (+ p 3))))
(def! mul2 (fn* (n) (* 2 n)))
(mul2add3 5)
;=>13

Then on the last line instead of 13 we get:

JqMAL Exception :: 'mul2' not found

I assume that let* has a "frozen" environment of that point, so later def! mul2 doesn't modify that env. I need to look into it.

But even after changing the order of functions in env.mal to circumvent this issue, jq times out in mal step6. Not sure if it will be able to self-host at all.

kanaka commented 2 weeks ago

@dubek yeah, I did a little playing around and I think the problem is that jq envs are not actually mutable. They are more like Clojure maps in that a copy is created when a value is added or remove and the new version is the one that is used. This actually is a legitimate way of treating environments but it's not the intended path for mal. I think we've essentially been relying on self-hosted tests to test this mutable property of mal envs. So we probably need a test like this in step4 tests:

;; Testing that env is actually mutable and that lexical scope binds names (not values)
(def! a 12)
(def! fx (fn* () a))
(fx)
;=>12
(def! a 2000)
(fx)
;=>2000

The above will fail in jq. The final test will return 12 instead of 2000.

But I'm also confident that there is a way to solve this in jq because the jq implementation supports atom types which definitely seem to be mutable in jq. It probably won't be easy to convert env to work like atoms (the handling of atom types doesn't appear to be confined to the atom related functions unfortunately). Maybe you can find an easier workaround that doesn't require a full rewrite of env handling. But if not, we'll probably need to understand fully how the jq implementation is doing atom types and use that mechanism for env handling. 🏋️

wasamasa commented 2 weeks ago

The above comment looks as if it could close #645

kanaka commented 2 weeks ago

@wasamasa thanks. I realized I had a branch with that and another env test so I've turned that into a PR (#677) and closed #645