Tests fail after updating buttercup - Githubissues

jorgenschaefer / emacs-buttercup

Behavior-Driven Emacs Lisp Testing

GNU General Public License v3.0

363 stars 44 forks source link

Tests fail after updating buttercup #218

Closed noctuid closed 2 years ago

noctuid commented 2 years ago

After updating buttercup, all tests fail (Emacs 29.0.50 built a while ago; I also rebuilt from master today and then tested again with the same result). For example, if I take the first example from writing tests:

(describe "A suite"
  (it "contains a spec with an expectation"
    (expect t :to-be t)))

and run buttercup-run-at-point on it, the output shows:

Running 1 specs.

A suite
  contains a spec with an expectation
[32m  contains a spec with an expectation[0m (1.32ms)

Ran 1 specs, [32m0 failed[0m, in 2.60ms.

The full output when running make test

Running 1 specs.

A suite
  contains a spec with an expectation  FAILED (2.06ms)

========================================
A suite contains a spec with an expectation

Traceback (most recent call last):
  normal-top-level()
  command-line()
  command-line-1(("--title" "makem.sh: once (sandbox: ./sandbox/29.0.50)" "-...
  buttercup-run()
  buttercup--run-suites((#s(buttercup-suite "A suite" nil passed nil nil (25...
  mapc(buttercup--run-suite (#s(buttercup-suite "A suite" nil passed nil nil...
  buttercup--run-suite(#s(buttercup-suite "A suite" nil passed nil nil (2524...
  buttercup--run-spec(#s(buttercup-spec "contains a spec with an expectation...
  buttercup--update-with-funcall(#s(buttercup-spec "contains a spec with an ...
  apply(buttercup--funcall #[0 "\3001\0\3011\0\302 \210\303\304\305\306#00\2...
  buttercup--funcall(#[0 "\3001\0\3011\0\302 \210\303\304\305\306#00\2070\30...
  apply(#[0 "\3001\0\3011\0\302 \210\303\304\305\306#00\2070\307\310\"\207\...
  #[0 "\3001\0\3011\0\302 \210\303\304\305\306#00\2070\307\310\"\207\311\31...
  buttercup-expect(#[0 "\300 \210\301\207" [buttercup--mark-stackframe t] 1]...
  cl--assertion-failed((cl-every #'buttercup--wrapper-fun-p (cons arg args))...
error: (cl-assertion-failed ((cl-every #'buttercup--wrapper-fun-p (cons arg args)) nil (#[0 "\300 \210\301\207" [buttercup--mark-stackframe t] 1] #[0 "\300 \210\301\207" [buttercup--mark-stackframe t] 1])))

Ran 1 specs, 1 failed, in 4.58ms.
Debugger entered--Lisp error: (buttercup-run-specs-failed "")
  signal(buttercup-run-specs-failed (""))
  buttercup--run-suites((#s(buttercup-suite :description "A suite" :parent nil :status passed :failure-description nil :failure-stack nil :time-started (25244 58656 22501 982000) :time-ended (25244 58656 26389 998000) :children (#s(buttercup-spec :description "contains a spec with an expectation" :parent #2 :status failed :failure-description (error (cl-assertion-failed (... nil ...))) :failure-stack ((t normal-top-level) (t command-line) (t command-line-1 ("--title" "makem.sh: once (sandbox: ./sandbox/29.0.50)" "--eval" "(setq load-prefer-newer t)" "--eval" "(setq user-emacs-directory (file-truename \"./sandb..." "--load" "package" "--eval" "(setq package-user-dir (expand-file-name \"elpa\" us..." "--eval" "(setq user-init-file (file-truename \"./sandbox/29...." "-l" "/tmp/tmp.tIfhJGFDGx" "-L" "." "-L" "tests" "--load" "tests/test-once" "-f" "buttercup-run")) (t buttercup-run) (t buttercup--run-suites #1 nil) (t mapc buttercup--run-suite #1) (t buttercup--run-suite #2) (t buttercup--run-spec #4) (t buttercup--update-with-funcall #4 #f(compiled-function () #<bytecode 0x5c0d57bda6f0baa>)) (t apply buttercup--funcall #f(compiled-function () #<bytecode 0x5c0d57bda6f0baa>) nil) (t buttercup--funcall #f(compiled-function () #<bytecode 0x5c0d57bda6f0baa>)) (t apply #f(compiled-function () #<bytecode 0x5c0d57bda6f0baa>) nil) (t #f(compiled-function () #<bytecode 0x5c0d57bda6f0baa>)) (t buttercup-expect #f(compiled-function () #<bytecode 0x1f41e72b98cd>) :to-be #f(compiled-function () #<bytecode 0x1f41e72b98cd>)) (t cl--assertion-failed (cl-every ... ...) nil (...) nil)) :time-started (25244 58656 22507 316000) :time-ended (25244 58656 26389 379000) :function #f(compiled-function () #<bytecode 0x5c0d57bda6f0baa>))) :before-each nil :after-each nil :before-all nil :after-all nil)) nil)
  buttercup-run()
  command-line-1(("--title" "makem.sh: once (sandbox: ./sandbox/29.0.50)" "--eval" "(setq load-prefer-newer t)" "--eval" "(setq user-emacs-directory (file-truename \"./sandb..." "--load" "package" "--eval" "(setq package-user-dir (expand-file-name \"elpa\" us..." "--eval" "(setq user-init-file (file-truename \"./sandbox/29...." "-l" "/tmp/tmp.tIfhJGFDGx" "-L" "." "-L" "tests" "--load" "tests/test-once" "-f" "buttercup-run"))
  command-line()
  normal-top-level()

snogge commented 2 years ago

Could you share the project you have problems with? Or the complete minimal reproducer including makefile and file structure? Also, do you know which version of buttercup you had before the update?

noctuid commented 2 years ago

I see now that it actually says "0 failed". I didn't notice with the [32m and thought it just said "failed". Did the buttercup-run-at-point output always like this?

Sorry for the noise. I will investigate further and raise an issue in the relevent repo if necessary (makem).

noctuid commented 2 years ago

Here is a bare minimum repo that demonstrates the issue. You can ignore makem. I can reproduce without it.

It looks like byte compiling the test file causes the issue (makem does this by default to make sure files compile cleanly). Is is it expected for compiling the test file to cause issues?

This works:

emacs -Q --batch --eval "(setq load-prefer-newer t)" \
      -L /path/to/elpa/buttercup-20220410.1557 -L . \
      -L tests  --load tests/test-foo -f buttercup-run

But this fails:

emacs -Q --batch --eval "(setq load-prefer-newer t)" \
      -L /path/to/elpa/buttercup-20220410.1557 -L . \
      -L tests --eval "(byte-compile-file \"tests/test-foo.el\")" \
      --load tests/test-foo -f buttercup-run

Error output

»  emacs -Q --batch --eval "(setq load-prefer-newer t)" -L /home/noctuid/testme/sandbox/29.0.50/elpa/buttercup-20220410.1557 -L . -L tests  --load tests/test-foo -f buttercup-run
Running 1 specs.

A suite
  contains a spec with an expectation (9.68ms)

Ran 1 specs, 0 failed, in 12.49ms.
» emacs -Q --batch --eval "(setq load-prefer-newer t)" -L /home/noctuid/testme/sandbox/29.0.50/elpa/buttercup-20220410.1557 -L . -L tests --eval "(byte-compile-file \"tests/test-foo.el\")" --load tests/test-foo -f buttercup-run

Running 1 specs.

A suite
  contains a spec with an expectation  FAILED (1.97ms)

========================================
A suite contains a spec with an expectation

Traceback (most recent call last):
  normal-top-level()
  command-line()
  command-line-1(("--eval" "(setq load-prefer-newer t)" "-L" "/home/noctuid/...
  buttercup-run()
  buttercup--run-suites((#s(buttercup-suite "A suite" nil passed nil nil (25...
  mapc(buttercup--run-suite (#s(buttercup-suite "A suite" nil passed nil nil...
  buttercup--run-suite(#s(buttercup-suite "A suite" nil passed nil nil (2524...
  buttercup--run-spec(#s(buttercup-spec "contains a spec with an expectation...
  buttercup--update-with-funcall(#s(buttercup-spec "contains a spec with an ...
  apply(buttercup--funcall #[nil "\30113021\303 \210\304\305\306\307#00...
  buttercup--funcall(#[nil "\30113021\303 \210\304\305\306\307#00\2070...
  apply(#[nil "\30113021\303 \210\304\305\306\307#00\2070\310\31\")\...
  #[nil "\30113021\303 \210\304\305\306\307#00\2070\310\31\")\207\3...
  buttercup-expect(#[nil "\300 \210\301\207" [buttercup--mark-stackframe t] ...
  cl--assertion-failed((cl-every #'buttercup--wrapper-fun-p (cons arg args))...
error: (cl-assertion-failed ((cl-every #'buttercup--wrapper-fun-p (cons arg args)) nil (#[nil "\300 \210\301\207" [buttercup--mark-stackframe t] 1] #[nil "\300 \210\301\207" [buttercup--mark-stackframe t] 1])))

Ran 1 specs, 1 failed, in 5.88ms.

Error: buttercup-run-specs-failed ("")
  mapbacktrace(#f(compiled-function (evald func args flags) #<bytecode 0x1a0334c9cbae87df>))
  debug-early-backtrace()
  debug-early(error (buttercup-run-specs-failed ""))
  signal(buttercup-run-specs-failed (""))
  buttercup--run-suites((#s(buttercup-suite :description "A suite" :parent nil :status passed :failure-description nil :failure-stack nil :time-started (25247 21158 714988 810000) :time-ended (25247 21158 718297 850000) :children (#s(buttercup-spec :description "contains a spec with an expectation" :parent #1 :status failed :failure-description (error (cl-assertion-failed ((cl-every #'buttercup--wrapper-fun-p (cons arg args)) nil (#f(compiled-function () #<bytecode 0x1dc1c3ad280d>) #f(compiled-function () #<bytecode 0x1dc1c3ad280d>))))) :failure-stack ((t normal-top-level) (t command-line) (t command-line-1 ("--eval" "(setq load-prefer-newer t)" "-L" "/home/noctuid/testme/sandbox/29.0.50/elpa/buttercup-20220410.1557" "-L" "." "-L" "tests" "--eval" "(byte-compile-file \"tests/test-foo.el\")" "--load" "tests/test-foo" "-f" "buttercup-run")) (t buttercup-run) (t buttercup--run-suites #0 nil) (t mapc buttercup--run-suite #0) (t buttercup--run-suite #1) (t buttercup--run-spec #3) (t buttercup--update-with-funcall #3 #f(compiled-function () #<bytecode -0x927a9dbc4075921>)) (t apply buttercup--funcall #f(compiled-function () #<bytecode -0x927a9dbc4075921>) nil) (t buttercup--funcall #f(compiled-function () #<bytecode -0x927a9dbc4075921>)) (t apply #f(compiled-function () #<bytecode -0x927a9dbc4075921>) nil) (t #f(compiled-function () #<bytecode -0x927a9dbc4075921>)) (t buttercup-expect #f(compiled-function () #<bytecode 0x1dc1c3ad280d>) :to-be #f(compiled-function () #<bytecode 0x1dc1c3ad280d>)) (t cl--assertion-failed (cl-every #'buttercup--wrapper-fun-p (cons arg args)) nil ((#f(compiled-function () #<bytecode 0x1dc1c3ad280d>) #f(compiled-function () #<bytecode 0x1dc1c3ad280d>))) nil)) :time-started (25247 21158 714994 534000) :time-ended (25247 21158 718297 81000) :function #f(compiled-function () #<bytecode -0x927a9dbc4075921>))) :before-each nil :after-each nil :before-all nil :after-all nil)) nil)
  buttercup-run()
  command-line-1(("--eval" "(setq load-prefer-newer t)" "-L" "/home/noctuid/testme/sandbox/29.0.50/elpa/buttercup-20220410.1557" "-L" "." "-L" "tests" "--eval" "(byte-compile-file \"tests/test-foo.el\")" "--load" "tests/test-foo" "-f" "buttercup-run"))
  command-line()
  normal-top-level()
buttercup-run failed: ""
zsh: exit 255   emacs -Q --batch --eval "(setq load-prefer-newer t)" -L  -L . -L tests --eval

alphapapa commented 2 years ago

Just noticed this myself after installing the latest version of Buttercup into one of my test sandboxes. Indeed, the backtrace seems to show that it's failing because of the test file's being byte-compiled, because buttercup--wrapper-fun-p does not seem to accept byte-compiled functions.

Also, I observed the Emacs process going into an infinite loop and consuming infinite memory after upgrading to the latest version of Buttercup and observing this problem, but I don't know if that has the same cause as this bug. I found that setting print-circle to t prevented the infinite memory use, but the infinite loop remained; after terminating the process, the parent Emacs collected Buttercup's output normally.

alphapapa commented 2 years ago

This new function definition seems to work with byte-compiled functions:

(defun buttercup--enclosed-expr (fun)
  "Given a zero-arg function FUN, return its unevaluated expression.

The function MUST be byte-compiled or have one of the following
forms:

\(lambda () EXPR)
\(lambda () (buttercup--mark-stackframe) EXPR)
\(closure (ENVLIST) () EXPR)
\(closure (ENVLIST) () (buttercup--mark-stackframe) EXPR)
\(lambda () (quote EXPR) EXPR)
\(closure (ENVLIST) () (quote EXPR) EXPR)

and the return value will be EXPR, unevaluated. The latter 2
forms are useful if EXPR is a macro call, in which case the
`quote' ensures access to the un-expanded form."
  (pcase fun
    (`(closure ,(pred listp) nil ,expr) expr)
    (`(closure ,(pred listp) nil (buttercup--mark-stackframe) ,expr) expr)
    (`(closure ,(pred listp) nil (quote ,expr) . ,_rest) expr)
    (`(closure ,(pred listp) nil ,_expr . ,(pred identity))
     (error "Closure contains multiple expressions: %S" fun))
    (`(closure ,(pred listp) ,(pred identity) . ,(pred identity))
     (error "Closure has nonempty arglist: %S" fun))
    (`(lambda nil ,expr) expr)
    (`(lambda nil (buttercup--mark-stackframe) ,expr) expr)
    (`(lambda nil (quote ,expr) . ,_rest) expr)
    (`(lambda nil ,_expr . ,(pred identity))
     (error "Function contains multiple expressions: %S" fun))
    (`(lambda ,(pred identity) . ,(pred identity))
     (error "Function has nonempty arglist: %S" fun))
    ((and (pred byte-code-function-p) (guard (member (aref fun 0) '(nil 0))))
     (aref fun 1))
    ((pred byte-code-function-p)
     (error "Byte-compiled function expects arguments: %S" fun))
    (_ (error "Not a zero-arg one-expression closure: %S" fun))))

At least, the testing proceeds past that check now. I'm still trying to figure out why an infinite loop seems to happen after the tests have run...

...okay, the above function definition seems to work properly with byte-compiled functions now.

alphapapa commented 2 years ago

Note that, while tests correctly pass or fail now as expected, tests that fail show the byte-code rather than the uncompiled expression, e.g.

Refuses to link to non-file-backed buffer  Expected `"\303 \210\301\304\300\242\302\305\300\242%\207"' to throw a child signal of `user-error' with args `("Views that search non-file-backed buffers can’t be linked to")', but instead it threw (user-error "Views that search non-file-backed buffers can't be linked to") (5.44ms)

I think this is merely a mild annoyance, though, because the name of the test remains in the output, and if necessary, the developer could narrow down the failing expression by only putting one test in each it form.

Besides, I don't know how one could practically resolve the byte-code of these anonymous functions back to their original expressions. Probably, this just needs to be documented as something to be aware of when running byte-compiled tests.

alphapapa commented 2 years ago

It's weird though: this CI job that was run 8 days ago passed all Buttercup tests: https://github.com/alphapapa/org-ql/runs/6712603976 And Buttercup hasn't been updated since 2 months ago. But now the same code seems to be failing. Is there something more going on here?

snogge commented 2 years ago

There seems to be at least three things going on here

buttercup-run-at-point includes the ANSI color sequences even though it only prints to a buffer. Should be easy to fix.
Byte compiled test files fails in new interesting ways. Could this be a consequence of #216? loadwas not wrapped in buttercup--funcallbefore then. On the other hand it is not the loading that fails, but the execution. buttercup--enclosed-expr is probably due for cleanup as there are some expressions in there that are just not used anymore. In any case, buttercup needs tests for byte compiled test files.
Printing original expressions for byte compiled tests. This is probably solvable with some new trickery in expect. Maybe save the original expression in the docstrings of the lambda functions?

snogge commented 2 years ago

I fixed the ANSI escape sequences for buttercup-run-at-point in 42df1faa653f2765941f478167dafac059dc3a57

snogge commented 2 years ago

Byte compiled tests should work now that #219 has been merged. Created #222 for the remaining problem.

alphapapa commented 2 years ago

@snogge Thank you!