jorgenschaefer / emacs-buttercup

Behavior-Driven Emacs Lisp Testing
GNU General Public License v3.0

Running tests with an interactive terminal (noninteractive: nil) #174

Open cpitclaudel opened 4 years ago

cpitclaudel commented 4 years ago

I have tests that can only run when noninteractive is nil (because they use format-mode-line, which just returns "" in noninteractive mode). Is there a way to run these with buttercup?

snogge commented 4 years ago

Try binding noninteractive to true in your test.

cpitclaudel commented 4 years ago

That won't do anything, will it?

cpitclaudel commented 4 years ago

$ emacs -Q --batch --eval '(print (format-mode-line (quote "A")))'

""
$ emacs -Q --batch --eval '(print (let ((noninteractive nil)) (format-mode-line (quote "A"))))'

""

(you meant noninteractive to nil, not t, I think)

snogge commented 4 years ago

I did mean nil. I think it does not work because format-mode-line is a C function. I don't think there is any way around this as long as you run Emacs in batch mode.

cpitclaudel commented 4 years ago

I don't think there is any way around this as long as you run Emacs in batch mode.

I think that's right; I was hoping for a way to tell the buttercup binary to start in non-batch mode (e.g. using an invisible frame).

snogge commented 4 years ago

The buttercup binary (bin/buttercup) is just a bash script.
You can start a daemonized emacs (which has a hidden frame if I remember correctly) and execute elisp on that daemonized instance:

emacs -Q --daemon=foo
emacs -Q --batch --eval "(message \"%s\" (progn (require 'server) (server-eval-at \"foo\" '(format-mode-line (quote \"A\")))))"
emacs -Q --batch --eval "(progn (require 'server) (server-eval-at \"foo\" '(kill-emacs)))"

I have hacked together a way to actually run some simple buttercup tests on the Emacs daemon, but so far have not even started on sending the messages back to the batch Emacs for printing on the terminal.

snogge commented 4 years ago

But for your tests it should be possible to start an Emacs daemon before you run buttercup and then let the individual tests use server-eval-at.
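As a rough sketch of that workflow (not something buttercup ships), a small wrapper could start the daemon, run whatever test command you pass it, and stop the daemon afterwards. The function names, the `EMACS` and `SERVER_NAME` variables, and the default server name `buttercup-test` are placeholders made up for this example; the daemon and `server-eval-at` invocations are the ones shown earlier in this thread.

```shell
#!/usr/bin/env bash
# Sketch: keep an Emacs daemon alive while a test command runs.
# EMACS, SERVER_NAME and the function names are illustrative placeholders.
EMACS="${EMACS:-emacs}"
SERVER_NAME="${SERVER_NAME:-buttercup-test}"

# Start a daemon that specs can reach with (server-eval-at "buttercup-test" ...).
start_daemon() {
  "$EMACS" -Q --daemon="$SERVER_NAME"
}

# Ask the daemon to exit, using the same batch + server-eval-at call as above.
stop_daemon() {
  "$EMACS" -Q --batch --eval \
    "(progn (require 'server) (server-eval-at \"$SERVER_NAME\" '(kill-emacs)))"
}

# Run the given command with the daemon up; always try to stop the daemon
# afterwards and return the command's own exit status.
with_daemon() {
  local status=0
  start_daemon || return 1
  "$@" || status=$?
  stop_daemon || true
  return "$status"
}

# Usage (commented out so that sourcing this file has no side effects):
# with_daemon cask exec buttercup -L .
```

Individual specs would then wrap the display-dependent calls in server-eval-at against the same server name, while everything else keeps running in the batch Emacs.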

snogge commented 4 years ago

@cpitclaudel, is it OK to close this issue? Or do you think something should be changed in or added to buttercup.el?

cpitclaudel commented 4 years ago

Ideally, buttercup would do the server-eval-at itself, but wishing is easy :) We could close this.

codygman commented 4 years ago

If you want to do this, I run emacs with emacs -nw and use this function I grabbed from evil's evil-tests.el and customized a little:

(defun tests-run ()
  ;; We would like to use `ert-run-tests-batch-and-exit'.
  ;; Unfortunately it doesn't work outside of batch mode, and we
  ;; can't use batch mode because we have tests that need windows.
  ;; Instead, run the tests interactively, copy the results to a
  ;; text file, and then exit with an appropriate code.
  ;; NOTE: `debug' is used as a plain variable here, and `log' is assumed
  ;; to be a user-defined logging helper (the built-in `log' is the
  ;; numeric logarithm and would reject a string argument).
  (setq attempt-stack-overflow-recovery nil
        attempt-orderly-shutdown-on-fatal-signal nil
        debug nil
        debug-on-error t)
  (unwind-protect
      (progn
        (ert-run-tests-interactively t)
        (with-current-buffer "*ert*"
          (append-to-file (point-min) (point-max) "test-results.txt")
          (let ((failed-tests (ert-stats-completed-unexpected ert--results-stats)))
            ;; (log (format "failed tests: %s" (princ failed-tests)))
            ;; (log (format "zerop failed-tests ==  %s" (princ (zerop failed-tests))))
            (log "checking failed tests")
            (if (zerop failed-tests)
                (progn (log "0 failed tests, exiting with code 0")
                       (when (not debug) (kill-emacs 0)))
              (progn (log "at least 1 failed test, exiting with code 1")
                     (when (not debug) (kill-emacs 1))))
            (log "done checking failed tests"))))
    ;; The snippet as posted ends here; any cleanup forms for
    ;; `unwind-protect' are not shown.
    ))

I soon want to try and get it working with buttercup, but in the short term I might just split out my tests that I can't run with buttercup in batch mode with the function above.

I think it would be nice if buttercup supported this mode by default, for testing evil functions or the one you needed.

codygman commented 4 years ago

Do you have your daemon work up anywhere?

My idea was just to use async.el, but I guess it might also start up an Emacs without interactive support. That would mean running an Emacs daemon, which I gather is like emacs -nw and can support tests like this.

I'm getting very annoyed (as in enough to try helping add this feature to buttercup) with not being able to use buttercup for most of my useful tests.

One example of a useful test I can't run with buttercup:

(it "flycheck squiggly appears underneath misspelled putStrLnORAORAORA function"
     (find-file (emacs-d-directory-for "testdata/simple-haskell-project/Main.hs"))
     (replace-string "putStrLn" "putStrLnORAORAORA")
     (save-buffer)
     (sit-for 1)
     (expect (get-char-property (point) 'face) :to-equal 'flycheck-error))

results in:

Codygman's hci Haskell Integration flycheck squiggly appears underneath misspelled putStrLnORAORAORA function

Traceback (most recent call last):
  (perform-replace "putStrLn" "putStrLnORAORAORA" nil nil nil nil nil nil ni...
  (replace-match-maybe-edit "putStrLnORAORAORA" t t nil (69 77 #<buffer Main...
  (replace-match "putStrLnORAORAORA" t t)
  (ask-user-about-lock "/home/cody/hci/testdata/simple-haskell-project/Main....
  (error "Cannot resolve lock conflict in batch mode")
  (signal error ("Cannot resolve lock conflict in batch mode"))
error: (error "Cannot resolve lock conflict in batch mode")

codygman commented 4 years ago

Maybe emacs -nw and capabilities of emacs --daemon aren't equivalent for this specific flycheck case. I tried out the solution of starting an emacs daemon. See the commit/CI error.

Basically it returns nil as the face at point, whereas the test passes with the tests-run function from my earlier comment, which is what I'm currently using.

Upon reading the discussion above I thought perhaps emacs -nw and server-eval-at have the same capabilities with regards to fontification and windows because of the hidden frame. Maybe that is still the case and this warning is biting me somehow:

Warning: due to a long standing Gtk+ bug
https://gitlab.gnome.org/GNOME/gtk/issues/221
Emacs might crash when run in daemon mode and the X11 connection is unexpectedly lost.
Using an Emacs configured with --with-x-toolkit=lucid does not have this problem.
Starting Emacs daemon.

snogge commented 4 years ago

Sorry, I can't find the daemon code at the moment. Perhaps I deleted it by accident, which would be a shame.

Following the commit I think you are going about the test the wrong way. Do you really need to verify the whole chain of "edit, wait for flycheck to react, check that the highlighting is correct" in a single test? You are basically redoing the testing of flycheck. Or maybe I just totally misunderstand what you are trying to do and/or the purpose of this specific test.

Is this 'just' about testing a flycheck checker? Because I'm sure there are some patterns in the flycheck repo that you can use. flycheck uses buttercup for at least part of their testing.

codygman commented 4 years ago

Do you really need to verify the whole chain of "edit, wait for flycheck to react, check that the highlighting is correct" in a single test?

Yes. It's an integration test of flycheck with my increasingly complex config.

You are basically redoing the testing of flycheck.

It's easy to think that, but I've had numerous issues with my own config that caused flycheck to stop working. The most recent being when I turned on a flycheck setting with lsp-haskell 1.9, flycheck stopped automatically highlighting correctly and the highlighting only showed up after manually disabling/re-enabling flycheck.

I'm using buttercup where I can to write integration tests from the highest level for the most crucial parts of my Emacs config where I've had issues before. At one point I had a very nice config that I depended heavily on, but for some reason things would just break at the worst moments possible and I began to not trust my system.

I'll also be doing the same for my LSP integration with Haskell, to ensure both that it's working and that I know about performance regressions; for instance, I want to be warned when upgrading the Haskell LSP server would give worse performance for a use case of mine. Some of those things I might be able to upstream, some might be too specific to me. But hopefully that gives an idea of what I'm attempting so you can gauge if it's important for buttercup to support it in any way.

From the outside though, I understand your perspective and how you could see those tests or using nix to ensure everything is totally reproducible along with very granular git commits and those integration tests to very quickly spot problems.

Is this 'just' about testing a flycheck checker? Because I'm sure there are some patterns in the flycheck repo that you can use.

Good idea. I'll check and see if they have anything like that.

codygman commented 4 years ago

Since it may be of interest, it looks like for some reason (perhaps the same as me) flycheck uses ert for these types of tests. I did find a much better way of waiting on flycheck in their tests, flycheck-ert-wait-for-syntax-checker.

lastquestion commented 4 years ago

IMHO, buttercup is great for unit tests. Integration tests or any tests that require testing the full stack should not exist in buttercup, but in some "other testing framework".

Buttercup == mocha
??? == Selenium / Cypress

For me, I implemented a mini test framework for ???. I run a subprocess Emacs inside tmux frames for full integration tests. It's super powerful because you can send literal key presses, verify anything you like in the actual display output by asking tmux to take a screenshot if you need, etc. Plus, it's literally identical to what a user would do.

Running -nw just means run without windowing. In that case, you already have different behavior than windowed emacs, but normally it's "not too much different" (except for the rare cases when it is...)

Running a daemon and attaching a new client is definitely not identical. Running connected to a daemon means server.el runs, which may not act like you expect. In particular, how server-eval works can be quite brittle. It is not the same as actually executing an expression via M-: or scratch. Also, -f does not work like init.el, nor does -f work like M-: or calling eval directly.

codygman commented 4 years ago

@lastquestion Thank you for the info!

Any public code you can share for your tmux/emacs setup? I imagine verifying things like "functionnnn has red flycheck squiggly under it" could be annoying.

lastquestion commented 4 years ago

@codygman the driver code is https://github.com/lastquestion/explain-pause-mode/blob/master/tests/cases/driver.el, but it's not very documented and kind of still early days. Most of it is related to the package under test, the actual code that runs tmux is mostly here https://github.com/lastquestion/explain-pause-mode/blob/master/tests/cases/driver.el#L176 and there's code to send key presses and such further on. I reuse an existing mechanism that package exposes to communicate with the subprocess using UNIX sockets (which you could also take inspiration from), as well as using SIGUSR1.

Remember, tmux is -nw, so you don't get red squigglies. There are two ways to test that. You can write a function that runs inside the Emacs under test, reads the face property at point, and checks that it is the squiggle face. You can connect that to SIGUSR1, for example, and fire a signal to the Emacs under test. This avoids eval and M-x, so it is mostly, but not exactly, identical to the user experience. For example, if you depend on idle timers firing, this might not work (I did not check whether SIGUSR1 resets the idle timer).

Or you can ask tmux to take a screenshot and then check whether the text you're checking for squiggles is underlined; I think the default face for flycheck errors etc. is just underline. I haven't done it in that driver.el yet because I am not sure I need to, but I did it previously for another project in another language. It's well documented, though: just run man tmux and search for capture-pane.
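A rough sketch of the screenshot approach, under some assumptions: the session name, the function names, and the pane size are made up for illustration, and the underline check is only a heuristic over the SGR escape sequences that `tmux capture-pane -e` keeps in its output. It is not taken from driver.el.

```shell
#!/usr/bin/env bash
# Sketch: drive a terminal Emacs inside tmux and inspect the rendered pane.
# SESSION and the function names are illustrative placeholders.
SESSION="${SESSION:-emacs-under-test}"

# Start a detached tmux session running terminal Emacs on the given file
# (the file name is assumed to contain no spaces or shell metacharacters).
start_emacs_in_tmux() {
  tmux new-session -d -s "$SESSION" -x 100 -y 30 "emacs -nw $1"
}

# Print the pane contents; -e keeps the escape sequences for text attributes.
capture_pane() {
  tmux capture-pane -p -e -t "$SESSION"
}

# Read captured pane text on stdin; succeed if WORD follows an SGR sequence
# that sets underline (parameter 4) with no other escape sequence in between.
# Rough heuristic: WORD is used as a regex, and a 256-colour index of 4
# (e.g. ESC[38;5;4m) would also match.
is_underlined() {
  local word=$1 esc
  esc=$(printf '\033')
  grep -Eq "${esc}\[([0-9;]*;)?4(;[0-9;]*)?m[^${esc}]*${word}"
}

# Usage (needs tmux and emacs, so left commented out):
# start_emacs_in_tmux Main.hs
# sleep 2
# capture_pane | is_underlined putStrLnORAORAORA && echo "underlined"
# tmux kill-session -t "$SESSION"
```

Keeping the check as a plain filter over captured text means it can be exercised on canned input without starting tmux at all.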

This whole discussion is probably a little off topic for this issue 😁 feel free to email me or open an issue on that project about exposing or packaging the test code out into another package or something.

EDIT: if you need to actually test in windowed mode, for example if you want to test autocomplete, that's also possible in the same vein: you can start a GUI Emacs, then communicate with it by sending keyboard input and taking screenshots. On Mac, for example, this is pretty easy. But now we're wayyyy off topic 😀

antler5 commented 1 year ago

https://github.com/AutumnalAntlers/emacs-buttercup/tree/my-fork

Based on the advice in this issue, I've been using buttercup to test functions that fundamentally depend on frames, posn-at-point, format-mode-line, etc. My script spawns an Emacs server, uses emacsclient to run specs, and handles inter-op control flow through a named pipe. The biggest issue I encountered was that emacsclient --eval '(foobar)' doesn't invoke the debugger (e.g. when tests are failing), so I had to modify buttercup--debugger to catch and filter signals before they're caught by other handlers. I have tests that use :to-throw, so I know those work correctly with my patch, but backtraces do seem to go back a little further than they did without it.

(Don't have it in front of me to copy and paste right now, but they span the seven frames from: (if (not matcher) (progn (progn (or (not args) ...)))) to the calls of buttercup-fail and signal.)