rogpeppe / go-internal

Selected Go-internal packages factored out from the standard library
BSD 3-Clause "New" or "Revised" License
823 stars 67 forks source link

testscript: panic: fail now! in stop command on Windows #228

Open bep opened 1 year ago

bep commented 1 year ago

I'm creating this issue in the hope that someone could give me some helping hints before I give in and start up my Windows VM.

The short take is:

--- FAIL: TestCommands (2.00s)
    --- FAIL: TestCommands/server_render_to_memory (1.11s)
        testscript.go:534: # Test the hugo server command.
            # We run these tests in parallel so let Hugo decide which port to use.
 (1.102s)
            > hugo server &
            > waitServer
            > httpget $HUGOTEST_BASEURL_0 'Title: Hugo Server Test' $HUGOTEST_BASEURL_0
            > ! exists public/index.html
            > ! exists public/mystatic.txt
            > stop
            stop
            [background] hugo.exe server: exit status 1
            [stdout]
            Start building sites … 
            hugo v0.112.0-DEV windows/386 BuildDate=unknown

                               | EN  
            -------------------+-----
              Pages            |  2  
              Paginator pages  |  0  
              Non-page files   |  0  
              Static files     |  1  
              Processed images |  0  
              Aliases          |  0  
              Sitemaps         |  0  
              Cleaned          |  0  

            Built in 91 ms
            Watching for changes in $WORK\{layouts,static}
            Watching for config changes in $WORK\hugo.toml
            Running in Fast Render Mode. For full rebuilds on change: hugo server --disableFastRender
            Web Server is available at http://localhost:1313/ (bind address 127.0.0.1)
            Press Ctrl+C to stop
            FAIL: testscripts\commands\server_render_to_memory.txt:13: unexpected command failure

panic: fail now! [recovered]
    panic: fail now!

goroutine 45 [running]:
testing.tRunner.func1.2({0x1b175c0, 0x14847970})
    C:/hostedtoolcache/windows/go/1.19.9/x64/src/testing/testing.go:1[39](https://github.com/gohugoio/hugo/actions/runs/4961873656/jobs/8879210277?pr=10602#step:17:40)6 +0x2ab
testing.tRunner.func1()
    C:/hostedtoolcache/windows/go/1.19.9/x64/src/testing/testing.go:1399 +0x[41](https://github.com/gohugoio/hugo/actions/runs/4961873656/jobs/8879210277?pr=10602#step:17:42)f
panic({0x1b175c0, 0x14847970})
    C:/hostedtoolcache/windows/go/1.19.9/x64/src/runtime/panic.go:884 +0x1ba
github.com/rogpeppe/go-internal/testscript.(*TestScript).Fatalf(0x149e6900, {0x1d91689, 0x1a}, {0x0, 0x0, 0x0})
    C:/Users/runneradmin/go/pkg/mod/github.com/rogpeppe/go-internal@v1.10.1-0.20230505210500-5150104[43](https://github.com/gohugoio/hugo/actions/runs/4961873656/jobs/8879210277?pr=10602#step:17:44)74f/testscript/testscript.go:1104 +0x1[49](https://github.com/gohugoio/hugo/actions/runs/4961873656/jobs/8879210277?pr=10602#step:17:50)
github.com/rogpeppe/go-internal/testscript.(*TestScript).waitBackground(0x149e[69](https://github.com/gohugoio/hugo/actions/runs/4961873656/jobs/8879210277?pr=10602#step:17:70)00, 0x1)
mvdan commented 1 year ago

Do they always fail like that on Windows, or just sometimes?

Do you have a minimal reproducer? Beyond making the investigation easier, it would also serve as a regression test for the eventual fix.

mvdan commented 1 year ago

Also, could you share an entire goroutine trace? I think the problem is that we don't set up the recover for failNow in one of the goroutines.

bep commented 1 year ago

Do they always fail like that on Windows, or just sometimes?

I don't have a lot of data on this (I have ran it on GitHub 3-4 times, it always fails on Windows; I have since disabled these tests on Windows, and they always passes on Linux/MacOs, which is good), but I will close my eyes and boot up my Windows VM later today and see what happens.

mvdan commented 1 year ago

FYI I think the problem is that waitBackground uses ts.Fatalf, which is subtly different from ts.t.Fatalf. The former is our own implementation, which calls panic(failNow) to mimic the "stop now" behavior from testing.T. Within a testscript command, ts.Fatalf should be used to properly use testscript's "fail" logic. In the other setup/teardown code, ts.t.Fatalf should be used, because in many of those cases, the deferred recover to handle failNow may not exist.

waitBackground is used in both scenarios - both by the wait command, and by the teardown code when the entire script finishes. In the latter case I think that's where you run into issues, because it uses ts.Fatalf.

myitcv commented 1 year ago

Drive-by comment unrelated to the error you are seeing, @bep: I've not had a particularly enjoyable time trying to test server processes using a command to start the server, and then another one to close it. Instead, I've tended to use special files in the testscript itself to indicate that a server should be started for the purposes of a given test, and then tear down happens at the end. This also means that the setup command can more easily wait until the server is known to be ready before starting the script.

bep commented 1 year ago

can more easily wait until the server is known to be ready before starting the script.

Yes, I'm actually doing that. My current problem is the stopping of the server, which seem to somehow end with ProcessState.Success() == false on Windows -- even if the test itself passes. I could possibly add a custom "non signal" way to stop the server during tests, as well, but I will try to dig a little further first.

mvdan commented 1 year ago

To clarify, I think Paul means starting and stopping your server program from Go code, via testscript's Setup function. Then you have full control of it: block until it's ready, capture its output, what signal to give it to stop, etc. testscript's own background process support is okay, just slightly limited depending on what you want to do.

bep commented 1 year ago

This test repo always fails on Windows:

https://github.com/bep/repro-testscript-228

Log here:

https://github.com/bep/repro-testscript-228/actions/runs/4967165173/jobs/8889003644#step:4:75

Not sure why this is falsy on Windows:

https://github.com/rogpeppe/go-internal/blob/a4f6fabd84c5fada757d5a234fc6cb10a5bb9c96/testscript/cmd.go#L540

bep commented 1 year ago

OK, I guess this is a what to be expected on Windows when killing the process (as in, no SIGINT). I will find a workaround, but I'll keep this issue open if someone has a fix.

myitcv commented 1 year ago

I think Paul means starting and stopping your server program from Go code, via testscript's Setup function

Exactly so.

myitcv commented 1 year ago

OK, I guess this is a what to be expected on Windows when killing the process (as in, no SIGINT). I will find a workaround, but I'll keep this issue open if someone has a fix.

@bep I'd strongly consider doing the server setup and teardown in the Setup() function with corresponding use of Defer(). It will not only make things more robust, but it will also tidy up a number of your scripts to remove the stutter of:

hugo server &
waitServer

...

stop

You can "mark" scripts that expect a server to be running using a special file in the script archive:


-- .run_a_server --

Check for the existence of such a file in the Setup() function and run a server accordingly.

myitcv commented 1 year ago

For a fairly involved example I've recently written https://github.com/cue-lang/cuelang.org/blob/a0df30d1ef4858b51bdfe96b7f6c507a0f08fb81/internal/fsnotify/fsnotify_test.go#L57 (which coincidentally might be a package of interest to Hugo).

bep commented 1 year ago

It will not only make things more robust, but it will also tidy up a number of your scripts to remove the stutter of:

@myitcv Maybe. The thing is:

myitcv commented 1 year ago
  • Having test scripts that's self contained and as close to the "real thing" as possible is valuable to me, and I don't mind a little stutter if I can get that.
  • My server tests are mostly to verify that the CLI is wired up correctly, so adding variations with hugo server --renderToMemory & is incredibly powerful and self documenting.

Just to highlight that you don't necessarily lose any capabilities with the special file approach. You could just as easily put the arguments to the server in the special file, extract them using code like this, then pass the resulting []string to your command with almost zero special handling required.

incredibly powerful and self documenting

I'm totally with you on this, just sharing my experience of having tested a number of commands where a server (or equivalent mode) became too painful.

As you will notice from the fsnotify tests I linked above, it also makes it easier to orchestrate commands that can interact more precisely with the running server. @ldemailly that partially addresses the question you asked in the linked issue.

bep commented 1 year ago

@myitcv I'm a little confused. I see the Setup and the Defer -- but how do you pass arguments from the test script to those? Having them in some other "special file" is not what i talk about when I use the term "self contained test scripts".

myitcv commented 1 year ago

I'm a little confused. I see the Setup and the Defer -- but how do you pass arguments from the test script to those?

Just to clarify, I'm be talking about the example from fsnotify. The testscript scripts live within the testdata directory:

https://github.com/cue-lang/cuelang.org/tree/a0df30d1ef4858b51bdfe96b7f6c507a0f08fb81/internal/fsnotify/testdata

The Setup() function delegates to:

https://github.com/cue-lang/cuelang.org/blob/a0df30d1ef4858b51bdfe96b7f6c507a0f08fb81/internal/fsnotify/fsnotify_test.go#L57

When each script runs, before Setup() is called, the script archive is extracted to a temporary directory. The Setup() function can therefore do things based on the files within the archive, which now (at the time Setup() is called) exist on disk.

In the fsnotify example, as you can see in the Setup() function, it checks for the existence of a file named .batched in the root of the archive:

https://github.com/cue-lang/cuelang.org/blob/a0df30d1ef4858b51bdfe96b7f6c507a0f08fb81/internal/fsnotify/fsnotify_test.go#L99-L113

Note the use of (*testscript.Env).Cd for the path to the directory into which the archive has been extracted.

If the file exists, its contents are used to parse a time.Duration which is then used in the creation of a batched watcher. If the file does not exist, then we create a regular watcher.

Here is an example of one of the scripts in the testdata directory that contains one of the .batched special files:

https://github.com/cue-lang/cuelang.org/blob/a0df30d1ef4858b51bdfe96b7f6c507a0f08fb81/internal/fsnotify/testdata/batched_directory_moved_into_watch_tree.txtar#L31-L32

The file .batched is what I would refer to as a special file within a test archive. Its existence or otherwise, can be used to help signal things to the Setup() function. In this case it's not just the existence, but the contents of the file that are used.

Hence in my suggestion re the running of a server, you could use the existence of a special file within a testscript script archive to signal that Setup() should run Hugo in serve mode. Furthermore, you could use the contents of that file to drive the args to serve, in the way that I suggested in https://github.com/rogpeppe/go-internal/issues/228#issuecomment-1546734856

myitcv commented 1 year ago

Updated my latest comment with a link to one of the .batched special files to make the point slightly clearer.

bep commented 1 year ago

@myitcv thanks for the detailed explanation. OK, I understand how to set it up. The thing is, though, when I compare the 2 scripts below:

hugo server --renderToMemory &
sleep
! stderr .
stop
# I can even start more servers here if I want to.

And:

sleep
! stderr .
-- magicserverfile --
--renderToMemory

I would much, much prefer the first variant any day of the week. And one great thing about it is that the average developer coming in to fix 1 thing in a project he/she has no experience with, can read that script and understand exactly what it does (also, it does not require 1K lines of custom test setup behind the scenes).

This particular issue (#228) is obviously a bug. It may be hard (impossible?) to fix for the testscript.RunMain commands, but perhaps the Cmds commands could have support for the & "operator" and a stop would run the Defer? Or something.

My current workaround is a custom stopServer command that just sends a HEAD request to a "stop server" endpoint.

myitcv commented 1 year ago

At the risk of derailing this bug report still further, just to point out a couple of things:

sleep
! stderr .

I've found sleep to in general be a very brittle way of waiting for a server to become available. You either need to be very generous on the sleep time which slows down all server tests, or risk it being too small and CI failing intermittently. Hence, when I start such a server in Setup(), I have the setup phase wait until the server is definitely ready. Then you can elide this step entirely from the script (which again is more robust).

The checking of stderr will not work for a backgrounded task FWIW. Again, you need a custom command to interact with the server process. That's why in the fsnotify tests I define a log command that directly interacts with the "server" (in that case):

https://github.com/cue-lang/cuelang.org/blob/a0df30d1ef4858b51bdfe96b7f6c507a0f08fb81/internal/fsnotify/fsnotify_test.go#L52

I take your point about it being unusual to not have a command at the start of a script, but it's a "problem" that is (again in my experience) relatively easily overcome with a script naming format like serve_*.txtar and comments that generally appear at the top of each such script explaining what is going on, what is being tested etc.

Again, none of what I am saying is a rule in any way, just sharing my experience from having written a number of such test setups.

bep commented 1 year ago

I've found sleep to in general be a very brittle way of waiting for a server to become available

Which is why in my real tests have both a waitServer (which uses polling) and a stopServer (which uses a HTTP head request + a small timeout), which has proven to be rock stable (even on Windows).

The checking of stderr will not work for a backgrounded task FWIW.

Not that my tests rely on this, but what you say isn't obvious when reading the docs:

If the last token is '&', the program executes in the background. The standard output and standard error of the previous command is cleared, but the output of the background process is buffered — and checking of its exit status is delayed — until the next call to 'wait', 'skip', or 'stop' or the end of the test. Apply the grep command (see above) to the standard error from the most recent exec or wait command.

I suggest, however, that we instead focus on the issue. The testscript package is incredibly useful, it would be even better if everyone didn't have to create elaborate workarounds to test servers, which is a common enough requirement.

mvdan commented 1 year ago

I agree that we should fix the panics in any case. Whether or not testscript's support for background commands should be redesigned (or avoided?) is a different matter.

bep commented 1 year ago

Just to summarise my findings (so people don't have to wade through all of the above to get the gist of it: