Open bep opened 1 year ago
Do they always fail like that on Windows, or just sometimes?
Do you have a minimal reproducer? Beyond making the investigation easier, it would also serve as a regression test for the eventual fix.
Also, could you share an entire goroutine trace? I think the problem is that we don't set up the recover for failNow
in one of the goroutines.
Do they always fail like that on Windows, or just sometimes?
I don't have a lot of data on this (I have ran it on GitHub 3-4 times, it always fails on Windows; I have since disabled these tests on Windows, and they always passes on Linux/MacOs, which is good), but I will close my eyes and boot up my Windows VM later today and see what happens.
FYI I think the problem is that waitBackground
uses ts.Fatalf
, which is subtly different from ts.t.Fatalf
. The former is our own implementation, which calls panic(failNow)
to mimic the "stop now" behavior from testing.T
. Within a testscript command, ts.Fatalf
should be used to properly use testscript's "fail" logic. In the other setup/teardown code, ts.t.Fatalf
should be used, because in many of those cases, the deferred recover to handle failNow
may not exist.
waitBackground
is used in both scenarios - both by the wait
command, and by the teardown code when the entire script finishes. In the latter case I think that's where you run into issues, because it uses ts.Fatalf
.
Drive-by comment unrelated to the error you are seeing, @bep: I've not had a particularly enjoyable time trying to test server processes using a command to start the server, and then another one to close it. Instead, I've tended to use special files in the testscript itself to indicate that a server should be started for the purposes of a given test, and then tear down happens at the end. This also means that the setup command can more easily wait until the server is known to be ready before starting the script.
can more easily wait until the server is known to be ready before starting the script.
Yes, I'm actually doing that. My current problem is the stopping of the server, which seem to somehow end with ProcessState.Success() == false
on Windows -- even if the test itself passes. I could possibly add a custom "non signal" way to stop the server during tests, as well, but I will try to dig a little further first.
To clarify, I think Paul means starting and stopping your server program from Go code, via testscript's Setup
function. Then you have full control of it: block until it's ready, capture its output, what signal to give it to stop, etc. testscript's own background process support is okay, just slightly limited depending on what you want to do.
This test repo always fails on Windows:
https://github.com/bep/repro-testscript-228
Log here:
https://github.com/bep/repro-testscript-228/actions/runs/4967165173/jobs/8889003644#step:4:75
Not sure why this is falsy on Windows:
OK, I guess this is a what to be expected on Windows when killing the process (as in, no SIGINT). I will find a workaround, but I'll keep this issue open if someone has a fix.
I think Paul means starting and stopping your server program from Go code, via testscript's
Setup
function
Exactly so.
OK, I guess this is a what to be expected on Windows when killing the process (as in, no SIGINT). I will find a workaround, but I'll keep this issue open if someone has a fix.
@bep I'd strongly consider doing the server setup and teardown in the Setup()
function with corresponding use of Defer()
. It will not only make things more robust, but it will also tidy up a number of your scripts to remove the stutter of:
hugo server &
waitServer
...
stop
You can "mark" scripts that expect a server to be running using a special file in the script archive:
-- .run_a_server --
Check for the existence of such a file in the Setup()
function and run a server accordingly.
For a fairly involved example I've recently written https://github.com/cue-lang/cuelang.org/blob/a0df30d1ef4858b51bdfe96b7f6c507a0f08fb81/internal/fsnotify/fsnotify_test.go#L57 (which coincidentally might be a package of interest to Hugo).
It will not only make things more robust, but it will also tidy up a number of your scripts to remove the stutter of:
@myitcv Maybe. The thing is:
hugo server --renderToMemory &
is incredibly powerful and self documenting.
- Having test scripts that's self contained and as close to the "real thing" as possible is valuable to me, and I don't mind a little stutter if I can get that.
- My server tests are mostly to verify that the CLI is wired up correctly, so adding variations with
hugo server --renderToMemory &
is incredibly powerful and self documenting.
Just to highlight that you don't necessarily lose any capabilities with the special file approach. You could just as easily put the arguments to the server in the special file, extract them using code like this, then pass the resulting []string
to your command with almost zero special handling required.
incredibly powerful and self documenting
I'm totally with you on this, just sharing my experience of having tested a number of commands where a server (or equivalent mode) became too painful.
As you will notice from the fsnotify
tests I linked above, it also makes it easier to orchestrate commands that can interact more precisely with the running server. @ldemailly that partially addresses the question you asked in the linked issue.
@myitcv I'm a little confused. I see the Setup
and the Defer
-- but how do you pass arguments from the test script to those? Having them in some other "special file" is not what i talk about when I use the term "self contained test scripts".
I'm a little confused. I see the Setup and the Defer -- but how do you pass arguments from the test script to those?
Just to clarify, I'm be talking about the example from fsnotify
. The testscript scripts live within the testdata
directory:
The Setup()
function delegates to:
When each script runs, before Setup()
is called, the script archive is extracted to a temporary directory. The Setup()
function can therefore do things based on the files within the archive, which now (at the time Setup()
is called) exist on disk.
In the fsnotify
example, as you can see in the Setup()
function, it checks for the existence of a file named .batched
in the root of the archive:
Note the use of (*testscript.Env).Cd
for the path to the directory into which the archive has been extracted.
If the file exists, its contents are used to parse a time.Duration
which is then used in the creation of a batched watcher. If the file does not exist, then we create a regular watcher.
Here is an example of one of the scripts in the testdata
directory that contains one of the .batched
special files:
The file .batched
is what I would refer to as a special file within a test archive. Its existence or otherwise, can be used to help signal things to the Setup()
function. In this case it's not just the existence, but the contents of the file that are used.
Hence in my suggestion re the running of a server, you could use the existence of a special file within a testscript script archive to signal that Setup()
should run Hugo in serve mode. Furthermore, you could use the contents of that file to drive the args to serve, in the way that I suggested in https://github.com/rogpeppe/go-internal/issues/228#issuecomment-1546734856
Updated my latest comment with a link to one of the .batched
special files to make the point slightly clearer.
@myitcv thanks for the detailed explanation. OK, I understand how to set it up. The thing is, though, when I compare the 2 scripts below:
hugo server --renderToMemory &
sleep
! stderr .
stop
# I can even start more servers here if I want to.
And:
sleep
! stderr .
-- magicserverfile --
--renderToMemory
I would much, much prefer the first variant any day of the week. And one great thing about it is that the average developer coming in to fix 1 thing in a project he/she has no experience with, can read that script and understand exactly what it does (also, it does not require 1K lines of custom test setup behind the scenes).
This particular issue (#228) is obviously a bug. It may be hard (impossible?) to fix for the testscript.RunMain
commands, but perhaps the Cmds
commands could have support for the &
"operator" and a stop
would run the Defer
? Or something.
My current workaround is a custom stopServer
command that just sends a HEAD
request to a "stop server" endpoint.
At the risk of derailing this bug report still further, just to point out a couple of things:
sleep
! stderr .
I've found sleep
to in general be a very brittle way of waiting for a server to become available. You either need to be very generous on the sleep time which slows down all server tests, or risk it being too small and CI failing intermittently. Hence, when I start such a server in Setup()
, I have the setup phase wait until the server is definitely ready. Then you can elide this step entirely from the script (which again is more robust).
The checking of stderr
will not work for a backgrounded task FWIW. Again, you need a custom command to interact with the server process. That's why in the fsnotify
tests I define a log
command that directly interacts with the "server" (in that case):
I take your point about it being unusual to not have a command at the start of a script, but it's a "problem" that is (again in my experience) relatively easily overcome with a script naming format like serve_*.txtar
and comments that generally appear at the top of each such script explaining what is going on, what is being tested etc.
Again, none of what I am saying is a rule in any way, just sharing my experience from having written a number of such test setups.
I've found sleep to in general be a very brittle way of waiting for a server to become available
Which is why in my real tests have both a waitServer
(which uses polling) and a stopServer
(which uses a HTTP head request + a small timeout), which has proven to be rock stable (even on Windows).
The checking of stderr will not work for a backgrounded task FWIW.
Not that my tests rely on this, but what you say isn't obvious when reading the docs:
If the last token is '&', the program executes in the background. The standard output and standard error of the previous command is cleared, but the output of the background process is buffered — and checking of its exit status is delayed — until the next call to 'wait', 'skip', or 'stop' or the end of the test. Apply the grep command (see above) to the standard error from the most recent exec or wait command.
I suggest, however, that we instead focus on the issue. The testscript package is incredibly useful, it would be even better if everyone didn't have to create elaborate workarounds to test servers, which is a common enough requirement.
I agree that we should fix the panics in any case. Whether or not testscript's support for background commands should be redesigned (or avoided?) is a different matter.
Just to summarise my findings (so people don't have to wade through all of the above to get the gist of it:
unexpected command failure
message.Process.Signal(interrupt)
. There's a ton of research on this on Go's issue tracker, but no great conclusion.
I'm creating this issue in the hope that someone could give me some helping hints before I give in and start up my Windows VM.
The short take is: