Open Dzordzu opened 1 year ago
Help wanted: Is there anything I've missed here? Use cases and design suggestions would be welcome, especially if you've had practical experience with setup and teardown scripts in other contexts.
I agree, this would be really cool!
I don't think we can do a rust step, but simply providing a list of commands for setup and teardown would be fascinating. I think the way to do it would be to do something like:
[[profile.default.setup]]
# optional, match by test filter and platform similar to overrides
filter = "..."
platform = "..."
command = "foo bar" # split using shell-words
timeout = "10s" # optional
[[profile.default.teardown]]
# similar
Notes:
!
Basically, emulate Python's subprocess.Popen(shell=True). On Unix, use /bin/sh, and on Windows, use %COMSPEC%. (duct_sh can act as a reference, though we may want to just copy what it's doing rather than pull in duct as a dependency.)
Some issues to work out:
How should the UI be presented? There can be multiple scripts here for different sets of tests. Maybe something like:
Setup [ 1/3] <setup script>
...
Teardown [ 1/2] <teardown script>
What environment variables should be passed down to setup scripts?
Might need a way to say "only run a script with certain profiles", though maybe that can be solved through other means (e.g. providing scripts a NEXTEST_PROFILE env var that they can filter on).
Do we run teardown scripts on Ctrl-C? Maybe we do on the first Ctrl-C attempt but not on subsequent ones.
Do we add CLI options to skip the setup and teardown scripts?
How should output be presented? Should it just be passed through or captured? Should every line be prefixed with [Setup] or similar?
How should exit codes be handled, particularly for teardown scripts? Seems like we should fail if a setup script fails, but warn if a teardown script fails + exit with non-zero.
(Note that I don't plan to work on this unless my employer decides to invest in this.)
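As a rough sketch of the NEXTEST_PROFILE idea above: assuming nextest exported a NEXTEST_PROFILE variable to scripts (only a proposal at this point in the thread), a script could decide for itself whether it has anything to do. The helper name should_run_for_profile is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if $NEXTEST_PROFILE (assumed to be set by
# nextest; falls back to "default" when unset) matches one of the arguments.
should_run_for_profile() {
    current=${NEXTEST_PROFILE:-default}
    for wanted in "$@"; do
        if [ "$current" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}
```

A script would then start with something like `should_run_for_profile ci || exit 0`, so that it is a no-op under every other profile.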
Some more thoughts based on discussion:
$CARGO <command>, not cargo command. This means that we should either build support for interpolating environment variables, or use an actual shell to execute the command (though in that case, what about Windows? 😬)
I'm not familiar with Mercurial. What kind of 'art' do you mean?
"Prior art" is originally a term from patent law. In this context it just means previous attempts to solve similar problems.
Ah, I knew it under the other name (state of the art). Everyday learning something new ;)
My employer has decided to invest in this, at least to get it to an experimental stage. Hoping to get an experimental version out in the next 2 weeks.
Initial support for setup scripts is in #977.
OK, just released experimental support for setup scripts with nextest 0.9.59. Documentation is at https://nexte.st/book/setup-scripts.html.
@Dzordzu @alextes @ns-sjorgedeaguiar (since you liked the post) and others, please try it out and provide feedback in the tracking issue #978!
Do we run teardown scripts on Ctrl-C? Maybe we do on the first Ctrl-C attempt but not on subsequent ones.
I'm for running teardown scripts regardless of whether the setup script succeeded and regardless of whether it was cancelled. The docs should be very explicit that teardown scripts must be idempotent: they should be prepared to be executed several times, even if the setup script didn't run to completion, and they should not fail if the resource they need to clean up was already deleted (by the previous run of the teardown script, for example).
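To illustrate what idempotent means in practice, here is a minimal sketch of a teardown for a background server whose PID was written to a file by the setup script. The function name and the PID-file convention are assumptions for the example, not anything nextest defines.

```shell
#!/bin/sh
# Hypothetical teardown: safe to run repeatedly, or when setup never ran.
teardown_server() {
    pid_file=$1
    if [ -f "$pid_file" ]; then
        # The process may already have exited; ignore a failed kill.
        kill "$(cat "$pid_file")" 2>/dev/null || true
        rm -f "$pid_file"
    fi
    # A missing PID file means there is nothing to clean up, which is fine.
    return 0
}
```

Every branch ends in success: a missing file, a stale PID, and a live process all leave the environment in the same clean state.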
How should output be presented? Should it just be passed through or captured? Should every line be prefixed with [Setup] or similar?
For me, it's fine to leave the output uncaptured, and no prefixing is required. This would be relevant if there was an ability to run several setup scripts in parallel. I'm not sure nextest needs that level of complexity, but time will tell.
Do we add CLI options to skip the setup and teardown scripts?
Right now, I personally don't have such a use case. It may make sense to wait until there is one?
How should exit codes be handled, particularly for teardown scripts? Seems like we should fail if a setup script fails, but warn if a teardown script fails + exit with non-zero.
I guess this should be configurable for teardown scripts at least. For example, if the teardown needs to shut down a local docker container and it fails to do that, then the testing should be considered "partially successful": a warning should be printed and the exit status should be zero, because the cost of leaking a local docker container isn't high, especially when the docker daemon is shut down on an ephemeral CI runner after the tests anyway.
However, there may be cases like running terraform destroy to delete an expensive EC2 instance in AWS after the tests are done. If such a teardown fails, users should know about it immediately. An error-level log and a non-zero exit status would be a good primitive way of dealing with that.
So I suggest that by default, nextest exits with non-zero and an error log when the teardown script fails, but there should be some config knob like
[[script.errors]]
# If exit code is not specified this means the following action is performed for all non-zero exit codes
action = "warn"
# Override handling of a specific exit code
[[script.errors]]
code = 255
action = "fail"
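To make the intended precedence concrete, here is a small sketch of how such a knob could map an exit code to an action: a rule for a specific code wins over the catch-all. The function name and the hard-coded rules simply mirror the example config above and are purely illustrative; nothing like this exists in nextest.

```shell
#!/bin/sh
# Hypothetical resolution of a teardown exit code to an action.
resolve_teardown_action() {
    exit_code=$1
    case "$exit_code" in
        0)   echo "ok" ;;    # success needs no handling
        255) echo "fail" ;;  # override for one specific exit code
        *)   echo "warn" ;;  # catch-all for every other non-zero code
    esac
}
```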
Several more things to add. It usually makes sense to run the teardown script right before the setup script to ensure the environment is clean before the setup, otherwise the setup would try to create the resource that already exists.
It should be easy for users to implement this themselves by moving their scripts to a file and invoking the teardown script at the beginning of the setup script. Therefore I'm not sure nextest needs a config-level capability to configure script dependencies such that the setup script would depend on the teardown script, but it's something to keep in mind.
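A minimal sketch of that arrangement, with a plain directory standing in for the real resource (a container, a database, and so on). The function names are made up for the example.

```shell
#!/bin/sh
# Sketch: the "resource" is just a directory here.

teardown() {
    # rm -rf succeeds even if the directory is already gone.
    rm -rf -- "$1"
}

setup() {
    resource_dir=$1
    # Clear leftovers from an earlier, possibly interrupted run first.
    teardown "$resource_dir"
    # Plain mkdir (no -p) fails if the directory exists, much like a
    # "create" call that rejects duplicates.
    mkdir "$resource_dir"
}
```

Without the teardown call, a second run of setup would fail on the mkdir; with it, setup can be re-run any number of times.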
It makes sense to have retry configs for the scripts just like for tests. For example, we do terraform apply during the test setup (currently inside of each test in Rust) and that may fail in many spectacular ways, so we retry it in code.
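Until something like that exists at the config level, a script can retry on its own. The following is only a sketch of a generic wrapper; the retry function and the RETRY_DELAY variable are hypothetical names, not part of nextest.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
# RETRY_DELAY (seconds between attempts, default 1) is a made-up knob.
retry() {
    max_attempts=$1
    shift
    attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-1}"
    done
    return 0
}
```

A setup script could then call, for instance, `retry 3 terraform apply` with whatever arguments it normally passes.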
What about splitting before_all and before_each scripts? For example, all tests may depend on one docker container that would be created just once in before_all, but what if every test needs an isolated resource of its own as well? Having such setup/teardown logic outside of the process will be more robust because we are guaranteed the resources created in before_each will be torn down even if the tests abort or don't properly handle panics/early returns.
It would be great to support the before_all, before_each, after_each and after_all pattern in the same way that the jest runner does. However, generally in that runner the before_all/before_each is able to initialise some context that can then be used in the test functions. In the case of nextest there does not seem to be a way to access any context set by the runner itself.
With setup scripts, the main way to pass in context is via environment variables.
before_each and after_each aren't really supportable with the current way nextest works. Most people either initialize that context by hand or use a proc macro for that.
Hi, thanks for adding this experimental feature which I was able to use to solve the problem described at https://github.com/nextest-rs/nextest/issues/1466.
I have a script
#!/bin/bash
# Exit with 1 if NEXTEST_ENV isn't defined.
if [ -z "$NEXTEST_ENV" ]; then
    exit 1
fi
# Exit with 1 if $1 isn't set.
if [ -z "$1" ]; then
    exit 1
fi
# Write out an environment variable to $NEXTEST_ENV.
echo "RUST_MIN_STACK=$1" >> "$NEXTEST_ENV"
And then a bunch of limits added
# Default
[script.add-stack-limit-80000]
command = "scripts/add-stack-limit.sh 80000"
[[profile.default.scripts]]
filter = "all()"
setup = "add-stack-limit-80000"
[script.add-stack-limit-700000]
command = "scripts/add-stack-limit.sh 700000"
[[profile.default.scripts]]
filter = "package(some-expensive-lib)"
setup = "add-stack-limit-700000"
As I have about 12 [script.add-stack-limit-xxxxx], it would be nice if setup = "add-stack-limit-700000" could be written as setup = ["add-stack-limit", "700000"]
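In the meantime, one possible workaround is to generate the repeated blocks instead of writing them by hand. The sketch below prints one script table plus its profile entry in the same shape as the config above; emit_stack_limit is a hypothetical helper, and it assumes the filter contains no double quotes.

```shell
#!/bin/sh
# Hypothetical generator: emit_stack_limit LIMIT FILTER
# Prints the [script.*] table and the matching [[profile.default.scripts]]
# entry for one stack limit.
emit_stack_limit() {
    limit=$1
    filter=$2
    cat <<EOF
[script.add-stack-limit-$limit]
command = "scripts/add-stack-limit.sh $limit"

[[profile.default.scripts]]
filter = "$filter"
setup = "add-stack-limit-$limit"
EOF
}
```

Calling it once per limit and redirecting the output into the config file keeps the twelve entries in sync with a single list of limits.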
Run code/script before any tests / after all the tests
Usecases
Configuration suggestion/examples
With shell code
With rust code