Many benchmarks require configuration of environment variables (for example, to set the working directory, to load libraries through environment modules or virtualenvs, or for direct consumption by the benchmark). This is a very common pattern, but non-trivial to implement in a benchmark class and entirely inaccessible to GenericBenchmark.
Configuring the environment in the shell and allowing Mantis and its subprocesses to inherit it is sufficient for running single benchmarks, but will not work well for multiple benchmarks (since they may have conflicting requirements) and reduces scientific replicability (since important metadata goes unrecorded).
Therefore, Mantis should have some mechanism to handle this internally. Unfortunately, many academic benchmarks come with some form of 'setup' shell script, so we probably need to be able to work with arbitrary shell commands.
I propose the following API changes:
Benchmark classes should implement a get_run_env() function (default: return None) returning an environment mapping, which collectors will supply to their subprocess.run calls when running benchmarks.
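Roughly, I'm imagining something like this; the base Benchmark class, get_run_env, and subprocess.run come from the proposal above, while run_benchmark and cmd are just placeholder names for whatever the collector already does:

```python
import subprocess


class Benchmark:
    def get_run_env(self):
        """Return the environment mapping to use when running this benchmark.

        Default: None, i.e. the collector's subprocess.run call falls back to
        inheriting the current process environment, preserving today's
        behaviour for benchmarks that don't override this.
        """
        return None


# Illustrative collector-side usage (not existing Mantis code):
def run_benchmark(benchmark, cmd):
    # subprocess.run(env=None) inherits os.environ, so the default is a no-op.
    return subprocess.run(cmd, env=benchmark.get_run_env())
```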
Benchmark should implement a _prep(commands, env=None) function which accepts one or more shell commands, runs them in series, and returns the resulting environment mapping. (We can read it from a final env command redirected somewhere convenient.)
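Continuing the sketch, _prep might look roughly like this. The '&&' chaining and the temp-file dump are one possible reading of the "final env command redirected somewhere convenient" idea, not settled design:

```python
import os
import subprocess
import tempfile


class Benchmark:
    def _prep(self, commands, env=None):
        """Run shell commands in series and return the resulting environment.

        Sketch only: commands are chained with '&&', followed by a final
        'env' redirected to a temporary file that we parse back into a dict.
        (Values containing embedded newlines are not handled here.)
        """
        if isinstance(commands, str):
            commands = [commands]
        fd, dump_path = tempfile.mkstemp()
        os.close(fd)
        try:
            script = " && ".join(list(commands) + [f"env > '{dump_path}'"])
            # Non-zero exit handling is still an open question (see #34);
            # check=True simply raises for now.
            subprocess.run(script, shell=True, env=env, check=True)
            result = {}
            with open(dump_path) as f:
                for line in f:
                    key, sep, value = line.rstrip("\n").partition("=")
                    if sep:
                        result[key] = value
            return result
        finally:
            os.unlink(dump_path)
```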
Documentation note: with shell=True, Python's subprocess runs commands under sh, not bash, so setup scripts must be sourced with the POSIX . command rather than the bash-only source.
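For example (setup.sh and the exported variable are made-up), a command list passed to _prep would need the POSIX form:

```python
# Bash-only, fails under sh:   "source setup.sh"
# POSIX-compliant equivalent:  ". ./setup.sh"
commands = [". ./setup.sh", "export OMP_NUM_THREADS=4"]
```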
Implementation question: how should we handle a command returning a non-zero exit status? Continue, raise, or raise but handle it in monitor and move on to the next benchmark? (See #34.)
GenericBenchmark should accept configuration options before_all/before_each/after_each/after_all, each of whose values should be a list of shell commands; it will implement the corresponding functions by calling _prep on the list and saving the resulting environment mapping to be returned from get_run_env. (The same mapping can be used for subsequent calls to _prep, giving each benchmark instance an intuitively 'persistent' environment.)
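A rough sketch of how GenericBenchmark might wire this together. The config keys are the options proposed above; the hook method names, the plain-dict config access, and the _run_stage helper are assumptions for illustration, and _prep/Benchmark are as sketched earlier:

```python
class GenericBenchmark(Benchmark):  # Benchmark as sketched above
    def __init__(self, config):
        super().__init__()
        self.config = config  # assumed to be a plain dict of options
        self._env = None      # persists across stages for this instance

    def _run_stage(self, stage):
        commands = self.config.get(stage, [])
        if commands:
            # Feed the previous mapping back into _prep so each instance sees
            # an intuitively 'persistent' environment across stages.
            self._env = self._prep(commands, env=self._env)

    def before_all(self):
        self._run_stage("before_all")

    def before_each(self):
        self._run_stage("before_each")

    def after_each(self):
        self._run_stage("after_each")

    def after_all(self):
        self._run_stage("after_all")

    def get_run_env(self):
        return self._env
```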
Other utility functions and corresponding GenericBenchmark configuration—e.g. setting the working directory, setting a list of environment variables directly, or using specified lmod modules—would not add power, but may be worth providing for convenience.
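For instance, one convenience layer could simply translate such options into shell commands and hand them to _prep; the option names here (env_vars, modules, workdir) are invented for illustration:

```python
def commands_from_options(options):
    """Translate convenience options into plain shell commands for _prep.

    'env_vars' and 'modules' map cleanly onto shell commands; a 'workdir'
    option would probably be better served by passing cwd= to subprocess.run
    directly, since an exported PWD does not change a child's working directory.
    """
    commands = []
    for name, value in options.get("env_vars", {}).items():
        commands.append(f"export {name}='{value}'")
    for module in options.get("modules", []):
        commands.append(f"module load {module}")
    return commands
```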
Thoughts?