jimporter / mettle

A C++20 unit test framework
https://jimporter.github.io/mettle
BSD 3-Clause "New" or "Revised" License

Detecting replicants using --runs #48

Open 12AT7 opened 3 years ago

12AT7 commented 3 years ago

I would like to write a setup() or some other fixture that sets a unique random number seed for each replicant launched with --runs. Hence, if I used --runs 5 then I would get five identical tests, except each test would have a distinct pseudo-random sequence of draws. The five sequences would be the same on every run.

Is this possible?

jimporter commented 3 years ago

How do you want the seed to work for each test within the suite? For a particular run, should a) each test start with the same seed, b) each test generate a new seed for itself, or c) each suite generate a seed once, with every test contained within it starting from that seed?

12AT7 commented 3 years ago

Hmm, clearly there would be multiple ways to play this, but my initial idea was to gain access to the replicant number in the setup() fixture. This "replicant number" is the same information that prints "Test run [#3/5]" when using --runs; it is the "3" in particular.

I suppose we need to be careful about the terms, or the message gets mixed up. I am assuming that there are one or more "tests" defined in the suite by _.test(...), and each of these entities gets executed five times. These repetitions are called "replicants", so in this example we would draw five replicants from each test. I believe in this case that setup() also gets executed five times per test, once per replicant.

The setup() would run once per replicant, so each replicant would compute and configure a unique seed (this is the main point of this whole issue). So, the exact same "test" would run five times, each with a different pseudo-random sequence. This is basically the exact definition of statistical replication. I think this is the same spirit as the original use case of --runs, except that instead of checking for variations caused by some external randomness (like network latency or whatever), I am going to force the randomness by numerically adjusting the seed used subsequently to draw numbers. To do this, I need to get some kind of unique value per replicant, like "3" or some similar identifier. Normally, using information like this would be bad unit testing practice, but it feels OK for this specific use case of setting a random number seed when writing Monte Carlo or machine learning tests.

So to answer your specific questions, I think what I want is for every replicant of a specific test to always start with the same seed, generated by a unique identifier like "3". It is OK if different tests in the suite use the same seed, as long as their replicants are different within the test. So it is OK if test1, replicant3 has the same seed as test2, replicant3. But test1, replicant2 must have a different seed than test1, replicant3, and so on.

It is also OK if each test runs a different pseudo-random sequence from the others; I don't care as much about this, because the draws are going to affect different stuff in each test. I suppose it is slightly preferable to have completely unique sequences across tests too, but this would not be worth a lot of pain to get if it was hard.
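
A minimal sketch of the seeding scheme described above, assuming the replicant number were somehow available as an integer. The names `replicant_seed`, `make_replicant_rng`, and `run_index` are hypothetical and not part of mettle; getting the replicant number into setup() is exactly what this issue asks for. The hash and mixing steps (FNV-1a and a splitmix64-style finalizer) are just one reasonable choice.

```cpp
#include <cstdint>
#include <random>
#include <string_view>

// Hypothetical helper, not part of mettle.
// Mixes a test name and a replicant number into a reproducible 64-bit seed.
std::uint64_t replicant_seed(std::string_view test_name,
                             std::uint64_t run_index) {
  // FNV-1a hash of the test name. Used instead of std::hash because
  // std::hash values are not guaranteed to be stable across platforms.
  std::uint64_t h = 14695981039346656037ull;
  for (unsigned char c : test_name) {
    h ^= c;
    h *= 1099511628211ull;
  }
  // Offset by the replicant number, then apply a splitmix64-style finalizer.
  // For a fixed test name, distinct run_index values give distinct seeds,
  // because every step here is invertible modulo 2^64.
  std::uint64_t z = h + run_index * 0x9e3779b97f4a7c15ull;
  z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ull;
  z = (z ^ (z >> 27)) * 0x94d049bb133111ebull;
  return z ^ (z >> 31);
}

// Builds an engine whose sequence depends only on (test_name, run_index).
std::mt19937_64 make_replicant_rng(std::string_view test_name,
                                   std::uint64_t run_index) {
  return std::mt19937_64(replicant_seed(test_name, run_index));
}
```

With this, replicant 2 and replicant 3 of the same test always get different seeds, and the same replicant gets the same seed on every invocation. Passing the same name for every test would give the simpler variant where test1, replicant3 and test2, replicant3 share a seed.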

jimporter commented 3 years ago

Well, the easy way is to pick your favorite way to generate a unique seed in setup() and just use that. However, that means that each test within the suite will generate a unique seed. For common cases of testing randomness, that's probably fine in general, but that might not be enough in your case.

If you want to reuse the same seed for each test within the suite, that seems like something that you could solve fairly easily with the (currently nonexistent) setup_suite function. That's on my backlog of ideas to add, but it was waiting for an actual use case (no sense adding features if I can't think of a use for them!). With that added, I think you could do something like the following:

suite<int> my_suite("my random suite", [](auto &_) {
  _.setup_suite([](int &seed) {
    seed = generate_seed();
  });

  _.setup([](int &seed) {
    set_seed(seed);
  });

  _.test("test 1", [](int &) { /* ... */ });
  _.test("test 2", [](int &) { /* ... */ });
  // ...
});

That would run the following, once for each run:

  1. setup_suite()
  2. setup() for test 1
  3. test 1
  4. setup() for test 2
  5. test 2

A similar thing would happen for teardown as well, of course.
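
The ordering above can be modelled with a small standalone sketch. This does not use mettle (setup_suite() does not exist there yet); `trace_runs` is a hypothetical function that only records the order in which the steps would happen for a given number of runs.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the proposed ordering; not mettle code.
// Returns the sequence of steps for `runs` repetitions of a suite.
std::vector<std::string> trace_runs(int runs,
                                    const std::vector<std::string> &tests) {
  std::vector<std::string> trace;
  for (int run = 1; run <= runs; ++run) {
    const std::string prefix = "run " + std::to_string(run) + ": ";
    // Once per run: the suite-level setup picks the seed.
    trace.push_back(prefix + "setup_suite()");
    for (const auto &name : tests) {
      // Once per test: the per-test setup reapplies that seed.
      trace.push_back(prefix + "setup() for " + name);
      trace.push_back(prefix + name);
    }
  }
  return trace;
}
```

Calling `trace_runs(2, {"test 1", "test 2"})` yields ten entries: the five steps listed above, repeated once for each run.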

Note that setup_suite() has one subtle difference from "do something once per run": if the suite is parameterized, it will do something once for each of the parameters. That's easy to avoid if you don't want it, though; just don't parameterize the suite, and if you do need something to be parameterized, do it in a subsuite.
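
The snippet above leaves generate_seed() and set_seed() undefined. One possible pair of definitions is sketched below, assuming the code under test draws from a single shared engine. The `global_engine` accessor is hypothetical, and none of this is provided by mettle.

```cpp
#include <random>

// Assumption: the code under test draws from this one shared engine.
inline std::mt19937 &global_engine() {
  static std::mt19937 engine;
  return engine;
}

// Hypothetical body for generate_seed(): a fresh, non-negative seed taken
// from the platform's entropy source.
int generate_seed() {
  std::random_device rd;
  // Mask to 31 bits so the value always fits in a non-negative int.
  return static_cast<int>(rd() & 0x7fffffffu);
}

// Hypothetical body for set_seed(): reseeds the shared engine, so every
// test that starts from the same seed sees the same sequence of draws.
void set_seed(int seed) {
  global_engine().seed(static_cast<std::mt19937::result_type>(seed));
}
```

Because this generate_seed() reads from std::random_device, the seed changes on every invocation of the test binary. To get the same five sequences on every invocation, as the original request asks, the seed would have to be derived from something deterministic, such as the run number.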