lambdaisland / kaocha

Full featured next gen Clojure test runner
https://cljdoc.org/d/lambdaisland/kaocha/1.0.861/doc/1-introduction
Eclipse Public License 1.0

Augmenting report/fail-summary output and preserving post-test testable adjustments #260

Closed rbrw closed 2 years ago

rbrw commented 2 years ago

I was attempting to capture a per-test start and end time, so that I could refer to them during reporting. More broadly, I was attempting to augment the normal failure report with any (postgres) log entries that occurred while the test was executing. If nothing else, this was intended for CI systems that only show you a unified log, in order to allow you to see both the JVM and postgres output that's likely relevant to the failure.

I believe I got fairly close. I was able to assoc a start time into the testable in a pre-test hook, which carries through to the report/fail-summary argument, but an end time assoced in the post-test hook appears to be lost. Assuming I didn't make a mistake, I wondered if it might be plausible to carry the testable returned by post-test hooks through to the fail-summary argument too, or maybe there's some better way to accomplish what I was attempting?
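
Concretely, the shape I mean is roughly this, using the standard hooks plugin (a sketch only; the my.hooks namespace and the ::start-time/::end-time keys are just illustrative names):

;; tests.edn (sketch)
#kaocha/v1
{:plugins [:kaocha.plugin/hooks]
 :kaocha.hooks/pre-test  [my.hooks/note-start]
 :kaocha.hooks/post-test [my.hooks/note-end]}

;; my/hooks.clj (sketch)
(ns my.hooks)

;; Per-test hooks receive the testable and the test plan, and should
;; return the (possibly updated) testable.
(defn note-start [testable _test-plan]
  (assoc testable ::start-time (java.time.Instant/now)))

(defn note-end [testable _test-plan]
  (assoc testable ::end-time (java.time.Instant/now)))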

For the reporting, I just copy/pasted the existing report/fail-summary, and added a call to one additional function that extracts/prints any postgres log entries for the relevant period.

If there's not already, it might be nice to have an easier way to just "append" other output to the default failure report. Or perhaps the current reporting code could be relocated to a standalone public function that the fail-summary defmethod calls, so that anyone overriding the defmethod can call the public function directly to handle that part of the output.

alysbrooks commented 2 years ago

@rbrw Thanks for reporting your experiences!

an end time assoced in the post-test hook appears to be lost.

This should definitely work—the official profiling and gc-profiling plugins use the post-test hook this way. If you can share the code, I can take a look.

If there's not already, it might be nice to have an easier way to just "append" other output to the default failure report.

I believe you can just create your own multimethod (representing your reporter) with a method for :summary and add it to your list of reporters. Both your fail-summary and the default one should appear in the output. That said, since you already have a hook, I think using :post-summary makes more sense. I don't know whether that counts as an "easier way"; it's not a lot of code, but it does require some knowledge of how Kaocha works.
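
For example, the reporter entry in tests.edn can just list both (with my.reporter/extra-summary standing in for whatever multimethod you define):

#kaocha/v1
{:kaocha/reporter [kaocha.report/documentation
                   my.reporter/extra-summary]}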

In case you missed it, we have a section in the docs that goes over your options for working with fail-summary. It should probably mention the option of using the :post-summary hook.

rbrw commented 2 years ago

Hmm, I saw those docs, but may not have understood them carefully enough yet. In this case all I wanted was a chance to print some additional information for every existing failure directly after the normal kaocha output.

Here's both an example of that, and hopefully a demonstration of the issue: https://github.com/rbrw/kaocha/tree/test-post-hook-assoc. Note that's a branch that just contains a test program, not really a branch of kaocha. Running lein kaocha there should show that the test-start-time key makes it through, but the test-end-time key doesn't.

And that includes an example of the (certainly questionable/fragile) hack that I was using to append the additional output: https://github.com/rbrw/kaocha/blob/test-post-hook-assoc/test/post_hook/kaocha.clj#L18-L31
I felt fairly sure I'd want a better approach, but hadn't gotten to it yet.

Thanks

alysbrooks commented 2 years ago

Ahh, I misunderstood exactly what you were going for. So :post-test-run happens after results are reported for that test. I think we could add a hook :post-test-summary for this case.

I think the non-fragile approach is to:

  1. change your :kaocha/fail-type method to just cover what you want added
  2. create a new multimethod for your reporter along the lines of (defmulti doc :type :hierarchy #'hierarchy/hierarchy)
  3. include both your new method (which is basically a custom reporter) and your usual reporter (no longer being overridden) in the list of reporters.

Hopefully that makes sense?
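
In rough outline (names here are purely illustrative, and the println is only a placeholder for whatever extracts and prints your Postgres log):

(ns my.reporter
  (:require [kaocha.hierarchy :as hierarchy]))

;; A reporter is a multimethod dispatching on the event's :type.
;; Only :kaocha/fail-type gets a real method; everything else falls
;; through to the no-op :default, so the default reporter still handles
;; the rest of the output.
(defmulti pg-log-reporter :type :hierarchy #'hierarchy/hierarchy)

(defmethod pg-log-reporter :default [_])

(defmethod pg-log-reporter :kaocha/fail-type [m]
  ;; placeholder: print whatever Postgres log slice is relevant here
  (println "Postgres output for this failure goes here"))

Step 3 is then just adding my.reporter/pg-log-reporter next to your usual reporter in the reporter list in tests.edn.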

rbrw commented 2 years ago

I may still not understand entirely; i.e., I'm not sure why I would need a custom fail-type. I want my reporter to print its output (if it detects that there is any after scanning the pg log) after the normal failure output for every test. That is, I want the normal FAIL: ... reporter to run for the failure, and then mine.

I also wasn't sure I understood the post-test situation -- it looked like it was running right after each test, and that's what I'd need in order to capture the end-time (doesn't have to be precise, just "before the next test starts").

Also, for now, I worked around the end assoc issue by just maintaining the start/end times in a completely independent map in a global atom (assuming, of course, that post-test at least runs between tests).
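
i.e., something roughly like this (again just a sketch; I'm assuming the testable's :kaocha.testable/id is a reasonable key to use):

(ns my.hooks)

;; Workaround: keep the times in a global atom keyed by test id,
;; instead of assoc'ing them onto the testable.
(def test-times (atom {}))

(defn note-start [testable _test-plan]
  (swap! test-times assoc-in
         [(:kaocha.testable/id testable) :start] (java.time.Instant/now))
  testable)

(defn note-end [testable _test-plan]
  (swap! test-times assoc-in
         [(:kaocha.testable/id testable) :end] (java.time.Instant/now))
  testable)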

alysbrooks commented 2 years ago

Thinking of it as a custom fail-type is probably confusing. What you're doing is creating a new reporter that only has a method for :kaocha/fail-type (reporters are multimethods). When there's a test failure, Kaocha will call both your reporter's :kaocha/fail-type method and the default reporter's :kaocha/fail-type method. For other reporting "events," it'll ignore your custom reporter because there's no method defined for those events.

The post-test hook almost runs right after the test, but not quite—the reporter runs first and then the hook runs. A single test run looks something like this:

  1. pre-test-run hook runs
  2. The test itself is run
  3. The :kaocha/fail-type method of your reporters runs
  4. post-test-run hook runs

rbrw commented 2 years ago

Oh, I think I see; I just misunderstood. I'm not extending/augmenting the existing method, I'm creating a completely new one that just has a method for :kaocha/fail-type, and then presumably adding it to the reporters in the config file.

And given when the post-test-run hook runs, I suppose I'll have to capture the end time at the top of the reporter.

alysbrooks commented 2 years ago

Yep, exactly! If you have any ideas on how to clarify our docs, I'd appreciate them.

rbrw commented 2 years ago

While trying the defmulti approach, I'm wondering whether I'm heading in not quite the right direction, and/or hadn't thought this all the way through. At the very top level, as mentioned, we'd like a way to print just the subset of the postgresql log that covers the time a particular test was running, whenever that test fails.

Which I suppose raises questions about what "the time a particular test was running" means. With the defmulti fail-type approach, it repeated the report once per failed is assertion (here, we're using the documentation reporter and the custom one, and there were two failed assertions -- this was for the second one):

FAIL in some.misc-test/random-test (misc_test.clj:216)
something random
expected: false
  actual: false
╭───── Test output ───────────────────────────────────────────────────────
│ ╭───── Postgres output ───────────────────────────────────────────────────
│ │ 2021-11-16 15:01:20.056 CST [16793] ERROR:  relation "nope" does not exist at character 18
│ │ 2021-11-16 15:01:20.056 CST [16793] STATEMENT:  select nope from nope
│ ╰─────────────────────────────────────────────────────────────────────────
│ ╭───── Postgres output ───────────────────────────────────────────────────
│ │ 2021-11-16 15:01:20.056 CST [16793] ERROR:  relation "nope" does not exist at character 18
│ │ 2021-11-16 15:01:20.056 CST [16793] STATEMENT:  select nope from nope
│ ╰─────────────────────────────────────────────────────────────────────────
│ :something-printed-to-out-during-the-failure
╰─────────────────────────────────────────────────────────────────────────

And delimiting the capture by the start/end of each is form (if that's what it was doing) is likely to be much too narrow most of the time, since the assertions are often testing "state after the fact", and any pg errors happened earlier.

I suppose one option might be to have a way to delimit a "region of interest", but for now, even just capturing the start and end time of each deftest, and printing the related pg log entries (ideally once) for each deftest that has any failed assertions, would be quite a bit better than the current situation where (in CI) we see the entire pg log intermixed. Or maybe there's some better approach?

rbrw commented 2 years ago

I'd also originally imagined having the pg output completely separate, at the top level, after the "Test output", but not sure if that'd be feasible/appropriate, e.g.:

FAIL in some.misc-test/random-test (misc_test.clj:216)
something random
expected: false
  actual: false
╭───── Test output ───────────────────────────────────────────────────────
│ :something-printed-to-out-during-the-test
╰─────────────────────────────────────────────────────────────────────────
╭───── Postgres output ───────────────────────────────────────────────────
│ 2021-11-16 15:01:20.056 CST [16793] ERROR:  relation "nope" does not exist at character 18
│ 2021-11-16 15:01:20.056 CST [16793] STATEMENT:  select nope from nope
╰─────────────────────────────────────────────────────────────────────────

rbrw commented 2 years ago

After a bit more consideration, I'd say feel free to close this if you like. I think I need to spend a bit more time pondering what we really want, and thanks for the help.

alysbrooks commented 2 years ago

You're welcome! I think I'll close this once we decide whether we want to add a new hook.

alysbrooks commented 2 years ago

@rbrw Just had a random thought. Do you need an end time at all? If you're printing the Postgres log entries immediately after the test fails, can't you just look for all entries after the test's start time?

rbrw commented 2 years ago

At least in the current arrangement, the Postgres server runs completely independently of the test run, and the log for an entire run can be very large, so entries after a given test's start time aren't necessarily limited to that test. The intent was to include only the regions of the log that are related to failures.

rbrw commented 2 years ago

(...which raises questions about "regions". As mentioned, at a minimum, and perhaps without more explicit annotations, it might at least be better (in our current situation) for us to limit the output to the log region covering each failed deftest form, so we'd need those start and end times, but I'm still planning to think about it a bit more.) Thanks again.