mirage / alcotest


Custom diff support #194

Open Lupus opened 5 years ago

Lupus commented 5 years ago

There is a screenshot in the readme showing custom diff output, but I can't seem to find an API for that in master. I'm writing a lot of tests which use a sexp checker, and when sexps differ only a tiny bit, it's very hard to spot what's wrong with the test case in the current output. I wanted to plug in patdiff from Jane Street for custom diffs. Is that currently possible? Or am I missing something?

Thanks!

hannesm commented 5 years ago

FWIW JS sexplib provides a pp_hum that emits line breaks, so a diff of two very similar s-expressions is usually much more readable. (And sorry, I don't have an answer for your question.)

Lupus commented 5 years ago

Indeed, there is human-oriented pretty-printing; that's what I'm currently using. But alcotest gives me an "expected %s, got %s" style of output, which makes it hard to spot the difference even with pretty-printed S-expressions. I want it to give me "diff between expected and actual sexp: %s".

samoht commented 5 years ago

Maybe you can test that the diff is empty and, if not, print the difference using patdiff? e.g.

expected <empty>, got <patdiff output>

To do this you need to define testable diffs (with an equal and print function).
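
(For reference, a minimal sketch of defining such a testable for S-expressions via Alcotest.testable, assuming sexplib is available; a fuller implementation of the idea above would make the printer emit a patdiff of the two pretty-printed forms rather than the raw values.)

(* Sketch only: a sexp testable whose printer uses pp_hum, so a failing
   check shows both values in a line-broken form that is easy to diff. *)
let sexp : Sexplib.Sexp.t Alcotest.testable =
  Alcotest.testable Sexplib.Sexp.pp_hum
    (fun a b -> Sexplib.Sexp.compare a b = 0)

let () =
  let expected = Sexplib.Sexp.of_string "(foo (bar 1) (baz 2))" in
  let actual = Sexplib.Sexp.of_string "(foo (bar 1) (baz 3))" in
  Alcotest.run "sexp-diff"
    [ ("example",
       [ Alcotest.test_case "compare" `Quick (fun () ->
             Alcotest.(check sexp) "same sexp" expected actual) ]) ]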

Lupus commented 5 years ago

I can come up with my own check function that compares sexps and, if they differ, pretty-prints them and runs patdiff, which will just print the diff to stdout (even with colours while running tests in verbose mode, I believe), and then fails with some message. Basically that's what alcotest's check does :) But I thought that there is some API for doing that within alcotest, given the custom diff image in the readme. Otherwise I'll have to extract this custom check into some library, which I would like to reuse in several repos that need tests...
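
(For the record, a rough sketch of such a check; the names here are illustrative, not an Alcotest API, and it assumes sexplib is available and patdiff is on PATH.)

(* Sketch only: compare two sexps; on mismatch, dump the pretty-printed
   forms to temporary files, shell out to patdiff, and fail the test. *)
let check_sexp ~msg expected actual =
  if Sexplib.Sexp.compare expected actual <> 0 then begin
    let dump sexp =
      let file = Filename.temp_file "alcotest" ".sexp" in
      let oc = open_out file in
      output_string oc (Sexplib.Sexp.to_string_hum sexp);
      close_out oc;
      file
    in
    let a = dump expected and b = dump actual in
    ignore (Sys.command (Printf.sprintf "patdiff %s %s"
                           (Filename.quote a) (Filename.quote b)));
    Alcotest.failf "%s: sexps differ (see diff above)" msg
  end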

samoht commented 5 years ago

The image is using a custom diff, that's not handled by alcotest.

It's hard to come up with a proper diff algorithm which works well on every structure -- although maybe a textual diff of the pp output is enough in most cases? I'm happy to include patches doing this in alcotest, but we probably need to think a bit about the design goals first.

For instance, do you want a Sexp module in alcotest that prints the diff in case of error, or something more generic that can be combined using the testable combinators?

Lupus commented 5 years ago

A textual diff of the pp output should be good enough, maybe with the ability to pass a custom diff tool via options/environment/config files, akin to what git does when the user asks to show a diff of something. It would be super awesome to be able to dune promote the expected output back to the file, akin to ppx_expect :)

Actually I'm using S-expressions more and more often lately, by just copy-pasting pp and equal for Sexp.t. It's so lightweight and straightforward.

Maybe an example could be added showing how one could create custom diff output, and we could see where that goes as a start?

Lupus commented 4 years ago

The more I think about it, the more I realise that support for expect tests similar to ppx_expect directly in alcotest would be awesome.

Alcotest already captures output, and will even support capturing output on various platforms after https://github.com/mirage/alcotest/pull/238 is merged. ppx_expect only works on the Unix platform, and thus can't run expect tests on the js platform, for example.

Expect tests are sometimes the most lightweight way to get some tests going, especially when large types are involved (like ASTs or large JSON bodies): one can just pretty-print them and have a nice diff produced by dune runtest.

craigfe commented 4 years ago

I was thinking about this just today. A ppx_alcotest seems very sensible, and there are plenty of things it could add other than expect tests (deriving testables inline from types, deriving test-suite construction boilerplate, adding code locations to check/fail calls, etc.).

I also have some ideas in mind for how to better expose diffing to end users. The first step for that will be to abstract the Alcotest.TESTABLE module type (and require existing users to construct testables via a functor from the old signature, or via Alcotest.testable). I'll report back if I get any further with that line of thought.

aantron commented 3 years ago

I want to add a vote in favor of textual diffing of S-expressions.

For example, odoc's parser is tested almost entirely by feeding it input strings, getting the resulting ASTs, converting them to S-expressions, pretty-printing them, and checking them against expected S-expressions. This is currently integrated into an overall Alcotest test suite, but it's a little awkward, because there is manual code for loading these S-expression files, diffing against them, promoting them, etc. It would be nice if Alcotest had some support for this. There are almost 900 input strings and the same number of files.
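
(For concreteness, a hedged sketch of that pattern; sexp_of_input stands in for the project's parse-then-convert step, and the expect/ directory layout is invented for illustration, not odoc's actual test code.)

(* Sketch only: run the input through the sexp converter, pretty-print it,
   and check it against the stored expectation file. *)
let read_file path =
  let ic = open_in path in
  let n = in_channel_length ic in
  let s = really_input_string ic n in
  close_in ic;
  s

let expect_case ~sexp_of_input name input =
  Alcotest.test_case name `Quick (fun () ->
      let actual = Sexplib.Sexp.to_string_hum (sexp_of_input input) in
      let expected = read_file (Filename.concat "expect" (name ^ ".sexp")) in
      Alcotest.(check string) name expected actual)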

This works really well. In agreement with @Lupus, this is practically the only way to test even slightly complicated outputs at scale, especially for data structures whose definitions will change during development. It only requires maintaining a to-S-expression converter, which is easier than a to-string converter. "Normal", non-expect tests would be extremely cumbersome here, because one would have to (1) maintain the comparison functions, (2) work on making the diffs nice, and (3) maintain a huge number of expected values. Expect testing replaces (1) with string comparison, (2) with string diffs, and (3) with output promotion.

In case of poor diff quality (e.g. a huge number of elements printed on one line), it's only a matter of improving or configuring the S-expression pretty printer.

I always try to use expect tests in my projects now, because they are orders of magnitude more practical and productive than non-expect testing. Besides the speed, it's also much easier to read and understand the test cases.

aantron commented 3 years ago

To add more detail: during odoc parser development, I would make changes to the parser and the AST type that broke a large proportion of the test cases, quickly patch the S-exp converter, and then work through the promotions in minutes, reviewing every single diff along the way.

aantron commented 3 years ago

As for ppx_alcotest, I think that should be optional. In some cases, it is nice to have the expected output stored in the test source file like with ppx_expect. However, in other cases, the test output will appear as noise to readers of the tests. I think Alcotest could optionally support storing the test output in files (or concatenated in one file somehow, if that is better for I/O purposes). For example, when writing the odoc tests, I just wanted to see a "table of contents" of all the inputs being tested, and did not care to see the S-expressions unless there was a diff (i.e. a test failure).

Similarly, if I write some tests that dump huge execution traces as their to-be-diffed output, I wouldn't want those traces to appear in the test source files.

For performance reasons, it's important to be able to capture output as strings and in buffers, whenever the Alcotest user is able to provide the output that way.

Lupus commented 3 years ago

It really depends on the size of the output. When it's relatively small, it might be more readable inline with the test case itself; if it's an AST dump, it makes scrolling through the test cases hard. In any case the workflow should stay the same: run tests, glance through the diff, optionally fix code in case of bugs, promote the new test output.

I would also be happy with an option to store test outputs separately :blush:

davesnx commented 2 years ago

Jumping in on this issue as well. I wanted to change the output of alcotest and render a diff similar to git diff, but with expected and actual.

I ended up creating my own check function (https://github.com/mirage/alcotest/blob/bb3492901dea03c72b4de6b5660852a020283921/src/alcotest-engine/test.mli#L132) by importing check and all its dependencies from alcotest. I was lucky enough that the failure is triggered by an exception, which I could hook with Printexc.register_printer.

The output looks like this: (screenshot, 2022-08-24)

It's a work in progress and half of it is broken (https://github.com/davesnx/styled-ppx/blob/create-snapshot-lib/packages/instant/src/main.ml), but it will be something I use in a few places.

It would be nice to have this functionality in alcotest, where you could plug in a renderer to pretty-print the output as I did.
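
(For illustration, a hedged sketch of that kind of hook; the exception and renderer below are invented for the example, not alcotest's or davesnx's actual code.)

(* Sketch only: a custom check raises its own exception carrying both
   values, and a registered printer renders them as a diff-style view. *)
exception Snapshot_mismatch of { expected : string; actual : string }

let render_diff ~expected ~actual =
  (* trivial rendering; a real version would produce a git-diff-style view *)
  Printf.sprintf "--- expected\n%s\n+++ actual\n%s" expected actual

let () =
  Printexc.register_printer (function
    | Snapshot_mismatch { expected; actual } ->
        Some (render_diff ~expected ~actual)
    | _ -> None)

let check_snapshot ~expected ~actual =
  if not (String.equal expected actual) then
    raise (Snapshot_mismatch { expected; actual })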

mor1 commented 2 years ago

(Also jumping in with little to add except: would https://github.com/Cjen1/ocaml-tree-diff be at all useful in addressing this? Generic, precise tree-diff implementation that @cjen1 did for his undergrad project. Tested up to 10^4 nodes in the tree IIRC.)

samoht commented 2 years ago

Adding a custom diff printer for alcotest is definitely a good idea.

Lupus commented 2 years ago

The next ask would be a way to promote the new outcome as the correct one :)

We've ended up with the following: a [%test.id] extension gets expanded to a test id record, which includes the test id itself (created from the current file name and the test sequence number), the contents of the file with the expected test output (the file name is derived from the test id), and the file name where the corrected output should be placed. Then some output is captured as JSON or a plain text string and passed to a check function along with that test id record. The check function checks whether the output matches the expected one; if not, it prints a diff with the patdiff library and saves the new content to the location emitted by the ppx in the test id record. To help with promotions, we've created a small tool that scans the _build dir for correction files matching a certain pattern and copies them over to the source tree.
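
(A hedged sketch of what such a record might look like; the field names are guesses for illustration, not the actual ppx output.)

(* Sketch only: the kind of record the [%test.id] extension could expand to. *)
type test_id = {
  id : string;                (* current file name + test sequence number *)
  expected : string option;   (* contents of the expected-output file, if any *)
  correction_path : string;   (* where the corrected output should be written *)
}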

So far this works really well. Alcotest is better suited for large test suites than ppx_expect, but expect-style tests are still valuable when there's large output that you don't want to maintain by hand.

yawaramin commented 2 years ago

I'd like to jump in here and suggest an alternative using git diff. I feel like there's no need to reinvent the wheel when we can just outsource the text-diffing job to git, especially since we'll be committing the snapshots into our repos anyway.

Here's the workflow I'm suggesting:

Test code

Adapted from https://github.com/mirage/alcotest#examples

let () =
  let open Alcotest in
  let pp_string = pp string in
  run "Utils" [
    "string-case", [
      (* test_snapshot is the proposed new function *)
      test_snapshot "lower case" pp_string (To_test.lowercase "hELLO!");
      test_snapshot "capitalization" pp_string (To_test.capitalize "world.");
    ];
  ]

(No PPX needed)

Output snapshot files and their contents

File: snapshot/lower case
Content: hello!

File: snapshot/capitalization
Content: World.

Diffing and updating

We check the snapshot directory (name configurable) into version control. Then, on the next test run, if a snapshot differs from the test output, we print a coloured git diff (no colour if we detect we're running in CI) and fail the test. To update a snapshot, simply delete the snapshot file and rerun the test.
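
(A hedged sketch of how that diffing/updating step could look; check_git_snapshot and the file layout are invented for illustration, and the CI/colour detection is omitted.)

(* Sketch only: record the output on first run; afterwards, let git render
   the diff against the stored snapshot and fail the test if they differ. *)
let write_file path contents =
  let oc = open_out path in
  output_string oc contents;
  close_out oc

let check_git_snapshot ~snapshot_file actual =
  if not (Sys.file_exists snapshot_file) then
    (* First run, or the snapshot was deleted to request an update. *)
    write_file snapshot_file actual
  else begin
    let tmp = Filename.temp_file "snapshot" ".actual" in
    write_file tmp actual;
    let cmd =
      Printf.sprintf "git diff --no-index --color --exit-code %s %s"
        (Filename.quote snapshot_file) (Filename.quote tmp)
    in
    if Sys.command cmd <> 0 then
      Alcotest.failf "snapshot %s differs from test output" snapshot_file
  end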

I've been using basically this method in a project at work for a few months now and it's been working quite well. The git diffs are readable (obviously, since basically everyone is reading them nowadays) and 'good enough' for the majority of use cases. We just need to ensure we pretty-print our data in a way that makes it amenable to diffing, i.e. with line breaks in sensible places. E.g. with JSON, a normal JSON pretty-print would already be very diffable.