Add Python runner to accumulate JUnit XML formatted logs

jimporter / mettle

A C++20 unit test framework

https://jimporter.github.io/mettle

BSD 3-Clause "New" or "Revised" License


Add Python runner to accumulate JUnit XML formatted logs #37

Closed: 12AT7 closed this 6 years ago

12AT7 commented 6 years ago

This PR addresses Mettle issue #14 ("xUnit Driver") and uses the '--output-fd' technique suggested by Jim Porter. I do not really expect this PR to be accepted as is, but discussion about it will be helpful.

Mettle cannot generate pretty reports in continuous integration systems, because the standard format used by tools like Jenkins and CircleCI is the XML format that originated with JUnit, and Mettle cannot write this format. This PR adds a new runner, mettle_junit.py, intended to be similar to mettle; rather than producing pretty console output, it writes a set of XML files that emulate JUnit. This is useful for automated continuous integration systems, not for interactive use, where mettle is the ticket.
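As a rough illustration of the kind of file such a runner writes, here is a minimal sketch using only the Python standard library. The function name `build_testsuite` and the `(test_name, passed, message)` tuple shape are assumptions made for this example; they are not taken from mettle_junit.py.

```python
import xml.etree.ElementTree as ET


def build_testsuite(name, results):
    """Build a JUnit-style <testsuite> element.

    `results` is an iterable of (test_name, passed, message) tuples.
    This tuple shape is hypothetical and chosen only for the sketch.
    """
    results = list(results)
    failures = sum(1 for _, passed, _ in results if not passed)
    suite = ET.Element("testsuite", {
        "name": name,
        "tests": str(len(results)),
        "failures": str(failures),
    })
    for test_name, passed, message in results:
        case = ET.SubElement(suite, "testcase",
                             {"classname": name, "name": test_name})
        if not passed:
            # A failed test carries a <failure> child with the message.
            ET.SubElement(case, "failure", {"message": message})
    return suite


suite = build_testsuite("test_example", [
    ("passes", True, ""),
    ("fails", False, "expected: 1, actual: 2"),
])
xml_text = ET.tostring(suite, encoding="unicode")
```

Writing one such document per test binary into a directory would give a report consumer a set of files to pick up.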

I have tested this runner on Mettle's own test suite. With all of the unit test binaries matched by test/test_*, I ran:

> python ~/mettle/scripts/mettle_junit.py test/test_*

and the set of XML reports appeared in the subdirectory 'reports'. I further processed those reports with

> allure serve reports

which created the web page shown in the screenshot (selection_003).

Allure is standalone software from https://github.com/allure-framework/allure2, but it could also be used as a Jenkins plugin or with CircleCI (neither tested yet). I do not know why Mettle's unit tests are failing on my box, but it makes for an interesting display.

Extending Mettle's wire protocol would support richer reports. In particular, JUnit supports reporting on each individual assertion, which Mettle also knows how to do, but that information is not yet transported over the wire protocol.

Because JUnit XML and its associated tooling do not support nested test suites, and Mettle does, I have just compressed the test path into one string joined by ..
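The flattening can be sketched in a few lines. The helper name `flatten_test_name` and the default separator `"."` are assumptions for illustration; the exact separator used by the runner is not confirmed here.

```python
def flatten_test_name(suites, test, sep="."):
    """Collapse a nested suite path plus a test name into one string.

    `suites` is the list of enclosing suite names, outermost first.
    The separator default is an assumption, not the runner's setting.
    """
    return sep.join([*suites, test])


flat = flatten_test_name(["outer suite", "inner suite"], "my test")
```

One drawback of this approach is that a suite or test name that itself contains the separator makes the flattened name ambiguous.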

jimporter commented 6 years ago

This looks really interesting! Overall, if I were to merge something like this, I'd probably want it to be either a) a new kind of logger that the main mettle driver can choose or b) an entirely separate project that's "blessed" by this repo. However, I can understand why you wouldn't want to get into the XML-generation game in C++; it's part of why I never put any effort into making a JUnit logger in the first place! Nevertheless, one of my goals with this project is for the code to be usable as a library for people to easily write their own test drivers (like the mettle driver) as well as their own kinds of test frameworks (e.g. the caliber project, which does compilation tests).

Some other random thoughts: I'm still not 100% sure if I want to keep bencode as the wire protocol for mettle or if I want to go with something like JSON (unlikely, since it's much harder to detect boundaries between JSON messages in general; though we'd probably be ok since the root type would always be an object) or MessagePack (which seems like overkill, to be honest). As you found with the Python support, bencode is somewhat obscure, so good implementations aren't a guarantee, especially when you consider that mettle is sending a sequence of many bencode messages across the wire. I wouldn't be surprised if that was why some of the Python bencode packages choked on mettle's data.
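To make the boundary point concrete: every bencode value is self-delimiting, so a decoder that reports where each value ends can walk a buffer of back-to-back messages. Below is a minimal sketch of such a decoder. It is not the code used by mettle or by any particular Python package, and the event names in the sample stream are illustrative only.

```python
def decode(data, pos=0):
    """Decode one bencode value at data[pos]; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":                      # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":                      # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = decode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":                      # dict: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = decode(data, pos)
            result[key], pos = decode(data, pos)
        return result, pos + 1
    if lead.isdigit():                    # byte string: <length>:<bytes>
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"invalid bencode at offset {pos}")


def decode_stream(data):
    """Yield each top-level value from a buffer of concatenated messages."""
    pos = 0
    while pos < len(data):
        value, pos = decode(data, pos)
        yield value


# Two messages sent back to back, with no separator between them.
stream = b"d5:event12:started_teste" + b"d5:event11:passed_teste"
messages = list(decode_stream(stream))
```

A decoder that only exposes a "decode the whole buffer" entry point, without returning the end position, would reject or mishandle such a stream, which may be one way a package could choke on a sequence of messages.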

Extending Mettle's wire protocol would support richer reports. In particular, JUnit supports reporting on each individual assertion, which Mettle also knows how to do, but that information is not yet transported over the wire protocol.

I'm not really sure what this means... are you suggesting that we'd emit events for expectations that succeed, up to the failing expectation (if it exists)? If so, I don't think there's plumbing in mettle for that; expectation failures are reported entirely through thrown exceptions, so there'd be no way (without some redesign) to report a passing expectation.

As for why some tests are failing, it's entirely possible there are some bugs hidden in the subprocess code. I really need to go over it again with a fine-toothed comb. I've been hoping to get back to working on mettle, but still have a few other projects I'm trying to finish up first.

12AT7 commented 6 years ago

I'm not really sure what this means... are you suggesting that we'd emit events for expectations that succeed, up to the failing expectation (if it exists)? If so, I don't think there's plumbing in mettle for that; expectation failures are reported entirely through thrown exceptions, so there'd be no way (without some redesign) to report a passing expectation.

So I was trying to think through an example of what this would mean, but convinced myself that maybe there is nothing more to do for passing tests. The JUnit XML format does not accept much information about passing tests. It can count the assertions, but that is the only feature I could not support as is, and for most of my unit testing I only have one assertion per test anyway. So counting the number of times expect() is called and putting that in an additional field of the "passed test" event could be consumed by JUnit, but would be only a minor enhancement.

On failing tests, there is a "type" attribute supported in addition to the "message" attribute. Right now, the "message" attribute is filled with the "message" field of the "failed test" bencode event, but I have no content for the "type" attribute. Because Mettle's matchers can print themselves, that string could come across the wire protocol also, alongside the "message", and that would become the JUnit "type" attribute. Or maybe I don't have this model quite right, but there is definitely a type field there that we could use to convey more information about a failing test than just the message alone.
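A sketch of how those two extras could land in the XML, assuming the wire protocol carried them. The function `make_testcase`, its parameters, and the idea of using the matcher description as the "type" are hypothetical; the "assertions" attribute follows the commonly used JUnit XML schema rather than anything mettle emits today.

```python
import xml.etree.ElementTree as ET


def make_testcase(name, assertions=None, failure=None):
    """Build a <testcase> element.

    `assertions`: optional count of expect() calls (hypothetical field).
    `failure`: optional (message, matcher_description) pair; the second
    item stands in for the JUnit "type" attribute (an assumption).
    """
    attrs = {"name": name}
    if assertions is not None:
        attrs["assertions"] = str(assertions)
    case = ET.Element("testcase", attrs)
    if failure is not None:
        message, matcher_description = failure
        ET.SubElement(case, "failure",
                      {"message": message, "type": matcher_description})
    return case


passed = make_testcase("adds numbers", assertions=3)
failed = make_testcase("compares values",
                       failure=("expected: 1, actual: 2", "equal_to(1)"))
```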

JUnit distinguishes between "failures" (failed assertion) and "errors" (which is apparently a test that did not run for some other reason). I am trying to figure out if the Mettle "failed_file" event is the same thing. I am not sure I have encountered that situation using Mettle yet, so I don't know what it does.

12AT7 commented 6 years ago

However, I can understand why you wouldn't want to get into the XML-generation game in C++; it's part of why I never put any effort into making a JUnit logger in the first place!

So I chose to do it this way in Python because I only had to figure out the wire protocol, which was way easier than figuring out how to hack in a new log format class. A built-in capability would probably be nicer, but way more work and riskier for an external contributor to accomplish. There are nice C++ classes for writing XML (like https://github.com/nlohmann/json which is great) so that's not much of a problem.

If your goal is to support easy-to-write custom drivers though, many coders will prefer to write those in Python or another scripting language, for the exact same reasons that I did. Extending Mettle in C++ is just too much work to figure out, when Python does the job just as well. The wire protocol is language agnostic and is a nice way to get out of C++ for parts of a development workflow that do not demand the performance and type safety of C++. If I had written Mettle, then libmettle would probably be exactly the same, but mettle may well have been Python. So it may be that JUnit XML is more appropriately implemented in C++, but the ability to write a custom driver in Python is valuable too.

With just a simple Python script, I have enough JUnit functionality that I can finally take my tests to production with a nice CI dashboard 8-)

12AT7 commented 6 years ago

an entirely separate project that's "blessed" by this repo

And this is basically the definition of a fork. My fork of Mettle for now will probably just track your upstream master branch exactly, but additionally offer this JUnit extension, and a CMake build system, which are totally orthogonal to what you are doing.

jimporter commented 6 years ago

JUnit distinguishes between "failures" (failed assertion) and "errors" (which is apparently a test that did not run for some other reason). I am trying to figure out if the Mettle "failed_file" event is the same thing. I am not sure I have encountered that situation using Mettle yet, so I don't know what it does.

Yeah, "failed_file" would fall under the "error" category (it's basically a catch-all for any error that happens before a test can even start). However, more generally, errors would include any uncaught exception that's not from expect(). There's handling for that here, but they just get counted as regular failures. I've considered distinguishing the two, but I'm not 100% sure it'd be useful to people.
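One way a JUnit-writing driver could apply that distinction is a small lookup from event name to XML element. This is a sketch only: "failed_file" is named in the discussion, while the spelling "failed_test" and the helper `add_outcome` are assumptions.

```python
import xml.etree.ElementTree as ET

# Event name -> JUnit child element. "failed_test" is an assumed spelling.
OUTCOME_TAG = {
    "failed_test": "failure",  # a failed expectation
    "failed_file": "error",    # something went wrong before a test could start
}


def add_outcome(case, event_name, message):
    """Attach <failure> or <error> to `case`; return None for other events."""
    tag = OUTCOME_TAG.get(event_name)
    if tag is None:
        return None
    return ET.SubElement(case, tag, {"message": message})


case_a = ET.Element("testcase", {"name": "compares values"})
add_outcome(case_a, "failed_test", "expected: 1, actual: 2")

# A file-level problem has no test of its own, so this sketch hangs the
# error on a placeholder testcase named after the file.
case_b = ET.Element("testcase", {"name": "test/test_example"})
add_outcome(case_b, "failed_file", "exited with status 1")
```

Uncaught exceptions that do not come from expect() would need their own event or flag before they could be routed to "error" in the same way.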

So I chose to do it this way in Python because I only had to figure out the wire protocol, which was way easier than figuring out how to hack in a new log format class. ... If your goal is to support easy-to-write custom drivers though, many coders will prefer to write those in Python or another scripting language, for the exact same reasons that I did.

Custom drivers (and test types!) in other languages are one of the things I wanted to allow for, especially since I probably won't be able to support every kind of output in mettle proper. Still, something like JUnit is probably common enough that it should be built-in (which, in my opinion, means I should write a JUnit logger in C++ so people can just pass it as a flag).

There are nice C++ classes for writing XML (like https://github.com/nlohmann/json which is great) so that's not much of a problem.

My main issue is that package installation for C++ is at least moderately painful (this is part of why mettle bundles debug stringification, expectations, and a suite/test runner into one repo instead of having several independent repos with some glue code), so I don't want to force people to install too much unless I figure out how to ease that pain. I suppose XML is easy enough to write though, so I could just make a simple class for spitting XML out and put it inside mettle somewhere.

And this is basically the definition of a fork.

I was thinking more like an optional add-on rather than a fork. For example, I could imagine a mettle-junit package hosted on PyPI that people can install and use with an already-installed version of mettle.

Another option would be to provide this (or something like it) as an example so that everyone can see how to write their own test runners without having to delve through the mettle source!

12AT7 commented 6 years ago

Jim, should I go ahead and close this PR? I think what we can do in the near term is just for me to clean up my Mettle fork, and anybody who wants to study the example of using the wire protocol can peek at my fork for ideas. In the meantime, that is what I can use for getting my work done.

jimporter commented 6 years ago

Yeah, let's close this, although I think there's a lot of value in this code that I'd like to incorporate in some way into mettle. In particular, I'd like to add a JUnit-style logger as well as provide documentation/examples of how to work with the wire format for mettle.

On that note, would you find it useful if I made a Python package that parses (and emits) the mettle wire format? That way, anyone would be able to do something like this without having to figure out the format or deal with low-quality Python bencode packages.

Thanks again for working on this!

12AT7 commented 6 years ago

Jim, it is true that there are too many poor bencode implementations in Python, but better_bencode is so far working just fine with Mettle. I don't really see a need to write yet another bencode Python module, seeing as this existing one is doing the job. I am pretty happy with the way mettle_junit.py does the dispatch already, using better_bencode. I think there are more interesting things for both of us to work on.

I do, however, think that elevating mettle_junit.py to some kind of teaching example would be pretty useful. Not only that, but having the ability to build CI dashboards from Mettle, right now, is a pretty big deal for getting wider adoption for larger C++ projects. I needed this feature for my work, I needed it now, and now I have it. Without it, users might just use Catch or Google Test (both inferior to Mettle on the C++ side) just because of the better CI support. I know that you have plans for doing it "right", but "right now" is also useful. By the way, I have mettle_junit.py working great with both a CircleCI-2.0 project (great service) and also a Bamboo CI project (terrible software). My problems regarding this issue are essentially solved.

12AT7 commented 6 years ago

Jim, the next issue that would affect me the most, and that I cannot really work out by myself, is #24, which involves adding custom functions to suite_builder<>. This is the new feature I am rooting for the most.

jimporter commented 6 years ago

@JohnGalbraith My plan was to use better_bencode (assuming it actually works well), but to make a higher-level API so that other people wouldn't have to figure out the particulars of how to run test files or how the wire protocol is structured. In other words, I'd provide something like run_test_file and JUnitEncoder.__call__ (plus some code to emit events for people who want to create a new kind of test to run under mettle). This isn't a ton of code, but I think run_test_file is the most non-obvious part, and a Python package would make it easier for me to change the implementation details of this stuff without breaking things for people doing stuff like you're doing.
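For readers wondering why run_test_file is the non-obvious part, here is a minimal POSIX-only sketch of the pipe handling involved. It is not the proposed package's API: the signature, the exact spelling of the argument ("--output-fd" followed by the descriptor number as a separate argument), and the choice to return raw bytes are all assumptions.

```python
import os
import subprocess


def run_test_file(command):
    """Run `command` with an extra --output-fd argument (assumed syntax)
    and return the raw bytes the child writes to that descriptor.

    POSIX only: relies on os.pipe() and Popen's pass_fds.
    """
    read_fd, write_fd = os.pipe()
    try:
        proc = subprocess.Popen(
            [*command, "--output-fd", str(write_fd)],
            pass_fds=(write_fd,),  # keep this descriptor open in the child
        )
    except BaseException:
        os.close(read_fd)
        raise
    finally:
        # The parent must close its copy of the write end; otherwise the
        # read below never sees end-of-file.
        os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        data = pipe.read()
    proc.wait()
    return data
```

The returned bytes would then be handed to a bencode decoder and on to whatever encoder produces the report. Reading everything before decoding keeps the sketch short; a real driver would more likely decode incrementally so that results appear while the tests run.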