Test Coverage Metrics

pepperbob commented 8 years ago

We have the rdfunit-junit integration up and running, which gives us a good overview of failing test cases, especially in conjunction with an IDE and/or CI server. If a test is "red", we can trust that something broke.

However, the issue with "green" tests is that we do not actually know why they are green: the data could be valid according to the test case, or there could be no data to validate at all. The latter would decrease the significance of that test (at least in the given context).

Furthermore, we're missing metrics on how much of the input data is actually covered by the test cases. The TestCoverageEvaluator looks usable for this, though it needs some elaboration: it is currently not clear what input it expects.

Request:

Can we figure out on a per-test-case basis whether there is data to be tested (before/after the test is run)? We could use the "test-skipped" notification of JUnit to provide an overview of how many tests are not testing anything.
Could the API of TestCoverageEvaluator be elaborated?
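
To make the first request concrete, here is a minimal sketch of the classification it implies. It assumes each test case can report how many resources it applies to (its prevalence) alongside its violation count. The names `GreenTestClassifier`, `Outcome`, and `classify`, and both count parameters, are hypothetical and not part of the RDFUnit or JUnit API.

```java
// Hypothetical sketch: none of these names are RDFUnit or JUnit API.
final class GreenTestClassifier {

    enum Outcome { PASSED, FAILED, SKIPPED }

    // prevalence: number of resources the test case applies to (assumed input)
    // violations: number of resources violating the test case (assumed input)
    static Outcome classify(long prevalence, long violations) {
        if (prevalence < 0 || violations < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        if (violations > 0) {
            return Outcome.FAILED;   // "red": something broke
        }
        if (prevalence == 0) {
            return Outcome.SKIPPED;  // "green" only because there was nothing to validate
        }
        return Outcome.PASSED;       // "green" with data that was actually checked
    }

    public static void main(String[] args) {
        System.out.println(classify(120, 0));
        System.out.println(classify(0, 0));
        System.out.println(classify(120, 3));
    }
}
```

In a JUnit 4 runner, the SKIPPED outcome could then be reported through the ignored or assumption-failed notification instead of a plain success, so that empty tests show up separately in the IDE or CI overview.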

jimkont commented 8 years ago

TestCoverageEvaluator was created at the very beginning of the project and has not been used since. I updated the code a bit so that it no longer throws errors.

It is still not working correctly, but it now shows some (wrong) output when you pass -c as a CLI parameter:

[INFO  TestCoverageEvaluator] Fdom Coverage: 0.0
[INFO  TestCoverageEvaluator] fRang Coverage: 0.0
[INFO  TestCoverageEvaluator] fDep Coverage: 0.0
[INFO  TestCoverageEvaluator] fCard Coverage: 0.0
[INFO  TestCoverageEvaluator] fMem Coverage: 0.0
[INFO  TestCoverageEvaluator] fCDep Coverage: 0.0

If this is updated to provide the correct numbers, it could probably handle your use case. Each metric measures a specific kind of test coverage, as described on pages 3-4 of http://svn.aksw.org/papers/2014/WWW_Databugger/public.pdf

The lower the metric numbers, the fewer cases are actually tested in the input source.

This class was more of a hack to generate Table 4 in the paper. What it does (or what I remember it doing) is:

relate RDFUnit patterns to each metric
calculate class & property statistics for the input source
exploit some RDFUnit pattern hacks to get the classes / properties / patterns associated with each test case and calculate the metrics
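
The last two steps above can be pictured with a rough sketch. This is a simplified illustration, not the TestCoverageEvaluator code and not the exact formulas from the paper. It assumes that a metric's coverage is the share of property occurrences in the input whose property is referenced by at least one test case associated with that metric. The class name, method name, and sample counts are hypothetical.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a frequency-weighted coverage ratio for a single metric.
final class CoverageSketch {

    // propertyCounts: occurrences of each property in the input source (assumed statistics)
    // testedProperties: properties referenced by test cases tied to this metric (assumed)
    static double coverage(Map<String, Long> propertyCounts, Set<String> testedProperties) {
        long total = 0;
        long covered = 0;
        for (Map.Entry<String, Long> entry : propertyCounts.entrySet()) {
            total += entry.getValue();
            if (testedProperties.contains(entry.getKey())) {
                covered += entry.getValue();
            }
        }
        // An empty input source yields 0.0 instead of dividing by zero.
        return total == 0 ? 0.0 : (double) covered / total;
    }

    public static void main(String[] args) {
        Map<String, Long> counts = Map.of("foaf:name", 60L, "foaf:knows", 30L, "dc:title", 10L);
        Set<String> testedForDomain = Set.of("foaf:name", "dc:title");
        System.out.println("Domain coverage: " + coverage(counts, testedForDomain));
    }
}
```

Under this assumption, a value near 1.0 would mean that most property occurrences in the input are touched by some test case for that metric, and a value near 0.0 would mean that almost none are.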

It needs some work to get this into good shape and make it usable. Let me know if working in this direction covers your goal.

jimkont commented 8 years ago

Note that, ideally, the metrics should be identified by doing pattern identification inside the SPARQL queries. This would also work on non-pattern-based test cases, or on pattern-based test cases where the pattern is not associated with coverage metrics. However, the current pattern-based approach handles most cases easily.

AKSW / RDFUnit

Test Coverage Metrics #57