Coverage Report: Various keywords not being marked correctly in coverage

nickbattle commented 10 years ago

Several keywords/operators are not being marked in green even though they are clearly executed. Feel free to add those you discover to this issue. Here are a few...

The opening brace of a set enumeration
The RHS of an assignment statement
An "and" operator
An open bracket for a block statement
RHS of a ">" operator

Screen shot attached.

nickbattle commented 10 years ago

coverage

ldcouto commented 9 years ago

Arise, ancient bug report, arise!

@nickbattle , we have some students working on the code coverage plug-in. They can maybe take a look at this as part of their work but they might need guidance. Do you know what the cause of this is?

nickbattle commented 9 years ago

The key thing to understand is that coverage is based on the LexLocation type and its hit count, and start/end marks. The hit count is usually set when the Breakpoint check method is called. And that in turn is called when elements of the AST are encountered when the ExpressionEvaluator (or StatementEvaluator etc) is walking over the tree.

The problems occur when syntactic elements exist which really ought to be marked on the coverage output, but which don't have LexLocations that are recorded in the AST or which aren't "executed" in the sense that they don't have Breakpoints at their location. The fix, in some cases, is to find the LexLocation and call its "hit()" method directly when the execution passes by. But I think in other cases, the AST simply does not record the LexLocation. I seem to remember the closing brace of a set enumeration was like this for example - the token is parsed, and the location of the whole expression is set to the location of the opening brace. But no one records the position of the closing brace. Therefore it always shows as "red".

There are some more subtle cases, where we deliberately don't want to trap into a breakpoint. For example if you have "left and right", it is a real pain to have a breakpoint on the "and". That would mean that single stepping would jump into the middle of the expression (think of a long list of "and" clauses over several pages), then it would leap back to the "left" expression, then it would leap forward to the "right" expression. It's must easier to debug if you only break on the left and right. But naively, that means that the "and" LexLocation is not hit. So the execution of caseAAndBooleanBinaryExp explicitly calls hit() but has a commented-out call to check for a breakpoint.

So I think it's just a case of tediously checking for these cases when we spot them and adding calls to hit(). In the case where there is a missing LexLocation to mark, I suppose we would extend the AST to provide that extra location somehow.

ldcouto commented 9 years ago

Hmm...

Doesn't the entire AndExp have a lexlocation? It should have a start and end point and we should just paint everything in the middle. You'd have to ensure that both branches were hit but wouldn't that work?

A lot of the stuff I've seen for code coverage seems to paint bigger units than lex locations (like statements, conditions, etc.). Do you know what the inspiration for location based marking was?

--ldc

nickbattle commented 9 years ago

Yes, the AndExp has a LexLocation, and its start-end span the "and" characters. So we want to mark it as hit when it is executed. But we DON'T want to set a breakpoint there. So we can't use the usual Breakpoint.check() method to hit the location. We have to hit it directly.

Unless it's changed since VDMJ days, the coverage only ever deals with LexLocations. Of course, if you cover all of the LexLocations of the tokens in (say) a block, it might look like an entire block is covered. But I think it's really just a set of individual locations.

It may be that this has been changed since VDMJ though... not sure.

ldcouto commented 9 years ago

I don't think it has ever been changed. I see that there are some cases of locations being hit outside the breakpoint check -- for example: the assignments in the atomic statement. I suppose calling hit() where it's missing is the right way to go but it's not much of a student project to just go through the interpreter and tweak a bunch of cases :P

I guess the thing we need to consider as well is: if part of a composite expression is hit, do we count the entire expression as hit?

In the AndExp example, if we only ever hit the left subnode, does the whole thing still count as hit? It'm inclined to think so and the second phase of the coverage reporting would then work out if it should color the thing green or yellow or something else, based on whether all sub-nodes were hit or not. Does that make sense?

nickbattle commented 9 years ago

I agree that looking for missing hit() calls is not much of a project.

I'm not sure about marking entire expressions as "hit" if they are only partially executed. I can see it makes some sort of sense for simple one-line expressions, but what if there were very large "let clauses" all or-ed together in a postcondition? In that case, I think I would want to know that I had "hit" (ie. actually tested) every clause separately rather than knowing that one of them (perhaps only ever the first one) had been tested.

A more complex project would be path coverage (as opposed to simple statement/expression coverage). You have to watch out for combinatorial explosions, but it's an interesting area.

ldcouto commented 9 years ago

I think there are two sides to any of this coverage stuff. One is the recording of the hit counts during execution. Efficiency is the main concern here since we don't want to make interpretation slow. We should only record hit counts that we absolutely need. Anything that can be derived from existing counts, we can just skip. So, could we derive the coverage of an AndExp from the coverage of its branches?

Once we have the hit counts, they can be combined with the AST to do interesting things. This project will probably look at condition coverage and stuff like MC/DC.

I think the students might be willing to fix a couple of hit count issues here and there but it really cannot be the focus.

nickbattle commented 9 years ago

When I put in the breakpoint system, I was worried that checking at every node to see whether there is a breakpoint active would be very slow. But in fact it isn't that bad - and that's a lot more complicated that calling a hit() method just to increment a counter (may even be inlined). If we're not careful, we could spend more CPU time deciding whether to count something than the naive counting would anyway!

Not sure what you mean by MC/DC.

ldcouto commented 9 years ago

Are you suggesting we start moving hit() calls out of the breakpoint()?

When I say that we need to work out whether a composite construct is covered, that's a post-execution analysis. So ideally you execute the model (with coverage enabled) and afterwards when you open the coverage file (or click some calculate coverage button), this stuff is all worked out.

MC/DC is a code coverage criteria used in avionics and such. It tries to ensure that every possible condition in the code is both tested and is impacting the control flow of the program.

http://en.wikipedia.org/wiki/Modified_condition/decision_coverage

On the subject, if you've ever used a code coverage tool that does a good job reporting data to the user, let me know. We're still in the idea gathering phase.

nickbattle commented 9 years ago

No no, in general it's convenient to have the hit() call in the Breakpoint.check() method, because that method gets called for almost every node that is executable. It's just that in some cases we don't want to check for a breakpoint on a node (that was the discussion about a long chain of "and" clauses), but we do still want to mark the nodes has being hit. So in cases like that, hit() is called outside the breakpoint check (it replaces it).

The only serious code coverage tool I've used is Clover, integrated with Maven builds. That produces HTML reports of the coverage as well as a load of statistics.

ldcouto commented 9 years ago

Right, I think we're on the same page. There are definitely some places where hit() should be called.

But when you're doing the painting of the coverage by combining the hit count data with the AST, you can paint composite nodes if all their sub-nodes have been hit. So, if e1 and e2 have been hit you can paint the and in e1 and e2 even if it has no hit counts.

I think it's preferable to make this kind of calculation after interpretation than during.

nickbattle commented 9 years ago

OK, I see your point.

nlmave commented 9 years ago

For C/C++ we have used Bullseye quite successfully at Chess; it has great reporting features, see http://www.bullseye.com/measurementTechnique.html

On 24-3-2015 15:07, Luis Diogo Couto wrote:

Are you suggesting we start moving |hit()| calls out of the |breakpoint()|?

When I say that we need to work out whether a composite construct is covered, that's a post-execution analysis. So ideally you execute the model (with coverage enabled) and afterwards when you open the coverage file (or click some |calculate coverage| button), this stuff is all worked out.

MC/DC is a code coverage criteria used in avionics and such. It tries to ensure that every possible condition in the code is both tested and is impacting the control flow of the program.

http://en.wikipedia.org/wiki/Modified_condition/decision_coverage

On the subject, if you've ever used a code coverage tool that does a good job reporting data to the user, let me know. We're still in the idea gathering phase.

— Reply to this email directly or view it on GitHub https://github.com/overturetool/overture/issues/273#issuecomment-85509605.

peterwvj commented 9 years ago

Luis, How developed is the new code coverage plugin? Could it be integrated into Overture and possible replace the existing one (eventually)?

ldcouto commented 9 years ago

I think it's ok.

But the metric it reports is different so I'm not sure it can exactly replace the current one.

On 19 September 2015 at 10:25, Peter Tran-Jørgensen < notifications@github.com> wrote:

Luis, How developed is the new code coverage plugin? Could it be integrated into Overture and possible replace the existing one?

— Reply to this email directly or view it on GitHub https://github.com/overturetool/overture/issues/273#issuecomment-141635661 .

overturetool / overture

Coverage Report: Various keywords not being marked correctly in coverage #273