aantron / bisect_ppx

Code coverage for OCaml and ReScript
http://aantron.github.io/bisect_ppx/demo/
MIT License

Report coverage on .mly #378

Closed tiuno closed 2 years ago

tiuno commented 3 years ago

Currently, bisect_ppx can be used to measure coverage on a generated parser. While this is already helpful, I wonder whether it would be possible to inspect coverage on the .mly source file itself. That could help test more corner cases. Matching the generated parser to the source .mly by hand is hard, and some lines of the generated parser are not a priority when it comes to testing.

aantron commented 3 years ago

Are you looking to see only the code in the preamble/footer and the code in the grammar cases, or are you looking for coverage of the actual production rules?

tiuno commented 3 years ago

The latter, the point is checking which rules have been covered and which haven't.

aantron commented 3 years ago

Off the top of my head, I think the best we could do is provide coverage for the actions in { }, and try to display it in the original .mly file. However, there are some challenges, and I haven't yet directly looked at .mly output (are you using ocamlyacc or Menhir?). AFAIK, it's hard for a tool running on the .ml file to figure out which code in the .ml file came from preamble and actions, and which is the implementation of the actual parser. Also, for code corresponding to rules (is it code or tables?), AFAIK we don't have locations anymore.

tiuno commented 3 years ago

> are you using ocamlyacc or Menhir?

I'm using Menhir

I guess that coverage for actions can give good insight. That could be a first step, to see how useful it can be.

aantron commented 3 years ago

Could you create an example project with your setup, so I can examine the output and see what can be done with it?

tiuno commented 3 years ago

Here you go: mlycoverage; let me know if you want me to remove dependencies.

aantron commented 3 years ago

Thanks, I've looked at the output.

The rules themselves seem difficult to do coverage for. It seems difficult to recover the rules from the .ml output, and the .ml output itself might change across Menhir versions, creating a major potential maintenance problem for Bisect.

Coverage for actions needs two things:

  1. Bisect needs to parse and follow # 123 "foo.mly" directives, know to load the .mly source file, and to include it in the coverage report.
  2. Bisect needs to know how to force a point onto the in-edge of each action (like in cases of pattern matching). Since Bisect is given the .ml file, it has to guess from the .ml file which portions are actions. It looks like all actions are prefixed with # 123 "foo.mly", after a previous section # 456 "foo.ml", so maybe that can be used to find them. However, we've deliberately avoided loading or parsing source files in the instrumenter (the PPX). Perhaps it is possible to recover the actions by analyzing the locations stored in the OCaml AST that the instrumenter sees. Another, longer-term, but most maintainable option might be to ask the Menhir project to emit AST annotations that identify actions, and we can simply detect those annotations.
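To make the two ideas above concrete, here is a minimal OCaml sketch using only the standard library. It is not Bisect's code, and the names parse_line_directive, is_from_mly and mly_directive_targets are hypothetical. It assumes directives look exactly like # 123 "foo.mly" and that grammar files end in .mly. Note that directives pointing into the .mly file also precede preamble and footer code, so this heuristic over-approximates the set of actions.

```ocaml
(* Parse a line directive of the form:  # 123 "foo.mly"
   Returns Some (line, file) on success, None for any other line. *)
let parse_line_directive (s : string) : (int * string) option =
  try Scanf.sscanf s " # %d %S" (fun line file -> Some (line, file))
  with Scanf.Scan_failure _ | Failure _ | End_of_file -> None

(* Assumption: a position belongs to the grammar source when its file
   name ends in ".mly". A PPX would apply a check like this to the
   locations stored in the AST, without reading any source file. *)
let is_from_mly (pos : Lexing.position) : bool =
  Filename.check_suffix pos.Lexing.pos_fname ".mly"

(* Given the lines of a generated .ml file, collect the (file, line)
   targets of every directive that points back into a .mly file. *)
let mly_directive_targets (lines : string list) : (string * int) list =
  List.filter_map
    (fun l ->
      match parse_line_directive l with
      | Some (line, file) when Filename.check_suffix file ".mly" ->
        Some (file, line)
      | _ -> None)
    lines
```

The first function corresponds to the report side (item 1), which has to follow directives back to the .mly file. The position check corresponds to the instrumenter side (item 2), where only AST locations are available.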

Empirically, I won't have time to properly do this in the coming months, but I would be very glad to help you (or anyone else) make a PR for this.

aantron commented 2 years ago

Closing this issue for now, as I won't have time to get familiar with Menhir in the near future, and there is no other activity. We can resume this later, though.