PLC-lang / rusty

Structured Text Parser and LLVM Frontend
GNU Lesser General Public License v3.0
223 stars 53 forks source link

Integrate LLVM Code Coverage Instrumentation #1017

Open corbanvilla opened 1 year ago

corbanvilla commented 1 year ago

Is your feature request related to a problem? Please describe. Add LLVM Code Coverage support to Rusty (for use with tools like llvm-cov), so that instrumentation is added to track program execution.

Another developer @HaithemLamri and I are interested in making this happen. We understand that this feature would be pretty complex to implement and not immediately useful for most users, so we’re mostly looking for higher-level support as to how and where this feature could best be integrated in Rusty.

From preliminary research, there seem to be two options for going about this:

  1. Implement the coverage spec from scratch, using LLVM primitive datatypes:
    1. Less future-proof, as the implementation will need to be updated as LLVM updates their code coverage specification (though this historically has not changed very frequently).
    2. This appears to be immediately possibly with only Inkwell primitives, without requiring changes to Inkwell or llvm_sys.
  2. Integrate using LLVM’s CoverageMapping classes:
    1. This would really simplify the code coverage data type generation, since it abstracts a lot of the generation.
    2. It appears that the Rust compiler team tied into these classes for Rust’s integration of LLVM code coverage.
    3. The primary issue with this is that these classes/modules don’t appear to be implemented in Inkwell or in llvm_sys (which Inkwell calls), and would likely have to be integrated into both underlying libraries, before being added to Rusty. This seems like a major challenge.

From an initial assessment, the first option seems to be the quickest and easiest to implement barebones functionality… though the second would likely be better for maintainability in the long run.

**Additional Resources**

ghaith commented 1 year ago

This looks like a great idea. the second strategy you suggest by using the LLVM classes might not be out of the question. Inkwell exposes all the LLVM types using either a LLVMReference trait and the internal-getters feature or a as_mut_ptr method on elements (depending on the version, not sure which version rusty is on, might be the LLVMReference one still, but it might be worth updating) https://github.com/TheDan64/inkwell/pull/392

Also note that Inkwell and llvm-sys only wrap the llvm C api, if something is not in the C api you will need to get a raw handle on the specific objects and branch into the CPP classes.

I'd be happy to review some pull requests with this. I would suggest smaller PRs with that don't break functionality might be the easiest to review. A bigger PR might take a while.

corbanvilla commented 1 year ago

@ghaith thank you for pointing that out about the as_mut_ptr methods.... I didn't realize that! I'll take a look and see if that makes it possible to access everything we would need.

I like the idea of using the CoverageMappingGen classes if possible, since the LLVM IR that clang generates has some more complexity than what's in the specification. It would likely avoid some headaches down the road.