verilator / verilator

Verilator open-source SystemVerilog simulator and lint system
https://verilator.org
GNU Lesser General Public License v3.0
2.46k stars 585 forks source link

Combinational/Expression Coverage #4677

Open rsammelson opened 10 months ago

rsammelson commented 10 months ago

I am part of a research group working on fuzzing Verilog designs. To this end we need a combinational coverage metric like covered (as well as other commercial simulators). The issue with covered is that it runs more than 30 times slower than Verilator (1-2 minutes vs 2-3 seconds), and the majority of the time is spent re-running parts of the simulation to find intermediate values. For effective fuzzing, we need the coverage calculation to be fast (preferably around 5 seconds) in order to do many mutations.

Because (so far as I know) calculating this metric requires intermediate values that aren't in the VCD file output, but would be known inside the simulator, I think that Verilator is the right place to implement this metric.

I would be happy to assist with the development, but as I've never worked on Verilator I will need a fair amount of help figuring out how to go about adding this, if it is a desired feature.

wsnyder commented 10 months ago

Verilator's own coverage runs much much faster, and seems to have most of this, minus FSM which would be a straightforward add. I'd suggest benchmark that.

Perhaps you don't even need coverage at all nets but only at storage nodes?

Do you need full 64 bit counts? Perhaps you are only interested in the first few counts so can use an e.g. uint8 instead?

rsammelson commented 10 months ago

Verilator's own coverage runs much much faster, and seems to have most of this, minus FSM which would be a straightforward add. I'd suggest benchmark that.

I didn't mention FSM because it is simple enough to efficiently implement as an external program, but I do not think that Verilator currently has combinational coverage. Is this incorrect?

Perhaps you don't even need coverage at all nets but only at storage nodes?

I believe the coverage at non-storage nodes is the problem on why we can't easily get the information from the VCD output.

Do you need full 64 bit counts? Perhaps you are only interested in the first few counts so can use an e.g. uint8 instead?

Technically we only need a bitmask of what was hit and what was not, but it might be easier to implement a u8 counter instead.

wsnyder commented 10 months ago

I do not think that Verilator currently has combinational coverage. Is this incorrect?

True, but not conceptually complicated to implement. If you want good performance you won't want I/O of trace data between multiple programs.

Technically we only need a bitmask of what was hit and what was not, but it might be easier to implement a u8 counter instead.

If so it would be straightforward to use a single bit, with an OR to set the bit, or atomic bit-set between threads. It should be faster than a uint8, as the cache pressure will be better. If you add a optimization pass to smartly allocate the bits, even more so.

rsammelson commented 10 months ago

Alright that's good to know; do you have any advice for how to approach instrumenting every sub-expression? I'm not entirely sure where this should be implemented. I've been looking at how toggle coverage is implemented and I'm a little confused about how it works.

For example a simple expression I'm thinking about is something like &a[2:0] | ^b[1:0], which would result in three things to measure coverage for:

For something like a & b & c (all one bit), processing as (a & b) & c will result in:

So there are six options, but only four actually matter. If we process it as &{a, b, c}, then we will only get four options, so I'm also not sure what the best way to deal with this is. I think the easy way is to treat it as nested binary expressions, and this would be sufficient for what we are trying to do, but I wanted to see if you think there is a better approach.

wsnyder commented 10 months ago

I'd suggest look for research papers on how this has been implemented in the past, feel free to discuss here. The hard part is creating the equation for the coverage point. For performance you'd likely want to remove duplicates (e.g. two equations being covered which result in the same coverage) matching what is done currently with branch coverage (multiple coverage points using same counter.)

If you do want also FSM coverage I'd suggest implement that entirely first, as IMO it will be more popular to use, and I think it's easier to implement and you'll then have the understanding.

rsammelson commented 10 months ago

Alright thanks for the advice, I will probably work on FSM coverage first then.