Closed: walido78 closed this issue 2 years ago
Hi @walido78, this is interesting. I can certainly see the difference from your profiles. I set up a small test case (see below). The strange thing is that this test case shows the opposite: 64 wildcard bins in a single coverpoint are more efficient than 64 individual coverpoints. https://github.com/fvutils/pyvsc/blob/79e1b675fdf61f2c57f65d1a847f5de28c917936/ve/unit/test_coverage_wildcard_bins.py#L134-L233
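For readers unfamiliar with how a wildcard bin decides a hit, here is a minimal pure-Python sketch of the usual mask/value technique (this is an illustration of the concept, not PyVSC's actual implementation): a pattern such as "10x1" is compiled once into a (mask, value) pair, and each sample is then a single AND-and-compare.

```python
def wildcard_to_mask_value(pattern):
    """Compile a wildcard pattern like "10x1" into a (mask, value) pair.
    mask has a 1 wherever the pattern specifies a bit; value holds those
    specified bits. 'x' (or 'X'/'?') positions are don't-care."""
    mask = 0
    value = 0
    for ch in pattern:
        mask <<= 1
        value <<= 1
        if ch in "01":
            mask |= 1
            value |= int(ch)
        elif ch not in "xX?":
            raise ValueError(f"unexpected character {ch!r} in pattern")
    return mask, value

def wildcard_match(val, mask, value):
    # The sample hits the bin when all specified bits match.
    return (val & mask) == value

# "10x1" matches 0b1011 and 0b1001, but not 0b0011
m, v = wildcard_to_mask_value("10x1")
print(wildcard_match(0b1011, m, v))  # True
print(wildcard_match(0b0011, m, v))  # False
```

The per-sample cost is constant regardless of how many don't-care bits the pattern contains, which is why packing many wildcard bins into one coverpoint can stay cheap as long as the sampled value itself is fetched only once.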
Looking more deeply at the profiles, it appears that accessing the whole value of addr in cocotb takes more time than accessing individual bits of addr. Note, for example, the calls to binary.py:37(resolve) and binary.py:396(binstr) that only appear in the 'wildcard' version of the test.
I do also see an area where PyVSC could help. From the profile, it appears that PyVSC is fetching the coverpoint value each time it is sampled. It should be possible for PyVSC to cache the sampled value and reuse it. This would at least minimize the overhead imposed by cocotb fetching the full value of signals.
Hi @walido78, I've released a new version of PyVSC (0.7.6) that implements per-coverpoint caching of the coverpoint target-expression value. Previously, the target value would be computed each time a bin in the coverpoint was sampled. Now, the target value is sampled once per coverpoint regardless of how many bins are in that coverpoint. I'll be interested to see how the performance changes for you. Unless cocotb is >64x slower fetching the value of 'addr' vs fetching a single bit, your single-coverpoint version should be faster than the 64-coverpoint version now.
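The caching change described above can be sketched in a few lines of plain Python (a simplified model, not PyVSC's actual classes): the coverpoint evaluates its target expression once per sample() call and hands the cached value to every bin, instead of each bin re-fetching the target itself.

```python
class Coverpoint:
    """Sketch of per-coverpoint value caching: one target fetch per
    sample() call, shared across all bins in the coverpoint."""

    def __init__(self, target_fn, bins):
        self.target_fn = target_fn   # callable that reads the sampled signal
        self.bins = bins             # list of predicates: val -> bool
        self.hits = [0] * len(bins)

    def sample(self):
        val = self.target_fn()       # fetched once, regardless of bin count
        for i, bin_fn in enumerate(self.bins):
            if bin_fn(val):
                self.hits[i] += 1

# Demonstrate that 64 bins cost only one target fetch per sample.
fetches = {"n": 0}

def read_addr():
    fetches["n"] += 1
    return 0b101010          # stand-in for reading dut.addr via cocotb

cp = Coverpoint(read_addr,
                [lambda v, i=i: ((v >> i) & 1) == 1 for i in range(64)])
cp.sample()
print(fetches["n"])          # 1 fetch, not 64
```

With the pre-0.7.6 behavior, the fetch would sit inside each bin's predicate, so 64 bins would mean 64 fetches per sample, which is exactly the overhead the ">64x slower" comparison above refers to.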
Hi @mballance, I tried the new version and it's way faster than I expected! The profiling run takes 9 seconds instead of 27 seconds! Thank you very much
Cyc 0015219: INFO ************************************** Profiling ****************************************
12460588 function calls (12384113 primitive calls) in 9.334 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
65543/44550 0.085 0.000 8.114 0.000 scheduler.py:330(react)
44548 0.474 0.000 8.045 0.000 scheduler.py:355(_event_loop)
71689/65543 0.248 0.000 7.174 0.000 scheduler.py:744(schedule)
71689/65543 0.065 0.000 5.735 0.000 decorators.py:137(_advance)
71689/65543 0.036 0.000 5.671 0.000 outcomes.py:35(send)
71689/65543 0.035 0.000 5.637 0.000 {method 'send' of 'coroutine' objects}
7425 0.046 0.000 4.308 0.001 sampler.py:23(sampler)
22272 0.069 0.000 3.884 0.000 coverage.py:114(sample)
44544/22272 0.193 0.000 3.747 0.000 covergroup_model.py:64(sample)
103936 0.341 0.000 3.435 0.000 coverpoint_model.py:185(sample)
1395712 0.284 0.000 1.799 0.000 coverpoint_model.py:225(get_val)
51968 0.035 0.000 1.516 0.000 expr_ref_model.py:33(val)
950272 0.663 0.000 1.373 0.000 coverpoint_bin_single_wildcard_model.py:27(sample)
311808 0.246 0.000 1.212 0.000 coverpoint_bin_single_bag_model.py:75(sample)
118920 0.750 0.000 1.202 0.000 stagemanager.py:55(set_stage_name)
4609 0.007 0.000 0.994 0.000 decorators.py:257(_advance)
67849 0.167 0.000 0.991 0.000 scheduler.py:524(_resume_coro_upon)
4609 0.021 0.000 0.978 0.000 calc1_tb.py:105(test_cmds)
61440 0.102 0.000 0.934 0.000 handle.py:718(value)
Excellent, @walido78! Thanks for sharing the updated results!
Hello, as requested, I am opening an issue about the performance impact of using wildcard bins. Using this covergroup: @vsc.covergroup
After profiling, I get this performance:
And by changing the previous covergroup to use individual bins, which behave the same way but are split across separate coverpoints instead of a single one:
I get this performance: