Closed msmftc closed 3 years ago
Some of my program's indentation was lost during copy-and-paste. Sorry about that. The program is simple enough that you can infer the original indentation.
Hi @msmftc, I've been making improvements to the randomization scheme, and believe that what is present in 0.5.9 results in a good distribution. I'll leave this open for a bit in case you have near-term feedback.
Best Regards, Matthew
@mballance,
This is a good improvement. The original example (above) now produces a good distribution, as do several other tests that I wrote. All parts of the random spaces are sampled.
However, I do have a test where the distribution is unexpectedly skewed. I've included the code and its coverage results below. In this test the random property "disp" should be evenly distributed across the range [0 - 4095], and also evenly distributed across the range [8192 - 16383]. Instead, the sub-range [8192 - 12287] is sampled about 3x more often than the sub-range [12288 - 16383]. Do you understand why this happens? Is there a reasonable fix to get a better distribution?
import vsc

@vsc.randobj
class BranchInstr:
    def __init__(self):
        self.type = vsc.rand_bit_t(1)
        self.disp = vsc.rand_bit_t(22)

    @vsc.constraint
    def short_offset_cnstr(self):
        with vsc.if_then(self.type == 0):
            self.disp < 4096
        with vsc.else_then:
            self.disp >= 8192
            self.disp < 16384

    def __str__(self):
        return(f"type = {self.type}, displacement = {self.disp}")

@vsc.covergroup
class BranchInstr_cg:
    def __init__(self):
        self.with_sample(item = BranchInstr())
        self.type = vsc.coverpoint(self.item.type)
        self.disp_cp = vsc.coverpoint(self.item.disp, bins = {
            "type0_disp" : vsc.bin_array([16], [0, 4095]),
            "type1_disp" : vsc.bin_array([32], [8192, 16383])
        })

branchInstr = BranchInstr()
branchInstr_cg = BranchInstr_cg()

for i in range(1024):
    branchInstr.randomize()
    branchInstr_cg.sample(branchInstr)
    # print(branchInstr)

vsc.report_coverage(details=True)
TYPE BranchInstr_cg : 100.000000%
CVP type : 100.000000%
Bins:
type[0] : 481
type[1] : 543
CVP disp_cp : 100.000000%
Bins:
type0_disp[0] : 16
type0_disp[1] : 33
type0_disp[2] : 29
type0_disp[3] : 25
type0_disp[4] : 33
type0_disp[5] : 27
type0_disp[6] : 39
type0_disp[7] : 31
type0_disp[8] : 25
type0_disp[9] : 27
type0_disp[10] : 23
type0_disp[11] : 40
type0_disp[12] : 38
type0_disp[13] : 32
type0_disp[14] : 29
type0_disp[15] : 34
type1_disp[0] : 30
type1_disp[1] : 34
type1_disp[2] : 28
type1_disp[3] : 32
type1_disp[4] : 23
type1_disp[5] : 20
type1_disp[6] : 26
type1_disp[7] : 19
type1_disp[8] : 24
type1_disp[9] : 26
type1_disp[10] : 35
type1_disp[11] : 21
type1_disp[12] : 34
type1_disp[13] : 32
type1_disp[14] : 23
type1_disp[15] : 28
type1_disp[16] : 8
type1_disp[17] : 4
type1_disp[18] : 8
type1_disp[19] : 3
type1_disp[20] : 5
type1_disp[21] : 9
type1_disp[22] : 7
type1_disp[23] : 9
type1_disp[24] : 8
type1_disp[25] : 2
type1_disp[26] : 8
type1_disp[27] : 4
type1_disp[28] : 8
type1_disp[29] : 9
type1_disp[30] : 11
type1_disp[31] : 5
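For reference, the skew can be quantified directly from the type1_disp counts in the report above. This is only a minimal sketch: it assumes the 32 bins split [8192, 16383] evenly, and the variable names are mine rather than anything from PyVSC.

```python
# Hit counts for type1_disp[0] .. type1_disp[31], copied from the report above.
type1_bins = [30, 34, 28, 32, 23, 20, 26, 19, 24, 26, 35, 21, 34, 32, 23, 28,
              8, 4, 8, 3, 5, 9, 7, 9, 8, 2, 8, 4, 8, 9, 11, 5]

# Assumption: the bins divide the range [8192, 16383] into equal-width slices.
bin_width = (16383 - 8192 + 1) // len(type1_bins)
midpoint = 8192 + bin_width * (len(type1_bins) // 2)

lower_hits = sum(type1_bins[:16])   # samples falling below the midpoint
upper_hits = sum(type1_bins[16:])   # samples at or above the midpoint

print(f"bin width: {bin_width}, midpoint: {midpoint}")
print(f"lower half: {lower_hits}, upper half: {upper_hits}")
print(f"lower/upper ratio: {lower_hits / upper_hits:.2f}")
```

In this particular run the lower half holds 435 of the 543 type-1 samples and the upper half holds 108, so the imbalance is somewhat larger than my rough estimate.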
I'm guessing it is not ordering the solve order here like you would expect. So half the time it solves disp first and always uses the higher range since type is unsolved. The other half of the time it does the coin flip. This would lead to the 1:3 ratio mentioned.
This is just speculation, but maybe specifying the solve order would help fix it.
@qzcx, I ran further experiments with explicit solve order, but that did not change the skewed distribution. I tried both vsc.solve_order(self.type, self.disp) and vsc.solve_order(self.disp, self.type).
@mballance, I think you have fixed the original issue, even if there is still a lesser issue that causes skewed distributions. You can close this issue if you wish.
@qzcx, I'll file a separate issue for the new testcase, since it is worth digging into. I understand why the distribution is skewed, and can make a meaningful improvement to the results. That said, I'm still thinking about whether there's a more general approach that might be better.
In case you're interested, here's an explanation of what's happening and what I changed. The new randomization-swizzling code introduces randomness by constraining bit ranges to be equal to the corresponding bits in a random value. Up to 6 ranges are formed. Any value-range constraints that conflict are dropped (i.e., they're soft constraints).
In the example, the 'disp' field is 22 bits wide. With 6 ranges, there are 5 ranges with 3 bits and one range with 7 bits. The initial algorithm always placed the 'overflow' range on the upper bits. In this case, that's the range that is most likely to not match the randomly-selected value. When it does not match, the constraint is dropped and Boolector will typically select '0' for those bits.
I've been experimenting with randomly placing the 'overflow' range on the upper bits and on the lower bits. This results in an improved distribution on this example, but I think there might be a more general approach.
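To make the bit-range arithmetic concrete, here is a rough standalone sketch in plain Python. It is not the PyVSC implementation: the helper names split_bit_ranges and feasible_fraction are hypothetical, and the model only counts which slice patterns are reachable at all for disp in [8192, 16383].

```python
def split_bit_ranges(width, max_ranges=6, overflow_high=True):
    """Split bit positions [0, width) into contiguous (lsb, msb) slices.

    Assumes width >= 1. Leftover bits are absorbed by one 'overflow' slice,
    placed on the upper or the lower bits depending on overflow_high.
    """
    n = min(max_ranges, width)
    base, extra = divmod(width, n)
    sizes = [base] * n
    if overflow_high:
        sizes[-1] += extra
    else:
        sizes[0] += extra
    ranges, lsb = [], 0
    for size in sizes:
        ranges.append((lsb, lsb + size - 1))
        lsb += size
    return ranges


def feasible_fraction(lsb, msb, lo, hi):
    """Fraction of slice bit patterns that some value in [lo, hi] can take."""
    mask = (1 << (msb - lsb + 1)) - 1
    seen = {(v >> lsb) & mask for v in range(lo, hi + 1)}
    return len(seen) / (mask + 1)


for overflow_high in (True, False):
    label = "upper" if overflow_high else "lower"
    print(f"overflow slice on the {label} bits:")
    for lsb, msb in split_bit_ranges(22, overflow_high=overflow_high):
        frac = feasible_fraction(lsb, msb, 8192, 16383)
        print(f"  bits [{msb}:{lsb}]  feasible patterns: {frac:.4f}")
```

Under this simplified model, the 7-bit overflow slice on bits [21:15] can only match a random value when all seven bits are zero (1 pattern in 128), whereas moving the overflow slice to bits [6:0] leaves it unconstrained by the value range.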
Thanks again for raising this case!
Below is an example program that constrains the property "disp" to the range [0, 4096]. However, the generated value is always exactly 4096. This seems related to the if/then/else constraint. If I remove if/then/else and just constrain that disp <= 4096, then the program generates a reasonable distribution.
Now, without the if/then/else:
My PyVSC version is pyvsc-0.5.5.20210822.1