fvutils / pyvsc

Python packages providing a library for Verification Stimulus and Coverage
https://fvutils.github.io/pyvsc
Apache License 2.0

Support for Random Seed #53

Closed aneels3 closed 3 years ago

aneels3 commented 3 years ago

Currently, I am seeing intermittent errors, mostly constraint failures, when running regressions. It would be really great if you could provide a method or switch to set the random seed. That would help me reproduce and debug these failures efficiently and precisely.

Regards, Anil

mballance commented 3 years ago

Hi @aneels3, PyVSC currently uses the seed from the Python random package as the sole seed. At some point, this will likely need to be extended to support multiple contexts (e.g., a seed per class instance). That's beside the point, though.

You can set the seed by setting the seed of the 'random' package. Here is how the tests do it: https://github.com/fvutils/pyvsc/blob/3f58e3a42616edd95db676cdc149dcd38fb241f1/ve/unit/vsc_test_case.py#L7-L16
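As a minimal sketch of what seeding the 'random' package means in practice (standard library only; the helper name `draw_values` is made up for illustration and is not part of PyVSC), reseeding the module-level generator with the same value replays the same sequence:

```python
import random


def draw_values(seed, count=5):
    """Seed the module-level generator, then draw `count` integers."""
    random.seed(seed)
    return [random.randint(1, 100) for _ in range(count)]


first_run = draw_values(100)
second_run = draw_values(100)

# Both lists come from the same seed, so they compare equal.
print(first_run == second_run)
```

The same idea should carry over to PyVSC as described above, provided random.seed() is called before any randomization takes place.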

Best Regards, Matthew

aneels3 commented 3 years ago

Hi @mballance, the above approach works well for simple examples, but not for class-based tests. I have written a test that works for class-based programs:

from random import Random

class my_class:
    def __init__(self):
        self.a = 0
        self.b = 0
        self.my_random_ins = Random(10) # Add seed as argument

class my_class_child(my_class):
    def __init__(self):
        super().__init__()
        self.c = 0
        self.d = 0

    def calculate(self):
        self.c = self.my_random_ins.randint(1,10)  
        self.a = self.my_random_ins.randint(1,100)  

obj = my_class_child()
for i in range(10):
    obj.calculate()
    print("a = ", obj.a, end="")
    print("c = ", obj.c)

I believe PyVSC uses the random module to randomize the class fields: https://github.com/fvutils/pyvsc/blob/3f58e3a42616edd95db676cdc149dcd38fb241f1/src/vsc/model/randomizer.py#L65 Is it possible to add a command-line option here so that the user can select a different seed when running the test?

Regards, Anil

mballance commented 3 years ago

Hi @aneels3, Yes, PyVSC does use the random module in randomizing the class fields. By default, an instance of the Random class uses the global seed. If the user only needs to set a single master seed, this can be done by initializing the global Random seed. In your application, you could add an option to the test-generator program that would initialize the global Random seed. It sounds, though, like you're looking for a way to change the seed during a single run. Is that correct? In other words, have the ability to run randomizations against several seeds during a single Python run. If that's the case, then we might need a way to specify the randomization source for a given class instance. Please confirm if this is what you're looking for, and I'll give it more thought.
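The suggestion of adding an option to the test-generator program could look roughly like the sketch below. The `--seed` flag and the `init_seed` helper are assumptions made for illustration; neither comes from PyVSC or from the generator discussed here.

```python
import argparse
import random


def init_seed(argv=None):
    """Parse a hypothetical --seed option and seed the global generator."""
    parser = argparse.ArgumentParser(description="test generator (sketch)")
    parser.add_argument("--seed", type=int, default=None,
                        help="master seed; one is chosen if omitted")
    args = parser.parse_args(argv)

    seed = args.seed
    if seed is None:
        # No seed given: pick one and report it so the run can be replayed.
        seed = random.SystemRandom().randrange(2**32)

    # Must happen before the first randomization call.
    random.seed(seed)
    print("master seed:", seed)
    return seed


# An explicit argument list stands in for the real command line here.
seed = init_seed(["--seed", "100"])
```

Printing the chosen seed in every run is a deliberate choice in this sketch: a failing regression can then be replayed by passing the logged value back through the option.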

Thanks, Matthew

aneels3 commented 3 years ago

Hi @mballance, I have modified the test to see the behaviour of multiple classes that use the same seed for different Random instances. It seems that the randomized results are the same in both cases.

from random import Random

class my_class:
    def __init__(self):
        self.a = 0
        self.b = 0
        self.my_random_ins = Random(10) # Add seed as argument

class my_class_child(my_class):
    def __init__(self):
        super().__init__()
        self.c = 0
        self.d = 0

    def calculate(self):
        self.a = self.my_random_ins.randint(1,20)  
        self.b = self.my_random_ins.randint(1,200)  

class my_new_class:
    def __init__(self):
        self.a = 0
        self.b = 0
        self.rng = Random(10)

    def calculate(self):
        self.a = self.rng.randint(1,20)  
        self.b = self.rng.randint(1,200)

obj = my_class_child()
obj1 = my_new_class()
for i in range(5):
    obj.calculate()
    obj1.calculate()
    print("obj a = ", obj.a, end="")
    print(" obj b = ", obj.b)
    print("obj1 a = ", obj1.a, end="")
    print(" obj1 b = ", obj1.b)
    print("\n")

The output of this test is: image
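The observation above, that two separate Random instances created with the same seed produce the same sequence, can be checked with a small standalone sketch. It uses only the standard library; the helper name `draw` is made up for illustration. It also shows that a separate Random instance keeps its own state, so reseeding the module-level generator does not affect it.

```python
import random
from random import Random


def draw(rng, count=5):
    """Draw `count` integers from the given generator instance."""
    return [rng.randint(1, 20) for _ in range(count)]


# Two independent instances with the same seed replay the same sequence.
seq_a = draw(Random(10))
seq_b = draw(Random(10))
print(seq_a == seq_b)

# Reseeding the module-level generator leaves a separate instance untouched.
rng_c = Random(10)
random.seed(3)
seq_c = draw(rng_c)
print(seq_c == seq_a)
```

Both comparisons print True. This is why random.seed() only helps when the code doing the randomization draws from the module-level generator rather than from its own Random instance.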

Regarding your question: the first case you described is what I am looking for. I want to set a master seed for a single run and reuse that seed later when I need to reproduce the same results. I believe randomization of PyVSC rand-type variables is done through object.randomize(), so it would help if there were a way to pass the master seed to the randomize() method so that all the variables associated with it generate the same results, or perhaps a separate function such as vsc.seed() to set the global master seed.

I have initialized the global seed using random.seed(3) in the main run module of our test generator, but the results still differ between runs. Can you help me with this?

Thanks and Regards, Anil

mballance commented 3 years ago

Hi @aneels3, Hmm... I've been playing with this and it seems I don't really understand how Python's random package works. I can certainly see the behavior you report -- that you get different results after specifying the same seed via random.seed(). For now, I've reverted to always using the global random instance. This appears to work reliably for me. Please give the latest code a try and let me know.

Best Regards, Matthew

aneels3 commented 3 years ago

Hi @mballance, thanks for the update! I tried the latest pyvsc code and the results are the same as before; I didn't see any difference. Could you share the code snippet you tried, for reference? I might be doing something wrong on my end.

Thanks

mballance commented 3 years ago

Hi @aneels3, Of course. I was using the following script and running interactively:

import vsc
import random

random.seed(100)

@vsc.randobj
class my_c(object):
    def __init__(self):
        self.a = vsc.rand_uint8_t()
        self.b = vsc.rand_uint8_t()

    @vsc.constraint
    def ab_c(self):
        self.a != self.b

my_i = my_c()
for i in range(5):
    v = random.randrange(10)
    my_i.randomize()
    print("v=%d a=%d b=%d" % (v,my_i.a,my_i.b))

Do note that the random.seed() call needs to be prior to the first randomization call for PyVSC.

Best Regards, Matthew

aneels3 commented 3 years ago

Hi @mballance, I have been trying my best to understand this, since it works fine with the test case but not in our generator. I have tried one more complex example and it works fine there too.

import vsc
import random
from enum import IntEnum, auto

random.seed(100)

class data_pattern_t(IntEnum):
    RAND_DATA = 0
    ALL_ZERO = auto()
    INCR_VAL = auto()

@vsc.randobj
class my_base_test(object):
    def __init__(self):
        self.a = vsc.rand_uint8_t()
        self.b = vsc.rand_uint8_t()

    @vsc.constraint
    def ab_c(self):
        self.a != self.b

@vsc.randobj
class my_asm_gen(my_base_test):
    def __init__(self):
        super().__init__()
        self.c = vsc.rand_uint8_t()
        self.d = vsc.rand_uint8_t()

    @vsc.constraint 
    def child_con(self):
        self.a > self.b

@vsc.randobj
class my_stream(my_asm_gen):
    def __init__(self):
        super().__init__()
        self.e = vsc.rand_uint8_t()

    @vsc.constraint 
    def child_con(self):
        self.c > self.d

@vsc.randobj
class my_seq(my_stream):
    def __init__(self):
        super().__init__()

    @vsc.constraint 
    def child_con(self):
        self.e > self.d

@vsc.randobj
class gen_config():
    def __init__(self):
        self.instr_count = vsc.rand_int32_t()
        self.data_page_pattern = vsc.rand_enum_t(data_pattern_t)

    @vsc.constraint
    def default_c(self):
        self.instr_count in vsc.rangelist(vsc.rng(10, 1000))

my_i = my_seq()
gen_obj = gen_config()
gen_obj.randomize()
print("Gen Config is Randomized")
print("Instr_count: ", gen_obj.instr_count)
print("data_page_pattern: ", gen_obj.data_page_pattern)
for i in range(5):
    my_i.randomize()
    print(" a = %d b = %d" % (my_i.a,my_i.b))

In my case, I am adding random.seed(100) here in the base_test class, which is the entry point of the whole program/generator.

However, the results are not reproducible. For now, I am checking the generated gen_table in the log file, which contains most of the rand-type variables, such as main_program_instr_cnt, data_page_pattern, mstatus, mie, sstatus, sie, and so on. These variables take different values each time I run the test.

Could you take a look at this for me? You can run the test a few times with fewer instructions to see the problem.

Thanks and Regards, Anil

mballance commented 3 years ago

Hi @aneels3, While investigating the performance issue, I did note that the results changed across different runs. So, there is an issue here. Looking at the pygen sources, I don't see where the seed is set. Can you point me to that?

Thanks, Matthew

aneels3 commented 3 years ago

Hi @mballance, yes, it had not been added yet. I have just added it here in a separate branch. I have also set the instruction count to 100. Please take a look.

Thanks, Anil

mballance commented 3 years ago

Hi @aneels3, I've gone through the randomization code looking for places where use of set/map may have been adding random instability. I believe I can see this having a positive effect when running the riscv-dv generator (with the addition of a fixed call to set the random seed). Before the changes, I could see that the number of instruction streams randomized during a given run would change across runs. Now, the number is constant. Please give this a try and see if it resolves the random instability for you as well.
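To illustrate how an unordered container can defeat a fixed seed, here is a generic sketch. It is not taken from the PyVSC sources, and the thread does not say which containers were changed; the function names and stream names are hypothetical. For string elements, a set's iteration order can differ between interpreter runs because string hashing is randomized per process unless PYTHONHASHSEED is set, so a seeded choice over that order is not repeatable. Sorting first removes the dependence on iteration order.

```python
import random


def pick_from_set(names, seed):
    """Order-dependent: list(names) follows the set's iteration order."""
    rng = random.Random(seed)
    return rng.choice(list(names))


def pick_sorted(names, seed):
    """Order-independent: sorting fixes the order before the draw."""
    rng = random.Random(seed)
    return rng.choice(sorted(names))


# Hypothetical stream names, used only for this illustration.
streams = {"load_store", "branch", "arith", "csr"}

# May change from one interpreter run to the next, even with a fixed seed.
print(pick_from_set(streams, 100))

# Repeatable across runs for the same seed and the same elements.
print(pick_sorted(streams, 100))
```

Sorting is only one option; an insertion-ordered list or dict gives a stable order as well, as long as the insertion order itself is deterministic.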

Thanks, Matthew

aneels3 commented 3 years ago

Hi @mballance I am still getting different results upon multiple runs. Did you use the same branch to verify this?

mballance commented 3 years ago

Hi @aneels3, I was running with the mainline version of riscv-dv with some manually-added 'seed' statements. I've tried out your branch, and can see the differences in output. I think there are two issues: one easy one, and one that I'll need to investigate further.

I see your initial 'seed' call at the top of riscv_instr_base_test.py. However, it appears that you also need to seed each process created using multiprocessing. Prior to adding seed-initialization in 'run_phase', I was seeing very-significant differences in output -- different numbers of instructions, etc.

    def run_phase(self, num):
        random.seed(100)
        self.randomize_cfg()
        self.asm = riscv_asm_program_gen()
        riscv_instr.create_instr_list(cfg)
        if cfg.asm_test_suffix != "":
            self.asm_file_name = "{}.{}".format(self.asm_file_name,
                                                cfg.asm_test_suffix)

After adding this, the output is significantly less-random (same number of instructions), but there are still differences. I'll continue to investigate, but I suspect there is a set/dict introducing randomness somewhere.
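The per-process seeding described above can be sketched independently of riscv-dv as follows. The names `MASTER_SEED`, `worker`, and `run_workers` are hypothetical, and deriving the seed as master seed plus worker index is a choice made for this sketch; the snippet above simply seeds with a fixed value in run_phase.

```python
import multiprocessing
import random

MASTER_SEED = 100  # hypothetical fixed seed for the whole run


def worker(index):
    """Seed this process's generator before any random draw."""
    # Derive a distinct but repeatable seed for each worker.
    random.seed(MASTER_SEED + index)
    return [random.randint(1, 1000) for _ in range(3)]


def run_workers(count):
    """Run `worker` for each index in a pool of `count` processes."""
    with multiprocessing.Pool(processes=count) as pool:
        return pool.map(worker, range(count))


if __name__ == "__main__":
    # Two separate pools give the same per-worker results.
    print(run_workers(2) == run_workers(2))
```

Because each call seeds the generator at its start, the result does not depend on which pool process happens to pick up which task.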

aneels3 commented 3 years ago

Hi @mballance Thanks for the feedback on this.

mballance commented 3 years ago

Hi @aneels3, I've just published release v0.5.6 which I believe resolves the remaining random-instability issues. When I run pygen from your branch, along with the per-process random.seed(), I receive the same result on multiple runs. Please try the latest release out and confirm whether you also see consistent results.

Best Regards, Matthew

aneels3 commented 3 years ago

@mballance Thanks a lot for the effort. Your recent changes indeed solved the random-instability issue. Thanks again.

Closing the issue.

Regards, Anil