anhaidgroup / py_entitymatching

BSD 3-Clause "New" or "Revised" License

Enhan down sample seed #78

Closed pjmartinkus closed 6 years ago

pjmartinkus commented 7 years ago

A seed parameter has been added to down_sample so that its random sampling can be made reproducible.

The seed defaults to None. An if statement checks whether seed is None: if it is not, a new RandomState is created with the user's seed; otherwise a new RandomState is created without a seed. The choice function is now called on this RandomState instance instead of on np.random, so the tuples selected from table A are determined by the seed.
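For reviewers, the table A logic described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the function name `sample_row_positions` and its `n_rows` and `size` parameters are hypothetical, and sampling without replacement is an assumption made for the example.

```python
import numpy as np


def sample_row_positions(n_rows, size, seed=None):
    """Pick `size` distinct row positions out of `n_rows` (hypothetical helper)."""
    if seed is not None:
        # A seed was supplied: build a generator that reproduces the same draws.
        rng = np.random.RandomState(seed)
    else:
        # No seed: build an unseeded generator, so draws differ between runs.
        rng = np.random.RandomState()
    # Call choice on the local generator rather than on the global np.random.
    # replace=False is an assumption for this sketch.
    return rng.choice(n_rows, size, replace=False)


first = sample_row_positions(100, 10, seed=0)
second = sample_row_positions(100, 10, seed=0)
print((first == second).all())
```

Because the generator is a local RandomState, seeding it does not change the state of the global np.random generator used elsewhere.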

Next, a seed parameter, also defaulting to None, was added to _probe_index to provide the same seed functionality for selecting tuples from table B. Here, an if statement checks whether seed is not None and, if so, calls random.seed with the user's seed as the argument. This sets the seed used by the subsequent random.randint calls that select tuples from table B.
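The table B side can be sketched in a similar way. Again this is only an illustration under stated assumptions: `pick_candidate_positions`, `n_candidates`, and `k` are hypothetical names, and the real _probe_index does more than draw positions.

```python
import random


def pick_candidate_positions(n_candidates, k, seed=None):
    """Draw `k` positions in [0, n_candidates - 1] (hypothetical helper)."""
    if seed is not None:
        # Seed the module-level generator so the randint calls below repeat.
        random.seed(seed)
    # random.randint is inclusive on both ends, hence n_candidates - 1.
    return [random.randint(0, n_candidates - 1) for _ in range(k)]


first = pick_candidate_positions(50, 5, seed=42)
second = pick_candidate_positions(50, 5, seed=42)
print(first == second)
```

One thing to keep in mind with this approach is that random.seed resets the shared module-level generator, so it also affects any other code in the process that draws from the random module afterwards.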