I made this change to README.md to clarify how to keep results reproducible (this might be useful for paper publication, hyperparameter optimization, and debugging).
Control tree random seed
Even with the same data, a tree (or a forest) generated with rrcf.RCTree() depends on np.random and may differ from run to run, giving a different tree shape and different anomaly scores. To make results reproducible, call numpy.random.seed() first:
import numpy as np
import rrcf

# Seed NumPy's global RNG before making a tree or forest
seed_number = 42  # any fixed integer
np.random.seed(seed_number)
tree = rrcf.RCTree(X)  # X: your data array
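One caveat is that the seed only fixes the starting state of NumPy's global generator, so any other np.random call made between seeding and building the tree can still change the result. The sketch below illustrates this with NumPy alone. The function `draw_cuts` is a hypothetical stand-in for the random draws made during tree construction and is not part of rrcf.

```python
import numpy as np


def draw_cuts(seed, n_cuts=5, extra_draw=False):
    """Hypothetical stand-in for tree construction: draw n_cuts values from np.random."""
    np.random.seed(seed)  # reset NumPy's global RNG to a known state
    if extra_draw:
        np.random.rand()  # any other np.random call advances the global state
    return np.random.rand(n_cuts)


a = draw_cuts(42)
b = draw_cuts(42)
c = draw_cuts(42, extra_draw=True)

same_seed_matches = np.array_equal(a, b)   # identical seed, identical call order
extra_draw_matches = np.array_equal(a, c)  # identical seed, one extra draw first
print(same_seed_matches, extra_draw_matches)
```

The first comparison matches and the second does not, so it is safest to call np.random.seed() immediately before creating the tree or forest, and to keep the order of any random calls in between unchanged across runs.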