Closed karnatapu closed 5 years ago
You are encountering the exponential explosion in computation time that is common to all Bayesian networks, not just this package. 30 features at 10 states per feature implies 10**30 possible combinations across the joint distribution.
The greedy algorithm is much faster than the exponential algorithm, but it is still slow. The upper limit for the exact algorithm is ~25-28 binary variables. I wouldn't expect the greedy algorithm to be able to quickly handle 30 variables with 10-15 states each. You might try using chow-liu
, which is a greedy tree-building algorithm, if you're looking to shift the balance to faster but less accurate.
I have a data set of 25k rows, 30 features and each feature have 10-15 states. When I run my program using from_samples using greedy search. It is taking more than an hour and still running. Is there any way to tune method? Or am i missing something here.