erdogant / bnlearn

Python library for learning the graphical structure of Bayesian networks, parameter learning, inference and sampling methods.
https://erdogant.github.io/bnlearn

Does Inference/Predict model support parallelism during computing? #53

Open Mikcy1595 opened 2 years ago

Mikcy1595 commented 2 years ago

Hi erdogant,

It is a great library; firstly, thanks for your efforts. I was using a bn model to predict on a large dataset and found that it takes a very long time while the CPU usage stays quite low. Since bnlearn does not have a built-in parallelism option, I used the 'joblib' library to enable all the cores, but that does not seem to work. Would it be possible to integrate parallel computing into the bnlearn library? Do you have any suggestions that might help?

Thank you !

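For reference, a minimal sketch of the joblib approach described above (untested; the model, dataset, and variable names are illustrative only, using bnlearn's sprinkler example):

    import bnlearn as bn
    from joblib import Parallel, delayed

    # Illustrative model: learn structure and parameters on the sprinkler example.
    df = bn.import_example('sprinkler')
    model = bn.structure_learning.fit(df)
    model = bn.parameter_learning.fit(model, df)

    # Each evidence row is an independent query, so the calls can be
    # dispatched over all cores with joblib.
    evidences = [{'Rain': 1, 'Cloudy': 1}, {'Rain': 0, 'Cloudy': 1}]

    def run_query(evidence):
        return bn.inference.fit(model, variables=['Wet_Grass'], evidence=evidence, verbose=0)

    queries = Parallel(n_jobs=-1)(delayed(run_query)(e) for e in evidences)

With the default loky backend each worker receives its own pickled copy of the model, so for cheap per-query inference the serialization overhead can easily cancel out the gain, which may be why this approach did not seem to help here.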

erdogant commented 2 years ago

You are right, this function is quite slow because it runs in a simple for-loop, and your approach will not work for this function. I made a small update that speeds up the for-loop slightly, but it would need to be parallelized to get a real speed-up. If you have any suggestions, let me know.

Update with:

pip install -U bnlearn

    dfU_shape = dfU.shape[1]
    for evidence in tqdm(evidences):
        # Do the inference.
        query = bnlearn.inference.fit(model, variables=variables, evidence=evidence, to_df=False, verbose=0)
        # Find original location of the input data.
        # loc = np.sum((dfX==dfU.iloc[i, :]).values, axis=1)==dfU_shape
        loc = np.sum(dfX.values==[*evidence.values()], axis=1)==dfU_shape
        # Store inference
        P[loc] = _get_prob(query, method=method)
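One possible direction for parallelizing this loop (an untested sketch; it reuses the same names as the snippet above, i.e. model, variables, evidences, dfX, dfU_shape, P, method and the internal _get_prob helper, and assumes the per-evidence queries are independent and the model can be pickled to the workers):

    from joblib import Parallel, delayed
    import numpy as np

    def _single_inference(evidence):
        # One independent query per evidence dict; mirrors the loop body above.
        query = bnlearn.inference.fit(model, variables=variables, evidence=evidence, to_df=False, verbose=0)
        # Find the original location of the input data.
        loc = np.sum(dfX.values == [*evidence.values()], axis=1) == dfU_shape
        return loc, _get_prob(query, method=method)

    # Fan the queries out over all cores, then write the results back into P.
    results = Parallel(n_jobs=-1)(delayed(_single_inference)(e) for e in evidences)
    for loc, prob in results:
        P[loc] = prob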
erdogant commented 1 year ago

Did you also try the numba package? You only need to import the njit decorator and it does the rest.

    from numba import njit

    # @njit
    def predict(model, df, variables, to_df=True, method='max', verbose=3):
        ...
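As a note, numba's nopython mode cannot compile pandas DataFrames or the underlying pgmpy model objects, so decorating predict itself would most likely fail to compile, which may be why @njit is commented out above. Where numba can help is in purely numeric inner steps, for example the row-matching part of the loop (a hedged sketch; match_rows is a hypothetical helper and assumes the data is numerically encoded):

    import numpy as np
    from numba import njit

    @njit
    def match_rows(X, evidence_values):
        # Boolean mask of the rows in X that equal evidence_values exactly.
        n_rows, n_cols = X.shape
        out = np.zeros(n_rows, dtype=np.bool_)
        for i in range(n_rows):
            ok = True
            for j in range(n_cols):
                if X[i, j] != evidence_values[j]:
                    ok = False
                    break
            out[i] = ok
        return out

    # e.g. loc = match_rows(dfX.values.astype(np.int64), np.array([*evidence.values()], dtype=np.int64))

The inference call itself still runs in plain Python, though, so the main speed-up would have to come from parallelizing over the evidence rows rather than from jit-compilation.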