Open Mikcy1595 opened 2 years ago
You are right, this function is quite slow because it runs in a simple for-loop. Your approach will not work for this function. I made a small update that speeds up the for-loop slightly, but it may need to be parallelized if you really want to speed it up. If you have any suggestions, let me know.
Update with:
pip install -U bnlearn
dfU_shape = dfU.shape[1]
for evidence in tqdm(evidences):
    # Do the inference.
    query = bnlearn.inference.fit(model, variables=variables, evidence=evidence, to_df=False, verbose=0)
    # Find original location of the input data.
    # loc = np.sum((dfX==dfU.iloc[i, :]).values, axis=1)==dfU_shape
    loc = np.sum(dfX.values==[*evidence.values()], axis=1)==dfU_shape
    # Store inference
    P[loc] = _get_prob(query, method=method)
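Each iteration of the loop above is independent of the others, so one possible way to parallelize it is to split the evidences into chunks and give each chunk to a worker process. Below is a minimal sketch of that idea using only the standard library. The names `split_into_chunks`, `run_chunk`, `parallel_inference` and `toy_infer` are hypothetical and not part of bnlearn, and `toy_infer` is only a stand-in for the real inference call.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split a list into at most n_chunks contiguous, near-equal parts."""
    n_chunks = max(1, min(n_chunks, len(items)))
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks get one additional item.
        stop = start + size + (1 if i < extra else 0)
        chunks.append(items[start:stop])
        start = stop
    return chunks


def run_chunk(infer_one, chunk):
    """Run the per-evidence function over one chunk, sequentially."""
    return [infer_one(evidence) for evidence in chunk]


def parallel_inference(infer_one, evidences, n_workers=4,
                       executor_cls=ProcessPoolExecutor):
    """Apply infer_one to every evidence dict, one chunk per worker.

    Results come back in the same order as `evidences`.
    """
    chunks = split_into_chunks(list(evidences), n_workers)
    with executor_cls(max_workers=n_workers) as pool:
        per_chunk = pool.map(run_chunk, [infer_one] * len(chunks), chunks)
        return [result for chunk in per_chunk for result in chunk]


def toy_infer(evidence):
    """Hypothetical stand-in for the real per-evidence inference call."""
    return evidence["a"] * 2
```

In practice `infer_one` would be a module-level function that wraps `bnlearn.inference.fit` and `_get_prob` for a single evidence dict. With `ProcessPoolExecutor` that function and the model must be picklable, which is an assumption to verify for your model, and on Windows and macOS the call has to sit under an `if __name__ == "__main__":` guard. Passing `executor_cls=ThreadPoolExecutor` is handy for checking the logic, but threads are unlikely to speed up CPU-bound Python code.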
Did you also try the numba package? You only need to import a decorator such as jit, and it will do the rest.
from numba import njit
# @njit
def predict(model, df, variables, to_df=True, method='max', verbose=3):
    ...
Hi erdogant,
It is a great library. Firstly, thanks for your efforts. I was using a bn model to predict on a large dataset. I found that it takes a very long time while the CPU usage stays quite low. The bnlearn library does not have a built-in parallelism function, so I used the 'joblib' library to use all the cores, but it does not seem to work. Is it possible to integrate parallel computing into the bnlearn library? Do you have any suggestions that might help?
Thank you!