jmschrei / pomegranate

Fast, flexible and easy to use probabilistic modelling in Python.
http://pomegranate.readthedocs.org/en/latest/
MIT License

[QUESTION] Is structure learning from sample data supported in the latest version? #1100

Closed williamagyapong closed 4 months ago

williamagyapong commented 6 months ago

Hello,

In version 0.14.8, it was possible to learn a Bayesian Network structure from sample data using the from_samples method. However, in version 1.0.4, I am unable to access this method from the network instance. If this feature is still supported, could you please advise on how to access it? If it is not currently available, I kindly request that it be included in a future update.

Thank you!

jmschrei commented 6 months ago

You can use the fit method to either fit parameters given a structure, or to learn the structure and parameters jointly if no structure is provided.

williamagyapong commented 6 months ago

Thank you very much for the quick response. I am finding it tricky to rewrite the following code, which works with version 0.14.8, using the fit method of the new API in version 1.0.4. In this code, source and target are tabular datasets:

```python
import pandas as pd
from pomegranate import BayesianNetwork

bn = BayesianNetwork.from_samples(source, algorithm="greedy", n_jobs=1)
sample = pd.DataFrame(bn.sample(n=len(target), algorithm="rejection"), columns=source.columns)
```

Any guidance on how to adapt this code to the new version would be greatly appreciated.

jmschrei commented 4 months ago

Sorry for the late reply.

You need to run BayesianNetwork().fit(X) to do structure learning, where X is a table of integers, each in the range 0 to n_keys - 1. You will have to convert your data to integers on your end, potentially using one of the scikit-learn functions.
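As a rough illustration of the integer conversion mentioned above, here is a minimal sketch in plain Python. It is not part of pomegranate or scikit-learn: `encode_table` is a hypothetical helper, and the toy rows are made up. A scikit-learn encoder such as `OrdinalEncoder` would probably do the same job, though its output is floating point by default and would need casting to integers.

```python
def encode_table(rows):
    """Map each column's values to integer codes 0..k-1.

    Hypothetical helper, not a pomegranate function.
    Returns (codes, categories); categories[j][code] recovers the
    original value in column j.
    """
    n_cols = len(rows[0])
    # Sorted unique values per column give a stable, reproducible coding.
    # This assumes the values within one column are mutually comparable.
    categories = [sorted({row[j] for row in rows}) for j in range(n_cols)]
    lookups = [{value: code for code, value in enumerate(cats)}
               for cats in categories]
    codes = [[lookups[j][row[j]] for j in range(n_cols)] for row in rows]
    return codes, categories


# Made-up example data with two categorical columns.
rows = [["sunny", "no"], ["rainy", "yes"], ["sunny", "yes"]]
codes, categories = encode_table(rows)
print(codes)       # [[1, 0], [0, 1], [1, 1]]
print(categories)  # [['rainy', 'sunny'], ['no', 'yes']]
```

With a DataFrame such as `source` from the question, the rows could be obtained with `source.values.tolist()`. Keeping `categories` around makes it possible to map sampled codes back to the original labels later.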

Once you have the fitted model, you can run model.sample(n=n), where n is the number of samples you'd like.

Note that neither the inputs nor the outputs from the model can be pandas DataFrames. Everything is done on PyTorch tensors.
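Putting the replies together, the workflow might look roughly like the sketch below. Several things here are assumptions rather than confirmed API details: the import path `pomegranate.bayesian_network`, the choice of `torch.int64` for the input tensor, and the helper names `fit_and_sample` and `decode_table`, which are hypothetical. Only `BayesianNetwork().fit(X)` and `model.sample(n=n)` come from the thread.

```python
def fit_and_sample(codes, n):
    """Learn structure and parameters from integer codes, then draw n samples.

    `codes` is a list of rows of non-negative integers (one code per column).
    """
    # Deferred imports, so decode_table below can be used without them.
    import torch
    # Assumed import path for pomegranate 1.x.
    from pomegranate.bayesian_network import BayesianNetwork

    # Inputs must be tensors, not DataFrames; int64 is an assumed dtype.
    X = torch.tensor(codes, dtype=torch.int64)
    model = BayesianNetwork()
    model.fit(X)  # no structure supplied, so structure and parameters are learned
    # The samples come back as a tensor; convert to plain nested lists.
    return model.sample(n=n).tolist()


def decode_table(sampled, categories):
    """Map integer codes back to labels; categories[j] lists column j's labels."""
    return [[categories[j][int(code)] for j, code in enumerate(row)]
            for row in sampled]


# Decoding alone, with made-up category lists:
categories = [["rainy", "sunny"], ["no", "yes"]]
print(decode_table([[1, 0], [0, 1]], categories))
# [['sunny', 'no'], ['rainy', 'yes']]
```

For the code in the question, this would be called along the lines of `sampled = fit_and_sample(codes, n=len(target))`, followed by `pd.DataFrame(decode_table(sampled, categories), columns=source.columns)` to get a DataFrame back. The 0.14.8 arguments `algorithm="greedy"` and `algorithm="rejection"` are left out because the thread does not say whether 1.0.4 has equivalents.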

Please re-open if you continue to have issues.