Open ceteri opened 3 years ago
Hey @ceteri,
I need some pointers to understand this requirement better.
Thanks in advance.
Thank you @Ankush-Chander! Here's an idea, if this seems reasonable as an approach?
There are several kinds of modeling, sampling, and inference implemented by pgmpy, although probably our shortest path is to focus on discrete Bayesian networks? This is also one of the top-requested features to add to kglab from our ongoing survey.
Next steps are:
pgmpy which produces known results – which we can use to verify the integration later
kglab.KnowledgeGraph or probably even better for kglab.Subgraph that loads the pgmpy model data from the KG
We can also decide whether to have some additional wrappers for pgmpy and its results. On the one hand, it's great to wrap results into pandas dataframes and other conveniences for data science workflows. On the other hand, it's probably better to allow people to simply use pgmpy operations on the model directly. The latter approach is how we've handled integration of PyTorch, PyVis, etc., i.e., not to intermediate unless there are pain points that need to be corrected (as in SPARQL queries).
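As a rough illustration of what a "known results" baseline could look like, here is a minimal standard-library sketch that computes a posterior on a tiny discrete Bayesian network by brute-force enumeration. It does not use pgmpy's API; the network, its probabilities, and the names `NETWORK`, `joint_probability`, and `query` are all hypothetical, chosen only so that the answer can be checked by hand and later compared against whatever pgmpy returns for the same model.

```python
from itertools import product

# Hypothetical toy network: Rain -> WetGrass <- Sprinkler (all variables binary).
# Each entry maps a variable to (parents, CPT); the CPT maps a tuple of parent
# values to P(variable = 1 | parents). All numbers are made up for illustration.
NETWORK = {
    "Rain": ((), {(): 0.2}),
    "Sprinkler": ((), {(): 0.1}),
    "WetGrass": (
        ("Rain", "Sprinkler"),
        {(0, 0): 0.0, (0, 1): 0.9, (1, 0): 0.8, (1, 1): 0.99},
    ),
}


def joint_probability(assignment):
    """Probability of one full assignment, using the chain rule over the DAG."""
    prob = 1.0
    for var, (parents, cpt) in NETWORK.items():
        p_true = cpt[tuple(assignment[p] for p in parents)]
        prob *= p_true if assignment[var] == 1 else 1.0 - p_true
    return prob


def query(target, evidence):
    """Return P(target = 1 | evidence) by summing over all full assignments."""
    variables = list(NETWORK)
    numerator = denominator = 0.0
    for values in product((0, 1), repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if any(assignment[k] != v for k, v in evidence.items()):
            continue  # inconsistent with the observed evidence
        p = joint_probability(assignment)
        denominator += p
        if assignment[target] == 1:
            numerator += p
    return numerator / denominator


posterior = query("Rain", {"WetGrass": 1})
print(f"P(Rain=1 | WetGrass=1) = {posterior:.4f}")
```

Enumeration scales exponentially with the number of variables, so this is only useful as a hand-checkable reference for small test cases, not as a replacement for pgmpy's inference.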
How does that sound as an approach?
Hey @ceteri
I tried to follow the trail above, but I was not able to find any widely accepted standard RDF representation of Bayesian networks. I will need your help with that.
Once we pinpoint that, we can give users a pathway to move from a standard BN RDF file to KG to pgmpy model. The rest of the operations can be done directly using pgmpy endpoints.
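To make the file => KG => model pathway a bit more concrete, here is a minimal standard-library sketch of the structural half only: extracting parent-to-child edges from triples and checking that they form a DAG. Since no standard RDF vocabulary for Bayesian networks was found, the predicate `ex:influences`, the sample triples, and the helper names are all invented for illustration; plain tuples stand in for a real KG, and conditional probability tables are not covered.

```python
from graphlib import TopologicalSorter

# Hypothetical predicate: there is no agreed RDF vocabulary for Bayesian
# networks, so "ex:influences" is invented for this sketch.
INFLUENCES = "ex:influences"

# Triples as plain (subject, predicate, object) tuples standing in for a KG.
TRIPLES = [
    ("ex:Rain", INFLUENCES, "ex:WetGrass"),
    ("ex:Sprinkler", INFLUENCES, "ex:WetGrass"),
    ("ex:Rain", "rdfs:label", "Rain"),  # unrelated triple, should be ignored
]


def triples_to_edges(triples, predicate=INFLUENCES):
    """Keep only the triples that encode parent -> child dependencies."""
    return [(s, o) for s, p, o in triples if p == predicate]


def topological_order(edges):
    """Return nodes with parents before children.

    Raises graphlib.CycleError (a ValueError subclass) if the edges
    contain a cycle, i.e. they do not describe a valid DAG.
    """
    sorter = TopologicalSorter()
    for parent, child in edges:
        sorter.add(child, parent)  # the child depends on its parent
    return list(sorter.static_order())


edges = triples_to_edges(TRIPLES)
order = topological_order(edges)
print(edges)
print(order)
```

A list of (parent, child) pairs like `edges` is the kind of structure a Bayesian network model is usually built from, but how the probabilities themselves should be represented in RDF remains the open question here.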
Thanks
Hi @Ankush-Chander, good point! The way I described it above, moving from RDF => pgmpy wouldn't work directly, and there's no standard representation.
What I should have described better:
pgmpy, so we have a known baseline to test against
Subgraph classes to transform into pgmpy
If the selected example problem can involve the "progressive example" of recipes used in the tutorial, that would be ideal, although that's not necessary at first for us to build out an integration. The priority is for the initial test case to be simple. We can always construct recipe examples later :)
Does that describe the problem better?
The intention for this is to illustrate how to use a completely different graph technology (Bayesian networks) on graph data, which can complement the other approaches we have with NetworkX, RDFlib, pslpython, PyTorch, etc.
Many thanks, Paco
Hey @ceteri,
Took a while to get my head around Bayesian inferencing.
Here's the test example.
P.S.: The original cancer model, although simple, made some very gloomy assumptions, so I had to choose something positive :) I hope it's simple enough for our purpose.
3. At that point, I'll represent in RDF (as idiomatic as possible; this becomes simpler after RDF-star is available)
Any pointers on step 3 will be helpful for me to continue.
Thanks in advance, Ankush
Wonderful, thank you @Ankush-Chander !
Now I get to wrangle with some RDF representation, hopefully with not too much reification required :)
Integrated pgmpy for statistical inference in Bayesian networks.
Depends on: #26