ML-KULeuven / problog

ProbLog is a Probabilistic Logic Programming Language for logic programs with probabilities.
https://dtai.cs.kuleuven.be/problog/

InconsistentEvidenceError for evidence with non-beta distributions + support for "nonbinary evidence"? (dcproblog_develop) #39

Open shuvrobiswas opened 4 years ago

shuvrobiswas commented 4 years ago

Hey guys,

Playing around with the dcproblog_develop branch of this repo / continuous variables. The following snippet leads to an 'InconsistentEvidenceError'.

b ~ normal(0, 1).
c :- b > 1.
C::event(N):- C is c.

evidence(event(1), true).
evidence(event(2), false).
evidence(event(3), true).
evidence(event(4), true).

query(event(5)).

The idea here is to update the parameters of b (a normal distribution) based on observations or evidence of an event which is tied to b being greater than 1 - not sure if this is supported or I am using the wrong syntax (i.e. I would imagine b & c are different types above)?

I see the following example in the repo, which is similar. The idea there is to use a beta distribution, whose domain is 0 to 1, to model the probability of an outcome. Can we adapt this so that 'B' comes from functions of other probability distributions?

b~beta(0.5,1).
B::coin_flip(N):- B is b.

evidence(coin_flip(1), true).
evidence(coin_flip(2), false).
evidence(coin_flip(3), true).
evidence(coin_flip(4), true).

query_density(b).
query(coin_flip(5)).
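For reference, here is a minimal Python sketch of the answer one would expect from the snippet above if it is read as a standard Beta-Bernoulli model. That reading is an assumption, and the numbers are the textbook conjugate update, not output produced by ProbLog. The function names are hypothetical.

```python
def beta_update(alpha, beta, outcomes):
    """Conjugate update of a Beta(alpha, beta) prior with Boolean outcomes."""
    successes = sum(1 for outcome in outcomes if outcome)
    failures = len(outcomes) - successes
    return alpha + successes, beta + failures


def predictive_true(alpha, beta):
    """Probability that the next flip is true, i.e. the mean of Beta(alpha, beta)."""
    return alpha / (alpha + beta)


# Prior b ~ beta(0.5, 1) and the four pieces of evidence from the snippet.
alpha, beta = beta_update(0.5, 1.0, [True, False, True, True])
print(alpha, beta)                             # 3.5 2.0
print(round(predictive_true(alpha, beta), 4))  # 0.6364
```

Under that assumption, query_density(b) would correspond to a Beta(3.5, 2) density and query(coin_flip(5)) to roughly 0.64.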

Further, do you guys anticipate supporting evidence that is non-binary, i.e. something along the lines of:

b ~ normal(0, 1). % starting parameters / assumptions for b

evidence(b, 0.5). % i.e. observing that the value of b is a particular value
evidence(b, 0.75). % and updating the parameters of b

^ I know plenty of other probabilistic programming frameworks are out there for this, but would be amazing to have this ability tied in with all the logical aspects of problog / hal_problog, unifying standard probabilistic programming with everything you guys have atm.

Thanks, Shuvro

shuvrobiswas commented 4 years ago

Post holiday bump, happy 2020 y'all!

VincentDerk commented 4 years ago

Happy 2020 to you too!

Pedro is working on the DC branch but I can tell you the source of your error.

I believe you are misusing is/2 in the clause C::event(N) :- C is c. In that clause, c is true or false rather than the number that is/2 expects. The InconsistentEvidenceError is thrown because the probability that the evidence is true is found to be 0. You can check this by querying any event without any evidence. It is 0 because the grounding step resulted in an empty theory. I would actually expect the grounder to throw an ArithmeticError, like the main ProbLog branch and SWI-Prolog do for

c :- 5 > 1.
event(N):- C is c.
query(event(5)).

(https://dtai.cs.kuleuven.be/problog/editor.html#task=prob&hash=43fd0d22606963fae76dd22f8088f9fe) (@pedrozudo)
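Why evidence with probability 0 has to be an error can be sketched in a few lines of Python. This is only an illustration of the conditioning step P(q | e) = P(q, e) / P(e). The exception class and function below are hypothetical stand-ins, not ProbLog's actual implementation.

```python
class InconsistentEvidence(Exception):
    """Hypothetical stand-in for ProbLog's InconsistentEvidenceError."""


def condition(p_query_and_evidence, p_evidence):
    """Return P(query | evidence) given P(query, evidence) and P(evidence)."""
    if p_evidence == 0.0:
        # Dividing by zero is undefined, so the evidence is reported as inconsistent.
        raise InconsistentEvidence("evidence has probability 0")
    return p_query_and_evidence / p_evidence


print(condition(0.125, 0.5))  # 0.25

try:
    # An empty ground theory makes every event false, so P(evidence) is 0.
    condition(0.0, 0.0)
except InconsistentEvidence as error:
    print("error:", error)
```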

Your second question on support for evidence that is non-binary can best be answered by @pedrozudo.

VincentDerk commented 4 years ago

Is it something like this you want?

c~beta(0.5,1).
c2~beta(0.7,1).
C::event(N):- C is c, c > 0.5.
C::event(N):- C is c2, c =< 0.5.

evidence(event(1), true).
evidence(event(2), false).
evidence(event(3), true).
evidence(event(4), true).
query(event(5)).

That is, using the other distribution if some condition is satisfied.

shuvrobiswas commented 4 years ago

Hey @VincentDerk thanks for getting back so soon.

"c is true or false and not a number which I believe it expects"

Understood, in this case what I was looking for is c to be the number representing the probability: P(b>1)

So in that case, the question is if there is some syntax to assign to c the numerical value for P(b>1):

b ~ normal(0, 1).
c :- b > 1.                   % maybe something like    c:- P(b > 1).
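For concreteness, the number being asked for here can be computed outside ProbLog. The Python sketch below evaluates P(b > 1) for b ~ normal(0, 1) using only the standard library. The function name normal_tail is hypothetical and this is not a ProbLog API.

```python
import math


def normal_tail(threshold, mean=0.0, std=1.0):
    """Return P(X > threshold) for X ~ Normal(mean, std)."""
    z = (threshold - mean) / (std * math.sqrt(2.0))
    return 0.5 * math.erfc(z)


# P(b > 1) for b ~ normal(0, 1): the value c would need to carry.
print(round(normal_tail(1.0), 4))  # 0.1587
```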

Regarding the example you give, it is similar to the 2nd code snippet I shared in the original question, and is structurally different from the situation above.

VincentDerk commented 4 years ago

To obtain the probability within the program you need to use subquery/2 (or subquery/3 if you want to give evidence).

Here the probability of event(N) is set to the probability that b > 1:

b ~ normal(0, 1).
c :- b > 1.
Prob::event(N):- subquery(c, Prob).
evidence(event(1), true).
...
query(event(5)).

To obtain a conditional probability you can use subquery/3 like in this or this example. Unfortunately it seems that subquery/2 (and /3) is not quite ready yet in that branch. The correct semiring is not passed on in the evaluation, and the normal Probability Semiring cannot handle continuous variables. I opened a bug report for this: issue #40.

shuvrobiswas commented 4 years ago

Thanks for clarifying @VincentDerk

And yeah it looks like using subquery leads to the following issue: "AttributeError: 'LogicFormula' object has no attribute 'get_density_name'"

I'll wait for this issue to be resolved before proceeding: https://github.com/ML-KULeuven/problog/issues/40

You guys have built a really interesting framework for bringing in domain knowledge into probabilistic programming in a principled / declarative way. There are so many applications of this I can see :)

pedrozudo commented 4 years ago

Hi Shuvro,

About your question on conditioning on continuous random variables: I guess you are looking for something along the lines of this example: https://github.com/ML-KULeuven/problog/blob/dcproblog_develop/problog/tasks/dcproblog/test/pyro/examples/balls_observation.pl

I introduced a new predicate 'observation/2' (instead of 'evidence/2') to distinguish conditioning on Boolean random variables from conditioning on continuous random variables (semantically it is not the same).

Let me know if this does the trick.

shuvrobiswas commented 4 years ago

Hi @pedrozudo thanks for your response.

I believe this example does not address the issue. It is similar to the 2nd example in the original question, which works (i.e. we're using a beta distribution to model the prior probability of something).

The simplest way of describing what I'm looking for may be with an example:

b ~ normal(0, 1).  % b is a normal distribution with starting params: mean 0, std 1
c :- b > 1.
% C::event(N):- C is c.

observation(b(1), 1).  % we observe an instance of b, b==1
observation(b(2), 2).  % we observe further instances of b
observation(b(3), 2).
observation(b(4), 2).

% the posterior of b now has mean: x, std: y, different from the original based on the observations above

% alternatively, evidence of the following form is also fine: evidence(event(1), true), evidence(event(2), false), ...

query(c). % should give the P(b>1 | observations)

This is a minimal example, but in general this ability combined with everything else you guys have should be extremely powerful for many data driven applications.
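As a rough illustration of the kind of update described above, here is a Python sketch of a conjugate normal update. It relies on an assumption that is not stated in the thread: the observations are treated as noisy readings, with known unit noise, of an unknown mean that has a normal(0, 1) prior. Other readings of the program are possible, and none of this is ProbLog or DC-ProbLog output. The function name is hypothetical.

```python
import math


def normal_posterior(prior_mean, prior_std, observations, noise_std):
    """Posterior over an unknown mean with a normal prior and known noise."""
    prior_precision = 1.0 / prior_std ** 2
    noise_precision = 1.0 / noise_std ** 2
    posterior_precision = prior_precision + len(observations) * noise_precision
    posterior_mean = (
        prior_precision * prior_mean + noise_precision * sum(observations)
    ) / posterior_precision
    return posterior_mean, math.sqrt(1.0 / posterior_precision)


# Observations 1, 2, 2, 2 from the example; noise_std=1.0 is an assumption.
mean, std = normal_posterior(0.0, 1.0, [1.0, 2.0, 2.0, 2.0], noise_std=1.0)
print(round(mean, 4), round(std, 4))  # 1.4 0.4472

# Probability that the unknown mean exceeds 1 under this posterior.
tail = 0.5 * math.erfc((1.0 - mean) / (std * math.sqrt(2.0)))
print(tail)
```

Under these assumptions the mean and standard deviation both move away from the prior values of 0 and 1, which is the behaviour the comments in the example ask for.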

shuvrobiswas commented 4 years ago

@pedrozudo just to confirm, I believe @VincentDerk has correctly identified the underlying issue and opened a separate thread for it here: https://github.com/ML-KULeuven/problog/issues/40 - let me know if there is anything I can help with here