opcode81 / ProbCog

A toolbox for statistical relational learning and reasoning.
GNU General Public License v3.0

Representing working BLOG model in netEd #2

Closed christiaanw closed 11 years ago

christiaanw commented 11 years ago

I'm interested in using ProbCog for an inference task similar to the one being performed in http://clair.si.umich.edu/~radev/papers/ICMLWS11.pdf .

I'm having trouble representing the predicates with TabularCPDs in the fragments/xml file. The "hidden" variable leadsTo (page 5 of the article) is supposed to be learnt for every Event, Ailment combination. This is achieved via the Has predicate. That predicate, however, contains an existential quantifier, and I don't seem to be able to represent it in a network, even when using the decision nodes (the green boxes) for logical forms as depicted in the two articles by Jain et al.

This is the most reasonable representation I can obtain for the model, but when I provide blnlearn with training data in which each event is exclusively coupled to one ailment, the learnt distribution for leadsTo is still True: 0%, False: 100%.

What am I not getting right here?

opcode81 commented 11 years ago

I cannot see the image you were trying to upload. Please retry.

Also, did you provide any positive data on the leadsTo predicate? Please also provide (an excerpt of) your training data.

christiaanw commented 11 years ago

I've given an assignment for leadsTo in the xml which starts out as a categorical 0.3 True and 0.7 False. I made no assignments for leadsTo in either the .blog (which is just type and function declarations) or the .blogdb.

[Image: out]

I'm using a simple training data set for testing generated using genDB, in which Patients (persons) either have a beer or a soda, and will have a larger probability of getting drunk when having beer. That DB reads like this (excerpt):

have(Opatient103, Beer) = True;
have(Opatient159, Soda) = True;
have(Opatient68, Beer) = True;
have(Opatient231, Soda) = True;
have(Opatient205, Soda) = True;
have(Opatient194, Beer) = True;
status(Opatient29, Drunk) = True;
have(Opatient183, Soda) = True;
status(Opatient83, Drunk) = True;
have(Opatient87, Beer) = True;
have(Opatient129, Soda) = True;
status(Opatient146, Drunk) = True;
status(Opatient223, Drunk) = True;
have(Opatient43, Soda) = True;
have(Opatient114, Soda) = True;
status(Opatient150, Drunk) = True;
status(Opatient99, Drunk) = True;
have(Opatient216, Beer) = True;
status(Opatient64, Drunk) = True;
status(Opatient68, Drunk) = True;
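For readers who want to reproduce a comparable test set, here is a minimal Python sketch that writes statements in the same format as the excerpt. It is not genDB and does not use any ProbCog API; the function name `generate_db` and the probabilities 0.8 and 0.05 are assumptions chosen purely for illustration.

```python
import random


def generate_db(n_patients, p_drunk_beer=0.8, p_drunk_soda=0.05, seed=0):
    """Return database lines in the 'pred(args) = True;' format of the excerpt.

    Hypothetical stand-in for a genDB script: every patient has exactly one
    drink, and beer drinkers are more likely to end up with status Drunk.
    """
    rng = random.Random(seed)
    lines = []
    for i in range(n_patients):
        patient = f"Opatient{i}"
        drink = rng.choice(["Beer", "Soda"])
        lines.append(f"have({patient}, {drink}) = True;")
        p_drunk = p_drunk_beer if drink == "Beer" else p_drunk_soda
        if rng.random() < p_drunk:
            lines.append(f"status({patient}, Drunk) = True;")
    rng.shuffle(lines)  # the excerpt is unordered, so shuffle here as well
    return lines


if __name__ == "__main__":
    print("\n".join(generate_db(250)[:5]))
```

Only positive atoms are written, as in the excerpt; how unmentioned atoms are treated is up to the learner.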

opcode81 commented 11 years ago

Note that blnlearn is a supervised learner, which by default will not use previous values as priors during learning. Therefore, if you do not provide data on leadsTo, it will not be able to learn its distribution. Further note that a decision node (green node) is supposed to be used to differentiate the applicability of probability fragments. (The most complete description of the BLN language can be found here: http://mediatum.ub.tum.de/doc/1096684/1096684.pdf)

However, it seems that leadsTo is irrelevant in your model anyway. If all you want to do is learn the connection between having particular drinks and the resulting status, I would suggest much simpler models, depending on your assumptions about the domain:

  1. If we know at learning time that Soda and Beer are the only two things one can have, which will influence a number of statuses, then I would suggest a model in which "have(p, Beer)", "have(p, Soda)" and "s" are the parents of "status(p, s)". This will directly capture the dependency between having the drinks and the status "Drunk" (as well as other statuses).
  2. If there can be many things one might have, which we do not assume to be known at learning time, then I'd suggest a model in which "have(p, x)", "x" and "s" are the parents of "status(p,s)|x" along with the definition of a combining rule for "status". For example, if we assume all the various x to be independent causes of a status, then the combining rule noisy-or would be appropriate (add the declaration "combining-rule status noisy-or;").
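
To make the noisy-or assumption in the second option concrete, here is a small Python sketch of the rule itself: each cause that is present independently triggers the effect with its own probability, and the effect is absent only if every cause fails. The probabilities and the extra drink "Wine" are made-up illustration values, not taken from any learnt model, and the code does not call ProbCog.

```python
def noisy_or(cause_probs):
    """P(effect) when each active cause independently triggers the effect.

    cause_probs: iterable of P(effect | only that cause is present).
    """
    p_none = 1.0
    for p in cause_probs:
        p_none *= (1.0 - p)  # probability that this cause fails to trigger it
    return 1.0 - p_none


# Hypothetical per-cause probabilities of status "Drunk".
p_drunk_given = {"Beer": 0.8, "Wine": 0.6, "Soda": 0.05}

had = ["Beer", "Wine"]
print(noisy_or(p_drunk_given[x] for x in had))  # roughly 0.92
```

With a single cause the result is just that cause's probability, and with no causes it is 0, which is why the rule suits an open-ended set of independent x.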

christiaanw commented 11 years ago

I've followed your second suggestion, as I'm looking to get a model working in which many antecedents can point to many different results with different probabilities.

I've constructed a toy model in which a Person can have either Milk (which makes them sleepy), Beer (which makes them drunk), or Coffee (which makes them hyper).

This is the BLOG model:

type person;
type event;
type attitude;

guaranteed person Alice, Bob;
guaranteed attitude Drunk, Sleepy, Hyper;
guaranteed event Soda, Beer, Tea, Coffee;

random Boolean Has(person, attitude);
random Boolean have(person, event);

combining-rule Has noisy-or;

I've constructed the network like this:

[Image: beverages1]

And after training on a data set constructed with genDB, in which every person consumes a beverage and that beverage will get them into a single state:

have(Operson182, Coffee) = True;
have(Operson74, Coffee) = True;
Has(Operson215, Hyper) = True;
Has(Operson196, Drunk) = True;
have(Operson195, Milk) = True;
Has(Operson200, Sleepy) = True;
Has(Operson71, Hyper) = True;
have(Operson55, Beer) = True;
Has(Operson25, Drunk) = True;
Has(Operson107, Hyper) = True;
Has(Operson132, Hyper) = True;

I get the following learnt xml:

[Image: beverages2]

In it, Has(p,a)|e shows different Attitude results a for different drink Events e, which is expected. But now I would like to pose queries about new Persons and the drinks they're having, in order to infer the resulting mood. This is where the BLOG model from the paper I'm trying to adapt uses the predicate LeadsTo.

I'm attempting this using a query database like:

have(P1, Beer) = True
have(P2, Milk) = True
have(P3, Tea) = True

(Note: without the semicolons, as shown in the meals example.) Inference cannot obtain a countable sample within the given number of trials, even when I increase maxTrials from 5000 up to 1 million. The network size is 2040 nodes. Is that too big?

In the end I would like to use the scripting approach to be able to consult the model from Python.

opcode81 commented 11 years ago

If you want your model to generalize across persons, you must not learn per-person dependencies, i.e. you must remove the constant node "p" from your model. With the node "p" in place, the model learns distributions specific to each person in your training database, which is not what you want. So remove the node and retrain your model.
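
The effect of conditioning on the person can be illustrated with a small Python sketch that estimates relative frequencies from toy observations. This is a simplification for illustration only, not what blnlearn does internally; the data, the helper `estimate`, and its `key` parameter are all hypothetical.

```python
from collections import Counter

# Hypothetical (person, drink, attitude) observations.
observations = [
    ("Operson1", "Beer", "Drunk"),
    ("Operson2", "Beer", "Drunk"),
    ("Operson3", "Coffee", "Hyper"),
    ("Operson4", "Milk", "Sleepy"),
]


def estimate(observations, key):
    """Relative frequency of each attitude given the context key(person, drink)."""
    joint = Counter()
    context = Counter()
    for person, drink, attitude in observations:
        k = key(person, drink)
        joint[(k, attitude)] += 1
        context[k] += 1
    return {(k, a): n / context[k] for (k, a), n in joint.items()}


# Conditioning on the person ties every table entry to a training individual.
per_person = estimate(observations, key=lambda p, d: (p, d))
# Dropping the person pools the evidence per drink.
pooled = estimate(observations, key=lambda p, d: d)

print(pooled.get(("Beer", "Drunk")))              # 1.0
print(per_person.get((("P1", "Beer"), "Drunk")))  # None: P1 was never seen
```

The per-person table has no entry for the new person P1, while the pooled table answers the same question from the drink alone.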

christiaanw commented 11 years ago

That, and changing the node for have to +have, worked.

opcode81 commented 11 years ago

Great, I'm glad to hear it worked.