opcode81 / ProbCog

A toolbox for statistical relational learning and reasoning.
GNU General Public License v3.0

Declaring additional predicates #25

Closed mcubuktepe closed 4 years ago

mcubuktepe commented 4 years ago

Hi,

I was getting familiar with BLNs using the alarm example, and I had trouble defining additional predicates using the blnquery tool.

I was able to run the tool with the inputs in the attached zip file files_alarm.zip, and I get the following output for the query "alarm, burglary" using Gibbs sampling, which is sensible:

    alarm(James): 0.6591 True  0.3409 False
    alarm(Stefan): 1.0000 True  0.0000 False
    burglary(James): 1.0000 True  0.0000 False
    burglary(Stefan): 0.6079 True  0.3921 False

However, if I try to add a new predicate that indicates the timing of the tornado by adding

" type timer; guaranteed timer = {1,2,3}; random boolean tornado_timing(place,timer) "

to the declarations, and adding

" tornado_timing(Freiburg,1)=True tornado_timing(Freiburg,2)=False tornado_timing(Freiburg,3)=False "

to the evidence, I get the following error:

No relational node was found that could serve as the template for the variable tornado_timing(Freiburg,1).

I wanted to figure out what mistake I made when adding the predicate. Thanks a lot.

mcubuktepe commented 4 years ago

I also wanted to add that I got the following exception while trying to run the netEd tool on Ubuntu 18.04 with the following Java version:

    openjdk version "11.0.7" 2020-04-14
    OpenJDK Runtime Environment (build 11.0.7+10-post-Ubuntu-2ubuntu218.04)
    OpenJDK 64-Bit Server VM (build 11.0.7+10-post-Ubuntu-2ubuntu218.04, mixed mode, sharing)

    java.lang.NullPointerException
        at org.eclipse.swt.widgets.TabFolder.gtk_switch_page(Unknown Source)
        at org.eclipse.swt.widgets.Widget.windowProc(Unknown Source)
        at org.eclipse.swt.widgets.Display.windowProc(Unknown Source)
        at org.eclipse.swt.internal.gtk.OS._gtk_widget_show(Native Method)
        at org.eclipse.swt.internal.gtk.OS.gtk_widget_show(Unknown Source)
        at org.eclipse.swt.widgets.TabFolder.createItem(Unknown Source)
        at org.eclipse.swt.widgets.TabItem.createWidget(Unknown Source)
        at org.eclipse.swt.widgets.TabItem.<init>(Unknown Source)
        at edu.ksu.cis.bnj.gui.GUIWindow.run(GUIWindow.java:902)
        at edu.ksu.cis.bnj.gui.GUIThread.run(GUIThread.java:66)
        at org.eclipse.swt.widgets.Synchronizer.syncExec(Unknown Source)
        at org.eclipse.swt.widgets.Display.syncExec(Unknown Source)
        at edu.ksu.cis.bnj.gui.GUIWindow.open(GUIWindow.java:659)
        at edu.ksu.cis.bnj.gui.GUIWindow.open(GUIWindow.java:665)
        at probcog.bayesnets.core.BeliefNetworkEx.show(BeliefNetworkEx.java:691)
        at probcog.BNJ.main(BNJ.java:54)

Thanks a lot!

opcode81 commented 4 years ago

Your fragment network does not contain any nodes for tornado_timing; therefore, the error message is correct. (Is it unclear?) What effect should tornado_timing have on the distribution? If it is to have a probabilistic effect, you need fragments. If you want tornado_timing to be purely an evidence predicate that will occur only in preconditions, for example, then you should declare it as logical rather than random. Declaring a new set of random variables without defining their impact on the distribution won't work.
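For the second option, a minimal sketch of the changed declaration (assuming the logical declaration takes the same form as the random one you already have; check against your model):

    // sketch: evidence-only predicate with no fragment, declared as logical
    // instead of random (form assumed to mirror the random declaration)
    logical boolean tornado_timing(place,timer);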

I have created a new issue for your other message, as it is unrelated.

mcubuktepe commented 4 years ago

Hi, thank you for your response. It is clearer now that I need to add tornado_timing to the PMML file. I have a couple of questions regarding the format.

For instance, I can add the following node and run the blnlearn tool successfully, but it assumes that the weight of the implication is given. Is it possible to leave the weight of the implication open and learn it using the data?

    <DataField name="tornado_timing(pl,timer)" optype="categorical" id="6">
        <Extension>
            <X-NodeType>chance</X-NodeType>
            <X-Position x="136" y="115" />
            <X-Definition>
                <X-Given>1</X-Given> <!-- tornado(pl) -->
                <X-Table>0.5 0.5</X-Table>
            </X-Definition>
        </Extension>
        <Value value="True" />
        <Value value="False" />
    </DataField>

Additionally, the domain of (pl,timer) is a 2x3 array in my example, but when I try to supply an X-Table covering these 6 combinations as follows:

    0.5 0.5 0.8 0.2 0.9 0.1 0.2 0.8 0.1 0.9 0.3 0.7

with 12 values giving the probability of True or False given pl and timer, I get the following exception:

java.lang.ArrayIndexOutOfBoundsException: Tried to add value to index 4 of CPF of tornado_timing(pl,timer); CPF has but 4 entries.

I assume it is easier to specify the network through the GUI tool, but I also wanted to figure out the syntax of the PMML file. Thanks a lot!

opcode81 commented 4 years ago

If pl and timer are variables, then the fragment applies to ALL values of pl and timer - and, having no parents, it can have only two parameters. If you want to have separate probabilities for different values, you will need a separate fragment for each instantiation of the variables - not one fragment with many more parameters. But is this really what you want? I suppose you need to ask yourself what you are ultimately trying to achieve. Adding prior probabilities for tornado_timing only, without tornado_timing influencing anything else, is not going to be particularly useful.
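To illustrate, separate fragments for concrete timer values would look roughly as follows in the PMML - this is only a sketch, assuming constants may be used in fragment node names; the ids and positions are placeholders:

    <!-- sketch: one fragment per concrete timer value, each with its own
         two-entry X-Table; whether constants are allowed in fragment node
         names should be verified against your network -->
    <DataField name="tornado_timing(pl,1)" optype="categorical" id="6">
        <Extension>
            <X-NodeType>chance</X-NodeType>
            <X-Position x="136" y="115" />
            <X-Definition>
                <X-Table>0.5 0.5</X-Table>
            </X-Definition>
        </Extension>
        <Value value="True" />
        <Value value="False" />
    </DataField>
    <DataField name="tornado_timing(pl,2)" optype="categorical" id="7">
        <Extension>
            <X-NodeType>chance</X-NodeType>
            <X-Position x="236" y="115" />
            <X-Definition>
                <X-Table>0.8 0.2</X-Table>
            </X-Definition>
        </Extension>
        <Value value="True" />
        <Value value="False" />
    </DataField>

with a third fragment for timer value 3 defined analogously. (The X-Given parent reference is omitted here; if tornado(pl) is to remain a parent, each table would need four entries instead of two.)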

mcubuktepe commented 4 years ago

Hi,

As a toy example, I want to learn the weight of the following relationship

"tornado_timing(p,1) ^ livesIn(q,p) => alarm_timing(q,1) v alarm_timing(q,2) v alarm_timing(q,3)."

Meaning, if a tornado happens in time step 1 at place "p", and if person "q" lives in "p", then the alarm will be true in at least one of the future time steps. I think it should be possible to express this relationship in the network, but I wanted to ask whether it is. Thanks a lot for your help!

opcode81 commented 4 years ago

BLNs are not designed to learn weighted formulas. However, it seems that you equate learning the weight of the formula with inferring the conditional probability of the consequent given the antecedent (in some model where there is a corresponding dependency). The two are completely different problems - and the former probably isn't one you should want to solve. If you don't know why you shouldn't, you might want to read my paper on "Knowledge Engineering with Markov Logic Networks".

mcubuktepe commented 4 years ago

Hi,

I actually read your paper, and I am aware that the learned weight does not necessarily correspond to the conditional probability. I wanted to figure out whether it is possible to learn the weights of the formulas in BLNs from data instead of defining them manually, since defining them manually may introduce unwanted biases, as you mentioned.

opcode81 commented 4 years ago

When using BLNs, you would learn conditional distributions, not weighted formulas. The conditional distribution "implied" by your formula is not one you would learn, however. Rather, it is one that results from the interplay of other conditional distributions, i.e. one you would infer from more specific ones that make sense to model.
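In generic terms (standard probability notation, nothing ProbCog-specific): the quantity associated with your formula, the probability of the consequent $C$ given the antecedent $A$, is obtained by inference,

$$P(C \mid A) \;=\; \frac{P(A, C)}{P(A)} \;=\; \frac{\sum_{\mathbf{x}} P(A, C, \mathbf{x})}{\sum_{\mathbf{x},\, c} P(A, C{=}c, \mathbf{x})},$$

where the joint distribution factorizes into precisely the conditional distributions (fragments) that you model and learn, and $\mathbf{x}$ ranges over the remaining variables.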

mcubuktepe commented 4 years ago

I understand it in more detail now. I think interpreting conditional distributions is easier than interpreting the effect of learned weights in an MLN; also, BLNs seem to be more scalable than MLNs.