starling-lab / BoostSRL

BoostSRL: "Boosting for Statistical Relational Learning." A gradient-boosting based approach for learning different types of SRL models.
https://starling.utdallas.edu
GNU General Public License v3.0

NullPointerException with -mlnClause. #36

Closed monicasenapati closed 4 years ago

monicasenapati commented 4 years ago

https://github.com/starling-lab/BoostSRL/blob/a120db979dd9aaa7d8cf765458fac82bf6ea4e36/src/edu/wisc/cs/will/ILP/ILPouterLoop.java#L1595

I am using BoostSRL to learn on our dataset using the command:

java -jar BoostSRL.jar -l -mln -mlnClause -train train_malicious/ -target malicious

This results in the error:

Exception in thread "main" java.lang.NullPointerException
    at edu.wisc.cs.will.ILP.ILPouterLoop.produceFinalTheory(ILPouterLoop.java:1616)
    at edu.wisc.cs.will.ILP.ILPouterLoop.executeOuterLoop(ILPouterLoop.java:1093)
    at edu.wisc.cs.will.Boosting.RDN.LearnBoostedRDN.getWILLTree(LearnBoostedRDN.java:396)
    at edu.wisc.cs.will.Boosting.RDN.LearnBoostedRDN.learnRDN(LearnBoostedRDN.java:234)
    at edu.wisc.cs.will.Boosting.RDN.LearnBoostedRDN.learnNextModel(LearnBoostedRDN.java:129)
    at edu.wisc.cs.will.Boosting.MLN.RunBoostedMLN.learn(RunBoostedMLN.java:147)
    at edu.wisc.cs.will.Boosting.Common.RunBoostedModels.learnModel(RunBoostedModels.java:77)
    at edu.wisc.cs.will.Boosting.Common.RunBoostedModels.runJob(RunBoostedModels.java:54)
    at edu.wisc.cs.will.Boosting.Common.RunBoostedModels.main(RunBoostedModels.java:220)

However, running the same command without the -mlnClause parameter works:

java -jar BoostSRL.jar -l -mln -train train_malicious/ -target malicious

In other words, it learns MLNs through the tree representation, but not through the clausal representation.

I need to learn MLNs using the clausal representation. Are there any extra parameters that need to be provided with -mlnClause for this to work?
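The two invocations in this comment differ only in the -mlnClause flag, so a small wrapper can try the clausal run first and fall back to the tree representation if it fails. The following is a minimal sketch, not part of BoostSRL: the class name `MlnLearnFallback` and its methods are hypothetical, and it assumes `BoostSRL.jar` sits in the working directory and that an uncaught exception makes the JVM exit with a non-zero status.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of BoostSRL: builds the two command lines
// quoted in this issue and falls back to the tree representation when the
// clausal run exits with a non-zero status.
class MlnLearnFallback {

    // Builds the argument list; the flags are exactly those quoted in the issue.
    static List<String> buildCommand(String trainDir, String target, boolean clausal) {
        List<String> cmd = new ArrayList<>(List.of("java", "-jar", "BoostSRL.jar", "-l", "-mln"));
        if (clausal) {
            cmd.add("-mlnClause");
        }
        cmd.addAll(List.of("-train", trainDir, "-target", target));
        return cmd;
    }

    // Runs a command, forwarding its output to this console, and returns the exit status.
    static int run(List<String> cmd) throws Exception {
        return new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Defaults mirror the values used in the issue; override via arguments.
        String trainDir = args.length > 0 ? args[0] : "train_malicious/";
        String target = args.length > 1 ? args[1] : "malicious";
        if (run(buildCommand(trainDir, target, true)) != 0) {
            System.err.println("Clausal learning failed; retrying with the tree representation.");
            System.exit(run(buildCommand(trainDir, target, false)));
        }
    }
}
```

Keeping command construction separate from process execution makes it easy to check the exact command line before anything is launched.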

nandhiniramanan5 commented 4 years ago

Closing the issue. Refer to the thread below for details.


Hi Nandini,

Thank you so much for the clarification.

Thanks & regards, Monica

On May 24, 2020, at 3:09 PM, Ramanan, Nandini Nandini.Ramanan@utdallas.edu wrote:

Hi Monica,

To answer your question, the tree representation also returns Horn clauses, with similar results. With respect to the learning procedure, clause learning is expected to be more efficient, given that the false branch sometimes introduces existential variables in the relational regression tree (RRT), making inference slow. In regression clauses, we force the weights on the false branch to be 0.

So in your case, you can very well use the tree representation for MLNs. Let me know if you have additional questions.

Regards, Nandini Ramanan
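The distinction drawn in this reply can be illustrated with a toy model. The sketch below is an illustration under simplifying assumptions, not BoostSRL code: a one-split regression tree carries a weight on both its true and false branches, while a regression clause contributes its weight only when its body is satisfied and 0 otherwise. The class `TreeVsClause`, its methods, and the example predicate are all hypothetical.

```java
import java.util.function.Predicate;

// Toy contrast between a one-split regression tree and a regression clause.
// All names here are hypothetical and chosen for illustration only.
class TreeVsClause {

    // One-split regression tree: the test selects one of two branch weights.
    static <T> double treeValue(Predicate<T> test, double trueWeight, double falseWeight, T example) {
        return test.test(example) ? trueWeight : falseWeight;
    }

    // Regression clause: the weight applies only when the body holds;
    // the weight of the "false branch" is fixed at 0.
    static <T> double clauseValue(Predicate<T> body, double weight, T example) {
        return body.test(example) ? weight : 0.0;
    }

    public static void main(String[] args) {
        // Hypothetical condition on an integer-valued example.
        Predicate<Integer> moreThanTen = n -> n > 10;
        System.out.println(treeValue(moreThanTen, 0.8, -0.3, 3));
        System.out.println(clauseValue(moreThanTen, 0.8, 3));
    }
}
```

For an example that fails the test, the tree still returns its false-branch weight, whereas the clause contributes nothing.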

From: Senapati, Monica (UMKC-Student) msenapati@mail.umkc.edu
Sent: Sunday, May 24, 2020 3:00 PM
To: Ramanan, Nandini Nandini.Ramanan@utdallas.edu
Cc: Rao, Praveen praveen.rao@missouri.edu; Natarajan, Sriraam Sriraam.Natarajan@utdallas.edu
Subject: Re: BoostSRL: Issue with MLN learning via clausal representation

Dear Dr. Natarajan,
Thank you so much for connecting me with your students.

Hi Nandini, Thank you for the quick response. The paper mentions learning Horn clauses when MLNs are learnt via the clausal representation. Could you please confirm how the clauses returned when learning via trees differ from those returned when learning via the clausal representation?

Thanks & regards, Monica

On May 24, 2020, at 2:13 PM, Ramanan, Nandini Nandini.Ramanan@utdallas.edu wrote:

Hi Monica,

Thanks for pointing this out to us.

We will look into the error and get back to you once we have a fix. Meanwhile, I would recommend learning tree representations for MLNs, which also return clauses in the end and are more effective. The instructions for learning trees to model MLNs are in the ticket you raised at https://github.com/starling-lab/BoostSRL/issues/34.

Take care, stay safe.

Regards, Nandini Ramanan

From: Natarajan, Sriraam Sriraam.Natarajan@utdallas.edu
Sent: Sunday, May 24, 2020 1:22 PM
To: Senapati, Monica (UMKC-Student) msenapati@mail.umkc.edu
Cc: Rao, Praveen praveen.rao@missouri.edu; Kokel, Harsha hkokel@utdallas.edu; Dhami, Devendra Singh dsd170230@utdallas.edu; Ramanan, Nandini Nandini.Ramanan@utdallas.edu; Kaur, Navdeep Navdeep.Kaur@UTDallas.edu
Subject: RE: BoostSRL: Issue with MLN learning via clausal representation

Dear Monica

Thanks for the note and thanks for your interest in running our code. We have seen this bug before. I am cc’ing some students who are on top of this. I am sure one of them can help you fix this.

Thanks SN

Hello Dr. Natarajan,

I hope this email finds you well. I am a PhD student at the University of Missouri, Kansas City, working under my advisor, Dr. Praveen Rao (cc’ed in the email). This email is in reference to BoostSRL for MLN learning (Tushar Khot, Sriraam Natarajan, Kristian Kersting, Jude Shavlik. Learning Markov Logic Networks via Functional Gradient Boosting. In ICDM 2011). First of all, thank you so much for the well-maintained and accessible GitHub repository, which has allowed us to use the software to take our research forward. I am using the tool for MLN learning on our social media dataset, using the command:

java -jar BoostSRL.jar -l -mln -mlnClause -train train/ -target targetB

We have two target variables (say, targetA and targetB) for which we want to learn MLNs and run inference. Learning through the tree representation works just fine for both targets. However, when I try learning through the clausal representation, using the -mlnClause parameter, it works for targetA and produces its .mln file, but fails to learn any model for targetB. The following exception is thrown on the same facts file and positive and negative examples (which contain both target variables):

Exception in thread "main" java.lang.NullPointerException
    at edu.wisc.cs.will.ILP.ILPouterLoop.produceFinalTheory(ILPouterLoop.java:1616)
    at edu.wisc.cs.will.ILP.ILPouterLoop.executeOuterLoop(ILPouterLoop.java:1093)
    at edu.wisc.cs.will.Boosting.RDN.LearnBoostedRDN.getWILLTree(LearnBoostedRDN.java:396)
    at edu.wisc.cs.will.Boosting.RDN.LearnBoostedRDN.learnRDN(LearnBoostedRDN.java:234)
    at edu.wisc.cs.will.Boosting.RDN.LearnBoostedRDN.learnNextModel(LearnBoostedRDN.java:129)
    at edu.wisc.cs.will.Boosting.MLN.RunBoostedMLN.learn(RunBoostedMLN.java:147)
    at edu.wisc.cs.will.Boosting.Common.RunBoostedModels.learnModel(RunBoostedModels.java:77)
    at edu.wisc.cs.will.Boosting.Common.RunBoostedModels.runJob(RunBoostedModels.java:54)
    at edu.wisc.cs.will.Boosting.Common.RunBoostedModels.main(RunBoostedModels.java:220)

We debugged the code, and it turns out that learning MLNs with the clausal representation does not produce any clauses for targetB. The issue persists even after varying the parameters for the number of clauses and the length of the clauses learnt during each gradient step (parameters “numMLNClause” and “mlnClauseLen”). However, MLN learning with the tree representation produces the trees as well as the model needed for inference. Could you please shed some light on why this issue could be occurring? Also, please let me know if you require additional information.

Thank you for your time. Looking forward to hearing back from you.

Thanks & regards, Monica
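The original email reports that no clauses are learned for targetB and that the run then dies with a NullPointerException in produceFinalTheory. One plausible reading, which the thread does not confirm, is that the final theory is assembled from a clause collection that is null or empty in that case. The sketch below shows only the general shape of a defensive guard for such a situation. It does not reproduce the actual ILPouterLoop code, and `EmptyTheoryGuard` and `finalTheory` are hypothetical names that model a theory as a plain list of clause strings.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of guarding against "no clauses learned".
// This is not BoostSRL code; a theory is modelled as a list of clause strings.
class EmptyTheoryGuard {

    // Returns an empty theory with a readable warning instead of failing
    // when the learner produced no clauses for the target.
    static List<String> finalTheory(List<String> learnedClauses) {
        if (learnedClauses == null || learnedClauses.isEmpty()) {
            System.err.println("No clauses were learned for this target; returning an empty theory.");
            return Collections.emptyList();
        }
        // Defensive copy so later changes to the input do not alter the theory.
        return List.copyOf(learnedClauses);
    }

    public static void main(String[] args) {
        // Simulates the targetB situation described in the email: nothing was learned.
        System.out.println(finalTheory(null).size());
    }
}
```

A guard of this kind would not make clause learning succeed for targetB, but it would replace the stack trace with a message that points at the actual cause.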