starling-lab / BoostSRL

BoostSRL: "Boosting for Statistical Relational Learning." A gradient-boosting based approach for learning different types of SRL models.
https://starling.utdallas.edu
GNU General Public License v3.0

Multiple Targets MLN #43

Open laiprorus opened 3 years ago

laiprorus commented 3 years ago

Hello BoostSRL Team! After reading your paper "Learning Markov Logic Networks via Functional Gradient Boosting" (http://pages.cs.wisc.edu/~tushar/papers/icdm11.pdf), I have been trying to learn an MLN with multiple targets, but so far without much success. How do I set up the learning flags and my data properly? Here is what I have tried.

Toy-Father

I have manually added female and mother predicates. For Tree-Based learning I run the following flags:

-mln -trees 10 -l -train datasets\Father-Mother\train\ -target father,mother

The result is 2 models, but they appear to be the same as if I had learned each target separately. I was hoping to see a Joint Model in which the 2 models would somehow influence each other. For Clause-Based learning I used:

-mln -mlnClause -trees 10 -l -train datasets\Father-Mother\train\ -target father,mother

But the code crashes with the following error (screenshot: Unbenannt1111). It appears that it is generating wrong examples (a mother example for the father target).
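Since the crash seems to involve a mother example being used for the father target, one quick sanity check is to group an example file by predicate before training. This is a minimal Python sketch, assuming examples are stored one ground atom per line ending in a period (for example father(x,y).). The helper names and the constants in the usage note are hypothetical and not part of BoostSRL.

```python
import re
from collections import defaultdict

# Assumed line format: predicate(arg1,arg2,...).  Anything else is skipped.
ATOM = re.compile(r"^\s*([A-Za-z_]\w*)\s*\((.*)\)\s*\.\s*$")


def group_by_predicate(lines):
    """Map each predicate name to the example lines that use it."""
    groups = defaultdict(list)
    for line in lines:
        match = ATOM.match(line)
        if match:
            groups[match.group(1)].append(line.strip())
    return dict(groups)


def missing_targets(lines, targets):
    """Return the targets that have no example at all in the given lines."""
    groups = group_by_predicate(lines)
    return [t for t in targets if t not in groups]
```

For example, calling missing_targets on the lines of a positive-example file with ["father", "mother"] should return an empty list if both targets have examples. It does not explain the crash, it only rules out one data problem.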

Toy-Cancer

I was trying to learn a Joint Model for the Cancer dataset. In the default setting it is meant to learn just the predicate cancer: train_facts contains only the friends and smokes predicates, and train_pos/train_neg contain examples for cancer. How do I set it up to learn all 3 predicates (friends, smokes, cancer) at once? I tried moving all the facts to train_pos/train_neg, or duplicating them in train_facts and train_pos/train_neg (with proper negative examples), but all I got were errors like this (screenshot: Unbenannt18111), in both the Tree-Based and Clause-Based settings. Even trying to learn 2 predicates out of 3 produced similar error messages.
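To make "proper negative examples" concrete, here is a rough Python sketch of how negatives could be generated under a closed-world assumption: every grounding of a target predicate that is not listed as positive is treated as negative. The function names, the tuple representation, and the one-atom-per-line output format are assumptions for illustration, not something taken from BoostSRL.

```python
from itertools import product


def closed_world_negatives(positives, arities, constants):
    """Enumerate all groundings per predicate and keep those not in positives.

    positives: set of (predicate, args_tuple)
    arities:   dict mapping predicate name to its number of arguments
    constants: iterable of constant names (a single type is assumed)
    """
    negatives = []
    for pred, arity in sorted(arities.items()):
        for args in product(sorted(constants), repeat=arity):
            # Note: this also yields self-pairs such as friends(a,a).
            if (pred, args) not in positives:
                negatives.append((pred, args))
    return negatives


def format_atom(pred, args):
    """Render a ground atom as a single example line, e.g. smokes(bob)."""
    return f"{pred}({','.join(args)})."
```

With three people and the targets friends, smokes and cancer, this yields at most 9 + 3 + 3 groundings, so the full enumeration is only practical for small domains such as Toy-Cancer.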

Cora

In your paper you report results for both the Tree-Based and Clause-Based approaches when learning a Joint Model for the Cora dataset with target predicates SameBib, SameVenue, SameTitle and SameAuthor. I tried Clause-Based learning:

-mln -mlnClause -trees 10 -l -train datasets\Cora\train\ -target sameauthor,samebib,sametitle,samevenue

But after less than a minute I get an error (screenshot: Unbenannt117711).

I looked at the source code and the wiki/documentation and didn't find much on working with multiple targets. Since you got these results in your paper, I really hope that you can help me. Thank you in advance, D. Ravdin

laiprorus commented 3 years ago

Some additional observations. Learning multiple targets with RDN looks quite good. I could even learn ALL predicates for a randomly generated Friends&Smokers dataset, with train_facts.txt being empty. Command-line arguments: -trees 10 -l -train datasets\Cancer\train\ -target friends,cancer,smokes and BK:

useStdLogicVariables: true.
setParam: treeDepth=4.
setParam: nodeSize=2.
setParam: numOfClauses=8.
mode: friends(+Person, -Person).
mode: friends(-Person, +Person).
mode: smokes(+Person).
mode: cancer(+Person).
bridger: friends/2.

I then added

setParam: recursion=true.

mode: recursive_smokes(`Person).
okIfUnknown: recursive_smokes/1.

mode: recursive_cancer(`Person).
okIfUnknown: recursive_cancer/1.

and it worked well! It could find clauses such as smokes(a) <- friends(a,b),smokes(b)
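To spell out what that clause says, here is a small Python sketch that applies smokes(a) <- friends(a,b),smokes(b) once to a set of ground facts. The function name and the example constants are made up for illustration. It only evaluates the clause body and does not reproduce how BoostSRL scores or weights clauses.

```python
def derive_smokes(friends, smokes):
    """Return every a for which some b satisfies friends(a,b) and smokes(b).

    friends: set of (a, b) pairs
    smokes:  set of people known to smoke
    """
    return {a for (a, b) in friends if b in smokes}
```

With friends = {("alice", "bob"), ("bob", "carol")} and smokes = {"bob"}, the clause fires only for alice, because bob's single friend carol is not known to smoke.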

Therefore I think I did set up the learning parameters and dataset properly.
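The recursion lines I added by hand follow a fixed pattern, so for more unary targets they could be generated. This is a sketch only: the function name is hypothetical, it covers unary predicates exactly as in my BK above, and I have not checked what the right mode would be for a binary target such as friends.

```python
def recursive_mode_lines(unary_targets):
    """Build the recursion lines for unary target predicates.

    unary_targets: dict mapping predicate name to its argument type,
                   e.g. {"smokes": "Person"}
    """
    lines = ["setParam: recursion=true."]
    for name, arg_type in unary_targets.items():
        # Same shape as the hand-written lines: backquoted type, arity 1.
        lines.append(f"mode: recursive_{name}(`{arg_type}).")
        lines.append(f"okIfUnknown: recursive_{name}/1.")
    return lines
```

Joining the returned lines with newlines for {"smokes": "Person", "cancer": "Person"} gives the same five directives I added manually (without the blank lines).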

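But sadly nothing worked as soon as I tried to learn an MLN using the same parameters and data and just adding the -mln flag. The errors are similar to those above, such as this one (screenshot Unbenannt11: https://user-images.githubusercontent.com/33106132/125627205-b7776a48-1feb-42c8-9cd5-02730e5f6e5b.JPG).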

Here is the dataset from the run when this error happened: Toy-Cancer-All.zip (https://github.com/starling-lab/BoostSRL/files/6816237/Toy-Cancer-All.zip), with command-line arguments -mln -trees 10 -l -train datasets\Toy-Cancer-All\train\ -target friends,cancer,smokes

boost-starai commented 3 years ago

Thanks Dmitri.

I am checking this and will get back to you soon. We had joint learning set up in one of the versions, and I want to track that down. I'll do that by the weekend.

On Jul 14, 2021 8:10 AM, Dmitriy @.***> wrote: (the previous comment, quoted in full)