HazyResearch / deepdive

DeepDive
deepdive.stanford.edu

How can I use deepdive to perform unsupervised verification? #608

Open rudaoshi opened 7 years ago

rudaoshi commented 7 years ago

Hi, I have an unlabeled data set in which each sample has an initial confidence. I want to verify it by writing commonsense rules that reduce the confidence of samples that violate them. How can I do this?

1) How can I use the initial confidence? 2) How can I assign weights to the commonsense rules?

Thank you very much!

rudaoshi commented 7 years ago

When I try to define an unsupervised model using "p(x,y)=NULL: ....", the program reports:

column "label" is of type boolean but expression is of type text

The error seems to be caused by a bug in the SQL generation module.

chrismre commented 7 years ago

You may want to check out http://hazyresearch.github.io/snorkel/, which is more directly about weak supervision.

DeepDive can do all this (and much more!). This flexibility means that not all representation decisions are obvious.

Chris

rudaoshi commented 7 years ago

@chrismre Thank you for mentioning another amazing project. I'll look at it.

However, for now I'd like to know how to do this with DeepDive.

chrismre commented 7 years ago

DeepDive can express essentially any factor graph. You'll need to write the rules that create the required factor graph. A default factor graph for this process is described in the Snorkel/data programming paper.
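To make the factor-graph idea concrete, here is a minimal Python sketch (not DeepDive code) of a tiny graph with two Boolean candidates. It assumes the usual log-linear semantics: each initial confidence becomes a unary factor with weight logit(confidence), and one commonsense rule becomes a pairwise factor with a hand-picked negative weight that fires when both candidates are true. The function names and the example numbers are made up for illustration, and the marginals are computed by brute-force enumeration rather than by DeepDive's sampler.

```python
import itertools
import math


def logit(p):
    """Convert a probability into a log-odds weight."""
    return math.log(p / (1.0 - p))


def marginals(priors, rule_weight):
    """Brute-force marginals for two Boolean variables a and b.

    Factors (hypothetical, for illustration only):
      * one unary factor per variable with weight logit(prior),
        active when the variable is true;
      * one pairwise factor with weight rule_weight, active when
        both variables are true (a negative weight penalizes
        worlds where the two candidates co-occur).
    """
    unary = [logit(p) for p in priors]
    totals = [0.0, 0.0]
    z = 0.0
    for a, b in itertools.product([0, 1], repeat=2):
        score = unary[0] * a + unary[1] * b + rule_weight * (a * b)
        w = math.exp(score)
        z += w
        totals[0] += w * a
        totals[1] += w * b
    return [t / z for t in totals]


if __name__ == "__main__":
    # Without the rule, the marginals equal the initial confidences.
    print(marginals([0.9, 0.6], 0.0))
    # With a penalty on both being true, both marginals drop,
    # and the less confident candidate drops the most.
    print(marginals([0.9, 0.6], -3.0))
```

With a rule weight of 0 the marginals reproduce the initial confidences. A negative rule weight lowers both, which is the "reduce the confidence of samples that violate common sense" behavior asked about above.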

Hope that helps! Chris

alldefector commented 7 years ago

@rudaoshi Here is an example rule with manually assigned weights (adapted from the "census" example):

@weight(1.2)
rich(id) :- adult(id, _, workclass, _, _, _, _, _, _, _, _, _, _, _, _, income_bracket).

You can change the rule body (the adult... part) to encode the "common sense" cases.
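As a rough guide to choosing such weights, here is a small Python sketch (again not DeepDive code). It assumes a Boolean variable whose factors each contribute their weight when the variable is true, so the weights add up in log-odds space. Under that assumption a lone weight of 1.2 corresponds to a probability of about 0.77. The helper names are hypothetical.

```python
import math


def weight_to_probability(weight):
    """Probability that a Boolean variable is true when the only
    factors touching it sum to `weight` (logistic function)."""
    return 1.0 / (1.0 + math.exp(-weight))


def combined_probability(weights):
    """Several rules firing on the same variable: weights add."""
    return weight_to_probability(sum(weights))


if __name__ == "__main__":
    print(weight_to_probability(1.2))         # a single rule with weight 1.2
    print(combined_probability([1.2, -2.0]))  # plus a penalizing rule
```

A positive weight pushes the variable toward true, a negative weight pushes it toward false, and a weight of 0 leaves it at 0.5.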

thodrek commented 7 years ago

@rudaoshi Regarding the NULL issue alone, try an explicit cast in DDlog. Writing = NULL::boolean instead of = NULL should take care of the label casting issue.

rudaoshi commented 7 years ago

Thank you very much, everyone. I'll try the solutions you recommended.

rudaoshi commented 7 years ago

@thodrek I got the following error using DeepDive 0.8:

2016-12-21 11:43:00.679659 [error] app.ddlog[176.44] failure: `:-' expected but `:' found
2016-12-21 11:43:00.679741
2016-12-21 11:43:00.679756 has_relation(p1_id, p2_id, relation) = NULL:boolean
2016-12-21 11:43:00.679767
2016-12-21 11:43:00.679947 ^

Does this feature require a newer version cloned from GitHub?

alldefector commented 7 years ago

@rudaoshi Note that the cast uses two colons: NULL::boolean, not NULL:boolean. If that doesn't work, try cast(null as boolean).