linqs / psl-examples

Various examples to showcase the functionality of PSL.

Potentially wrong inference results on Yelp dataset. #17

Closed. Modestas96 closed this issue 3 years ago.

Modestas96 commented 3 years ago

Hi, I was recently analyzing the grounded rule results from running the ./run.sh script with the --satisfaction argument on the Yelp dataset and comparing them against my own implementation of the Łukasiewicz implication operator, and I get some mismatches.

For example, this one:

0.93733215 false ( ~( SIM_CONTENT_ITEMS_JACCARD('2419', '3212') ) | ~( RATED('1538', '3212') ) | ~( RATING('1538', '2419') ) | ~( RATED('1538', '2419') ) | RATING('1538', '3212') ) 0.9905105829238892

Which could be rewritten as:

SIM_CONTENT_ITEMS_JACCARD('2419', '3212') & RATED('1538', '3212') & RATING('1538', '2419') & RATED('1538', '2419') => RATING('1538', '3212')

If we ground each atom according to the files in data/yelp/0/eval we get the following:

1 & 1 & 1.0 & 1 => RATING('1538', '3212')

The MAP estimate of RATING('1538', '3212') is 0.7889187 (according to the inference RATING.txt file)

So the evaluation of the formula should be:

1 & 1 & 1.0 & 1 => 0.7889187, which comes out to ~0.7889187 instead of the reported 0.9905105829238892.

Since SIM_CONTENT_ITEMS_JACCARD and RATED always evaluate to 1, I suspect there might be a problem with grounding the atom RATING('1538', '2419'), which, according to the file data/yelp/0/eval/rating_truth.txt, is equal to 1.0. My guess is that it gets grounded to 0.8 for some reason, but I could be wrong.
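
For reference, here is a minimal Python sketch of how the satisfaction of a grounded Łukasiewicz clause can be computed (satisfaction = min(1, sum of the literal values, counting 1 - value for negated atoms)). The helper name and the two substituted values for RATING('1538', '2419') are assumptions used only to reproduce the computation above, not code from PSL itself:

```python
def clause_satisfaction(positive_values, negated_values):
    """Lukasiewicz truth value of a disjunctive clause:
    min(1, sum of positive-literal values + sum of (1 - v) for negated literals)."""
    total = sum(positive_values) + sum(1.0 - v for v in negated_values)
    return min(1.0, total)

# Grounded rule above, in clause form:
#   ~SIM_CONTENT_ITEMS_JACCARD('2419','3212') | ~RATED('1538','3212')
#   | ~RATING('1538','2419') | ~RATED('1538','2419') | RATING('1538','3212')
head = [0.7889187]  # MAP estimate of RATING('1538', '3212')

# Using the truth value 1.0 for RATING('1538', '2419'), as read from rating_truth.txt:
print(clause_satisfaction(head, [1.0, 1.0, 1.0, 1.0]))  # 0.7889187, the expected result

# Using the hypothesized value of 0.8 for RATING('1538', '2419'):
print(clause_satisfaction(head, [1.0, 1.0, 0.8, 1.0]))  # 0.9889187, close to the reported 0.9905...
```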

eriq-augustine commented 3 years ago

What split and version of PSL are you running?

Modestas96 commented 3 years ago

I'm running split '0' and psl-cli-2.3.0-SNAPSHOT.

eriq-augustine commented 3 years ago

What is the MAP prediction for RATED('1538', '2419')? They are both unobserved in split 0.

Modestas96 commented 3 years ago

Oh, it's ~0.79, so it takes the MAP estimate from there. That makes a lot of sense now; I somehow thought it took the grounding value from the target.txt, which evaluates it to 1.0. Sorry for the inconvenience.
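
As a quick sanity check on this resolution (a back-of-the-envelope calculation, not part of the original exchange), the reported satisfaction is consistent with the unobserved body atom being substituted with a MAP estimate of roughly 0.80 rather than with the truth value 1.0:

```python
# Assuming Lukasiewicz clause semantics: satisfaction = min(1, (1 - body_atom) + head_map).
reported_satisfaction = 0.9905105829238892
head_map = 0.7889187  # MAP estimate of RATING('1538', '3212')

# Back out the value that must have been used for the unobserved body atom:
implied_value = 1.0 - (reported_satisfaction - head_map)
print(implied_value)  # ~0.798, i.e. the ~0.79 MAP estimate, not the 1.0 truth value
```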

eriq-augustine commented 3 years ago

No problem, glad we were able to clear it up.