ML-KULeuven / problog

ProbLog is a Probabilistic Logic Programming Language for logic programs with probabilities.
https://dtai.cs.kuleuven.be/problog/

Strange scoring method in LFI #99

Open plison opened 1 year ago

plison commented 1 year ago

In the parameter learning code (https://github.com/ML-KULeuven/problog/blob/master/problog/learning/lfi.py), I noticed something strange in the `_update` method used to update the weights. In the first part of the method (lines 849-872), a scalar score is computed from the log-probabilities of the evidence. So far, so good: this is in line with an EM-based learning method.
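For concreteness, the first part computes something like the following (a minimal sketch; the name `evidence_score` and the input `evidence_probs` are illustrative, not the actual identifiers in lfi.py):

```python
import math

def evidence_score(evidence_probs):
    """EM-style score: log-likelihood of the observed evidence.

    evidence_probs: the probabilities P(e_i) that the current model
    assigns to each evidence atom e_i.
    """
    return sum(math.log(p) for p in evidence_probs)

# Example: three pieces of evidence with model probabilities 0.9, 0.8, 0.7
print(evidence_score([0.9, 0.8, 0.7]))  # ≈ -0.685
```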

However, further down in the same method, this score is reset to 0 (line 886) and recomputed as the sum of the log-probabilities of the weights, meaning that a low score results whenever the rule weights are low. This does not make sense to me: the loss should be computed from the likelihood of the evidence (as done above), not from the values of the tunable weights. And a ProbLog program that ends up with high weights for its probabilistic facts should not have a higher loss than one with lower weights.
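By contrast, the second part effectively computes something like this (again an illustrative sketch, not the literal code):

```python
import math

def weight_score(weights):
    # Sum of the log-probabilities of the tunable weights themselves.
    # Low weights yield a very negative score, regardless of how well
    # the model explains the evidence.
    return sum(math.log(w) for w in weights)

# A fact with weight 0.1 scores much lower than one with weight 0.9,
# even if 0.1 is the value that best fits the evidence:
print(weight_score([0.1]))  # ≈ -2.303
print(weight_score([0.9]))  # ≈ -0.105
```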

Or did I perhaps misunderstand what this method is doing?

wenchiyang commented 1 year ago

You are right. The `score` in the first part of the method is interpreted as a loss, as in standard EM-based learning. We have removed `score` from the second part. The updated code is here for now and will soon be merged into the current repo.

Here is the intuition behind `score` in the second part. It is not interpreted as a loss but simply as a measure of improvement with respect to the weights: if `score_{t+1} - score_t` is smaller than a threshold (the `delta` argument of the `run` method), learning stops. This is experimental and works well in our limited test cases. But indeed, it could be unstable when there are multiple tunable parameters, for example when the value of the first parameter increases while the value of the second decreases.
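In other words, the stopping rule amounts to something like the following (a sketch; `em_step` is a hypothetical stand-in for one full E-step/M-step update of LFI, and `delta` mirrors the argument of the `run` method):

```python
import math

def run_em(weights, em_step, delta=1e-4, max_iter=100):
    """Iterate EM updates until the weight-based score stops improving.

    em_step: callable mapping a weight vector to the updated weights
    (hypothetical; stands in for one E-step + M-step of LFI).
    delta: minimum improvement in score required to keep iterating.
    """
    prev_score = float("-inf")
    for _ in range(max_iter):
        weights = em_step(weights)
        # The experimental "score" from the second part of _update:
        score = sum(math.log(w) for w in weights)
        if score - prev_score < delta:  # improvement below threshold: stop
            break
        prev_score = score
    return weights
```

With this criterion, opposite movements of two parameters can roughly cancel in the sum, so the loop may stop while individual weights are still changing, which is the instability mentioned above.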