ML-KULeuven / problog

ProbLog is a Probabilistic Logic Programming Language for logic programs with probabilities.
https://dtai.cs.kuleuven.be/problog/

Extremely long computation time for LFI in practical cases #114

Open Zarach opened 2 months ago

Zarach commented 2 months ago

Dear ProbLog Team,

I'm trying to use the Noisy-Or example for a rather practical case, with 4 Topics and 15 words. I use about 1000 examples to learn the parameters where every example has complete evidence about the use of words.

The first iterations are reasonably fast, but each subsequent iteration takes longer than the one before. I'm running the process on a server that is otherwise almost idle while it runs.

Therefore I'm wondering whether this is normal behavior (or whether I'm doing something wrong), and whether it is feasible at all to run Noisy-Or parameter learning on even bigger corpora.
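
For readers unfamiliar with the model, here is a minimal sketch of what a noisy-or combination computes for one topic, given which words are present. The function name, the leak value, and the word weights are hypothetical illustration values, not learned parameters from this issue.

```python
from math import prod


def noisy_or(leak, weights, present):
    """P(class = true | observed words) for one topic under noisy-or.

    leak    -- probability of the body-less rule (hypothetical value)
    weights -- dict mapping a word to the probability of its rule
    present -- the words observed as true in this example
    """
    # The class stays false only if the leak rule and every rule whose
    # word is present all fail to fire independently.
    p_all_fail = (1.0 - leak) * prod(1.0 - weights[w] for w in present)
    return 1.0 - p_all_fail


# Hypothetical parameters for a single topic.
weights = {"w1": 0.3, "w7": 0.5}
p_with_words = noisy_or(0.1, weights, ["w1", "w7"])
p_no_words = noisy_or(0.1, weights, [])
```

With no words present, only the leak rule can make the class true, so the result reduces to the leak probability.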

rmanhaeve commented 2 months ago

Hi Zarach

Could you give an example of a program where this issue occurs?

Zarach commented 2 months ago

Hi rmanhaeve,

here is an example of the program (the formatting tends to strip the underscores in t(_) and word(_)):

0.0001::word(_).

t(_)::class(topic1).
t(_)::class(topic1) :- word(w1).
t(_)::class(topic1) :- word(w2).
t(_)::class(topic1) :- word(w3).
t(_)::class(topic1) :- word(w4).
t(_)::class(topic1) :- word(w5).
t(_)::class(topic1) :- word(w6).
t(_)::class(topic1) :- word(w7).
t(_)::class(topic1) :- word(w8).
t(_)::class(topic1) :- word(w9).
t(_)::class(topic1) :- word(w10).
t(_)::class(topic1) :- word(w11).
t(_)::class(topic1) :- word(w12).
t(_)::class(topic1) :- word(w13).
t(_)::class(topic1) :- word(w14).
t(_)::class(topic1) :- word(w15).

t(_)::class(topic2).
t(_)::class(topic2) :- word(w1).
t(_)::class(topic2) :- word(w2).
t(_)::class(topic2) :- word(w3).
t(_)::class(topic2) :- word(w4).
t(_)::class(topic2) :- word(w5).
t(_)::class(topic2) :- word(w6).
t(_)::class(topic2) :- word(w7).
t(_)::class(topic2) :- word(w8).
t(_)::class(topic2) :- word(w9).
t(_)::class(topic2) :- word(w10).
t(_)::class(topic2) :- word(w11).
t(_)::class(topic2) :- word(w12).
t(_)::class(topic2) :- word(w13).
t(_)::class(topic2) :- word(w14).
t(_)::class(topic2) :- word(w15).

t(_)::class(topic3).
t(_)::class(topic3) :- word(w1).
t(_)::class(topic3) :- word(w2).
t(_)::class(topic3) :- word(w3).
t(_)::class(topic3) :- word(w4).
t(_)::class(topic3) :- word(w5).
t(_)::class(topic3) :- word(w6).
t(_)::class(topic3) :- word(w7).
t(_)::class(topic3) :- word(w8).
t(_)::class(topic3) :- word(w9).
t(_)::class(topic3) :- word(w10).
t(_)::class(topic3) :- word(w11).
t(_)::class(topic3) :- word(w12).
t(_)::class(topic3) :- word(w13).
t(_)::class(topic3) :- word(w14).
t(_)::class(topic3) :- word(w15).

t(_)::class(topic4).
t(_)::class(topic4) :- word(w1).
t(_)::class(topic4) :- word(w2).
t(_)::class(topic4) :- word(w3).
t(_)::class(topic4) :- word(w4).
t(_)::class(topic4) :- word(w5).
t(_)::class(topic4) :- word(w6).
t(_)::class(topic4) :- word(w7).
t(_)::class(topic4) :- word(w8).
t(_)::class(topic4) :- word(w9).
t(_)::class(topic4) :- word(w10).
t(_)::class(topic4) :- word(w11).
t(_)::class(topic4) :- word(w12).
t(_)::class(topic4) :- word(w13).
t(_)::class(topic4) :- word(w14).
t(_)::class(topic4) :- word(w15).
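
Because the model above is completely regular, it can be generated rather than typed, which also makes it easier to try other sizes when reproducing the slowdown. The following is a minimal sketch using only the Python standard library; the function name `build_model` and its parameters are hypothetical, and the output writes the anonymous-variable underscores out explicitly.

```python
def build_model(n_topics=4, n_words=15, word_prior=0.0001):
    """Return the noisy-or LFI model as ProbLog source text.

    All names here are illustrative; only the clause shapes are taken
    from the program posted in this thread.
    """
    lines = [f"{word_prior}::word(_)."]
    for t in range(1, n_topics + 1):
        head = f"t(_)::class(topic{t})"
        # Body-less rule: the topic can be true without any word.
        lines.append(f"{head}.")
        # One learnable rule per word.
        for w in range(1, n_words + 1):
            lines.append(f"{head} :- word(w{w}).")
    return "\n".join(lines) + "\n"


model_text = build_model()
```

Writing `model_text` to a file gives a model with one learnable parameter per topic plus one per topic and word pair, so 64 learnable parameters for 4 topics and 15 words.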

Then there are about 1000 complete examples in the following format:

evidence(class(topic1),true).
evidence(class(topic2),false).
evidence(class(topic3),false).
evidence(class(topic4),false).
evidence(word(w1),true).
evidence(word(w2),false).
evidence(word(w3),false).
evidence(word(w4),false).
evidence(word(w5),false).
evidence(word(w6),false).
evidence(word(w7),true).
evidence(word(w8),false).
evidence(word(w9),false).
evidence(word(w10),false).
evidence(word(w11),false).
evidence(word(w12),false).
evidence(word(w13),false).
evidence(word(w14),false).
evidence(word(w15),false).

Some of them have no evidence for any of the words.
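
For anyone trying to reproduce this, an evidence file of this shape can be produced with a short script. This is a sketch under assumptions: the helper names are hypothetical, each example is assumed to list every topic and every word, and the line of dashes used to separate examples should be checked against the ProbLog LFI documentation for your version.

```python
def format_example(topic, words_present, n_topics=4, n_words=15):
    """Format one fully observed example as evidence/2 facts."""
    facts = []
    for t in range(1, n_topics + 1):
        value = "true" if t == topic else "false"
        facts.append(f"evidence(class(topic{t}),{value}).")
    for w in range(1, n_words + 1):
        value = "true" if w in words_present else "false"
        facts.append(f"evidence(word(w{w}),{value}).")
    return "\n".join(facts)


def format_evidence_file(examples, separator="-----"):
    """Join examples; the separator line is an assumption to verify."""
    blocks = [format_example(topic, words) for topic, words in examples]
    return f"\n{separator}\n".join(blocks) + "\n"


# Two hypothetical examples: topic 1 with words w1 and w7, and
# topic 2 with every word observed as false.
evidence_text = format_evidence_file([(1, {1, 7}), (2, set())])
```

The first example generated here matches the one shown above; replacing the hypothetical list with real corpus data yields the full evidence file.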

rmanhaeve commented 2 months ago

Hi Zarach

Could you also tell me which knowledge compiler you are using? Do you have the PySDD package installed?

Zarach commented 2 months ago

Hi rmanhaeve,

PySDD is installed. I did not choose the compiler explicitly, so I assume SDD is used. In addition, I used the logspace parameter because it made the process at least a bit faster.

rmanhaeve commented 1 month ago

Hi Zarach

I have been looking into this a bit, but I have not been able to pinpoint the cause of the slowdown you've been seeing. I'll label it as a (potential) bug, which we'll have to look into later.

Kind regards, Robin