ML-KULeuven / problog

ProbLog is a Probabilistic Logic Programming Language for logic programs with probabilities.
https://dtai.cs.kuleuven.be/problog/

Hanging Collective Transitive Rules #78

Open eriq-augustine opened 2 years ago

eriq-augustine commented 2 years ago

Hey All,

I am having some issues with transitive rules and I was wondering if my results were expected or if there may be a bug lurking here.

I have a transitive rule that I have tried to implement in two different ways, both of which have unfavorable outcomes: one consumes all available memory, and the other appears to loop indefinitely.

All methods were run using the CLI from the ProbLog Python package (from PyPI, version 2.2.2):

python -m problog -v test_file.txt

Any help you can provide in getting either of these methods (or any other method of dealing with collective transitivity) would be greatly appreciated.

General Model

This is a small synthetic model with the end goal of inferring whether two people know each other. The data and structure for this model comes from the psl-examples repository: https://github.com/linqs/psl-examples/tree/develop/simple-acquaintances The specific rules that my initial ProbLog rules are based on come from here: https://github.com/linqs/psl-examples/blob/develop/simple-acquaintances/cli/simple-acquaintances.psl

These examples will only use three predicates: knows/2, knows_l1/2, and (in the second method) a knows/3 helper.

In the examples I will provide, I stripped down the rules and data to the smallest set that still causes these issues (I have not fully tested every possible subset of the data since these can take a while, but I have cut it down considerably).

First Method

The first method uses a very straightforward approach to transitivity (with the hopes that memoization will stop cycles from happening).

% Base case: P1 directly knows P2 (level-1 evidence).
knows(P1, P2) :- knows_l1(P1, P2), (P1 =\= P2) .
% Recursive case: P1 knows P2 transitively through OtherPerson.
0.167 :: knows(P1, P2) :- knows_l1(P1, OtherPerson), knows(OtherPerson, P2), (P1 =\= OtherPerson), (P1 =\= P2) .

1.0 :: knows_l1(0, 2) .
% ... More data here.

query(knows(0, 1)) .

The probabilistic facts enumerate through [0, 9] for each argument and exclude self references and the queried pair (0, 1). The full file is attached: problog_transitive_method1.txt. I also attached the output of a profile (py-spy top): problog_transitive_method1_profile_top.txt
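(For anyone reproducing this, here is a minimal sketch of how the elided data block could be generated. It assumes the facts are exactly the non-self pairs over [0, 9] minus the queried pair (0, 1); the output file name is illustrative:)

```python
# Sketch: regenerate the probabilistic facts for the first method.
# Assumption: pairs range over [0, 9], excluding self references and the
# queried pair (0, 1); the file name is illustrative.
with open("problog_transitive_method1_data.txt", "w") as out:
    for i in range(10):
        for j in range(10):
            if i == j or (i, j) == (0, 1):
                continue
            out.write("1.0 :: knows_l1(%d, %d) .\n" % (i, j))
```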

When run, it hangs on the output:

[INFO] Output level: INFO
[INFO] Propagating evidence: 0.0000s
[INFO] Grounding: 0.0396s
[INFO] Cycle breaking: 2.8354s

The process will continue to use ~100% of a core and will keep accumulating memory until there is none left or it is killed. On my machine, this took about 50 GB of RAM and swap combined.

Second Method

The second method uses a commonly recommended pattern for transitivity in Prolog: keeping a list of previously seen nodes.

:- use_module(library(lists)).

% Wrapper: start the traversal with an empty visited list.
knows(P1, P2) :- knows(P1, P2, []) .

% Base case: a direct acquaintance whose edge has not been visited yet.
knows(P1, P2, V) :-
    knows_l1(P1, P2),
    \+ memberchk((P1, P2), V) .

% Recursive case: step through OtherPerson, recording the traversed edge.
knows(P1, P2, V) :-
    knows_l1(P1, OtherPerson),
    \+ memberchk((P1, OtherPerson), V),
    knows(OtherPerson, P2, [(P1, OtherPerson) | V]) .

1.0 :: knows_l1(0, 2) .
% ... More data here.

query(knows(0, 1)) .

The probabilistic facts enumerate through [0, 4] for each argument and exclude self references and the queried pair (0, 1). The full file is attached: problog_transitive_method2.txt. I also attached the output of a profile (py-spy top): problog_transitive_method2_profile_top.txt

When run, it hangs on the output:

[INFO] Output level: INFO
[INFO] Propagating evidence: 0.0000s

So it looks like it is stuck in grounding. The process will continue to use ~100% of a core, but will not use more RAM. I ran this for about 12 hours until I killed it.

rmanhaeve commented 2 years ago

Thank you for the detailed explanation. The programs you provided contain some very cyclical rules. We are aware of some performance issues regarding cycle breaking, and we're looking into it right now. We'll let you know when we have an update.

eriq-augustine commented 2 years ago

Thanks, Robin. Keep me up to date and let me know if you need anything else.

VincentDerk commented 2 years ago

Hello, I looked into it a bit.

First Method

The compilation step, which solves a #P-complete problem, is what causes the long processing time. I tested this with both SDDs (-k sdd) and dSharp (-k ddnnf).
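(For completeness: on the CLI these correspond to e.g. python -m problog -k sdd test_file.txt and python -m problog -k ddnnf test_file.txt.)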

There might be a bug that makes some ground programs unnecessarily large (something Robin and I were already looking into), but I have yet to confirm whether it occurs here too. Regardless, the ProbLog problem you gave seems to scale terribly, and so does the compilation process.

Below are some statistics.

SDD

**Excluding nodes 7 and 8** - using SDD without minimization
ground program size: 167
SDD size (all / live / dead): 148 444 / 54 197 / 94 247
SDD node (all / live / dead): 36 935 / 14 431 / 22 504
SDD compilation time: 0.2904s

**Excluding nodes 7 and 8** - using SDD **with** minimization
ground program size: 167
SDD size (all / live / dead): 17 950 / 10 293 / 7 657
SDD node (all / live / dead): 5 819 / 3 205 / 2 614
SDD compilation time: 1.1790s

**Excluding node 8** - using SDD without minimization
ground program size: 230
SDD size (all / live / dead): 6 363 661 / 1 519 273 / 4 844 388
SDD node (all / live / dead): 1 139 949 / 265 390 / 874 559
SDD compilation time: 27.3208s

**Excluding node 8** - using SDD **with** minimization
ground program size: 230
SDD size (all / live / dead): 252 681 / 180 991 / 71 690
SDD node (all / live / dead): 57 804 / 36 863 / 20 941
SDD compilation time: 50.3182s

**Original problem** - using SDD without minimization
ground program size: 303
_unable to complete_

**Original problem** - using SDD **with** minimization
ground program size: 303
_unable to complete_
dSharp

**Excluding nodes 7 and 8**
ground program size: 167
ground cycle-free program size: 528
CNF vars / clauses: 528 / 1489
compilation time: crashed at 984s

**Excluding node 8**
ground program size: 230
ground cycle-free program size: 1588
CNF vars / clauses: 1588 / 4876
compilation time: _did not bother letting it continue_

**Original problem**
ground program size: 303
ground cycle-free program size: 4548
CNF vars / clauses: 4548 / 14551
compilation time: _did not bother letting it continue_
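(A note on reproducing the two SDD variants: in the analysis script at the end of this comment, the runs "without minimization" use sdd_auto_gc=False; presumably the runs **with** minimization correspond to flipping that flag. A minimal sketch under that assumption, with an illustrative file name:)

```python
from problog.engine import DefaultEngine
from problog.program import PrologFile
from problog.sdd_formula import SDD

# Assumption: sdd_auto_gc=True enables the SDD library's automatic
# garbage collection / minimization during compilation.
engine = DefaultEngine(label_all=True)
db = engine.prepare(PrologFile("problog_transitive_method1.txt"))  # illustrative file name
gp = engine.ground_all(db)
kc = SDD.create_from(gp, engine=engine, sdd_auto_gc=True)
print(kc.evaluate())
```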

Second Method

The issue is with grounding the program (so it never even reaches cycle breaking). Apparently, as can also be seen in the profile report, it spends most of its time in Term's __eq__ function.
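(For anyone who wants to reproduce this hot spot without py-spy, the standard-library profiler gives a similar picture. A minimal sketch, assuming the top-level problog API; the file name is illustrative:)

```python
import cProfile

from problog import get_evaluatable
from problog.program import PrologFile

# Profile the full pipeline; for the second method, most of the grounding
# time should show up under Term.__eq__ (the file name is illustrative).
model = PrologFile("problog_transitive_method2.txt")
cProfile.run("get_evaluatable().create_from(model).evaluate()", sort="cumtime")
```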

Since this is not cycle-breaking related and Robin knows the most about the grounding, I'll reassign this to him. I will revisit the first method to see whether the ground program contains anything unnecessary, but I have to attend to some other tasks first, so it might take a few days.

Code I used to analyse each step

Code

```python
import time

from problog.formula import LogicDAG
from problog.program import PrologFile
from problog.logic import Term
from problog.engine import DefaultEngine
from problog.ddnnf_formula import DDNNF
from problog.cnf_formula import CNF
from problog.sdd_formula import SDD


class Timer(object):
    def __init__(self, msg):
        self.message = msg
        self.start_time = None

    def __enter__(self):
        self.start_time = time.time()

    def __exit__(self, *args):
        print("%s: %.4fs" % (self.message, time.time() - self.start_time))


def main(filename, k):
    model = PrologFile(filename)
    # default label_all=False
    engine = DefaultEngine(label_all=True)
    with Timer("parsing"):
        db = engine.prepare(model)

    print("\n=== Queries ===")
    queries = engine.query(db, Term("query", None))
    print("Queries:", ", ".join([str(q[0]) for q in queries]))

    print("\n=== Evidence ===")
    evidence = engine.query(db, Term("evidence", None, None))
    print("Evidence:", ", ".join(["%s=%s" % ev for ev in evidence]))

    print("\n=== Ground Program ===")
    with Timer("ground"):
        gp = engine.ground_all(db)
    print(gp)

    semiring = None
    if k == 'sdd':
        print("\n=== SDD ===")
        with Timer("SDD"):
            kc = SDD.create_from(gp, engine=engine, semiring=semiring, sdd_auto_gc=False)
        sdd_manager = kc.get_manager().get_manager()
        print(f"SDD size (all / live / dead): {sdd_manager.size()} / {sdd_manager.live_size()} / {sdd_manager.dead_size()}")
        print(f"SDD node (all / live / dead): {sdd_manager.count()} / {sdd_manager.live_count()} / {sdd_manager.dead_count()}")
    else:
        print("\n=== DAG ===")
        with Timer("DAG"):
            dag = LogicDAG.createFrom(gp)
        print(dag)

        print("\n=== DDNNF ===")
        with Timer("DDNNF"):
            cnf = CNF.create_from(dag, engine=engine, semiring=semiring)
            cnf_file_path = "./analysis_file_cnf.out"
            with open(cnf_file_path, "w") as cnf_file:
                cnf_file.write(cnf.to_dimacs())
            print(f"Printed CNF file to {cnf_file_path}")
            kc = DDNNF.create_from(cnf, engine=engine, semiring=semiring)

    print("\n=== Evaluation ===")
    with Timer("Evaluation"):
        result = kc.evaluate(semiring=semiring)
    print(result)


if __name__ == "__main__":
    import argparse

    argparser = argparse.ArgumentParser()
    argparser.add_argument("filename")
    argparser.add_argument("-k")
    args = argparser.parse_args()
    main(**vars(args))
```
eriq-augustine commented 2 years ago

Thanks for the update. Let me know if you need anything on my side.

krishnangovindraj commented 2 years ago

About method 2: I don't think it's too surprising that it's slow. The grounding process grows exponentially with the number of knows_l1/2 facts. SWI-Prolog slows down on this problem too. You can see this by running time(findall(0, knows(0,1), _)). and commenting out a few knows(X,Y) and knows(Y,X) facts (and removing the probability labels, of course).

Tabling doesn't help ProbLog here, as the third argument keeps changing. It's even possible that tabling adds to the slowdown once the tables grow big.
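(To get a feel for that blow-up, here is a back-of-the-envelope sketch that counts ordered node sequences as a proxy for the distinct visited lists the grounder can encounter. This is only a hypothetical proxy, not an exact measure of ProbLog's work:)

```python
import math

# Proxy for the method-2 search space: ordered sequences of k distinct
# nodes out of n, summed over k. This grows factorially in n.
for n in (4, 5, 10):
    states = sum(math.perm(n, k) for k in range(1, n + 1))
    print("n=%d -> %d ordered sequences" % (n, states))
```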

eriq-augustine commented 2 years ago

@krishnangovindraj The exponential grounding time (and the increased time due to memoizing the list) makes sense, but do you think that's all there is to it? I ran this for 12 hours at 4 GHz. Do you think I should have just run it for longer, or is there a bug causing an infinite loop?

eriq-augustine commented 2 years ago

I have let an instance of the second method run for a week now and it is still going. It does slowly increase its memory consumption, by about 0.25 GB a day.

eriq-augustine commented 2 years ago

I was able to run the first method to completion on a larger machine. It took 27 hours and 148 GB, and the output looks like:

[DEBUG] Output level: DEBUG
[INFO] Propagating evidence: 0.0000s
[DEBUG] Grounding query 'knows(0,1)'
[DEBUG] Ground program size: 303
[DEBUG] Propagated evidence: []
[INFO] Grounding: 0.0596s
[DEBUG] Ground program size: 4548
[INFO] Cycle breaking: 2.6656s
[INFO] Clark's completion: 0.0127s
[INFO] DSharp compilation: 100008.6578s
[INFO] Total time: 100011.4259s
knows(0,1): 1
VincentDerk commented 2 years ago

I've finished tracking down a bug today. I hope to revisit this one in a few days, but it might take a while. Something I did already notice, and which is perhaps related, is the following:

The ground program of

:- use_module(library(lists)).
1/2::throw([1]).
t :- throw(L), member(4, L).
query(t).

should be empty, since member(4, L) can never succeed, so query(t) cannot be proven and the probabilistic fact is irrelevant. However, as you can see below, the ground program is not empty; the grounder likely only realised afterwards that the atom was irrelevant. These unnecessary lines (or at least the ones I identified) do disappear when going to the LogicDAG after cycle breaking, so they do not impact the compilation time. I doubt optimizing this will help the grounding time, because the irrelevance is only noticed later on. It does, however, impact memory (and maybe cycle-breaking run time?), so it might be something we can consider looking into more deeply.

=== Ground Program ===
ground: 0.0010s
1: atom(identifier=11, probability=1/2, group=None, name=throw([1]), source=None, is_extra=False)
Queries : 
* t : None [query]

=== DAG ===

Queries : 
* t : None [query]

=== Evaluation ===
{t: 0.0}