pm4py / pm4py-core

Public repository for the PM4Py (Process Mining for Python) project.
https://pm4py.fit.fraunhofer.de
GNU General Public License v3.0

pm4py.conformance.fitness_alignments crashing when using multiprocessing #395

Closed raseidi closed 1 year ago

raseidi commented 1 year ago

The following error is raised when using the fitness function:

concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

I get the error when measuring the alignment-based fitness of a process model discovered from the BPI Challenge 2019 log. Because the event log is large and has long traces, using a single thread is also infeasible.

I also tried manually setting a lower number of threads in the source code, but it still failed. I am using a laptop with an i7 (20 cores) and 32 GB of RAM.

fit-alessandro-berti commented 1 year ago

Since the model is unknown, we'd ask you to perform some additional checks:

1) Is your code contained in an if __name__ == "__main__": block?
2) Can you try removing cvxopt (pip uninstall cvxopt) and see whether it still crashes? cvxopt is known to behave strangely with multiprocessing.
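For reference, the guard asked about in the first check can be sketched as follows. This is a minimal illustration using only the standard library, not pm4py's own code: trace_length and run_in_pool are hypothetical stand-ins for the expensive per-trace alignment work. The point is that anything that starts worker processes is only reached from inside the guard, so a worker that re-imports the script does not start a pool of its own.

```python
from concurrent.futures import ProcessPoolExecutor


def trace_length(trace):
    # Hypothetical stand-in for an expensive per-trace computation.
    return len(trace)


def run_in_pool(traces, workers=2):
    # Distribute the traces over a small pool of worker processes.
    with ProcessPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(trace_length, traces))


def main():
    traces = [["a", "b"], ["a", "b", "c"], ["a"]]
    print(run_in_pool(traces))


# Without this guard, platforms that start workers by re-importing the
# script (the "spawn" start method) would execute main() again in every
# worker process.
if __name__ == "__main__":
    main()
```

Running code at module level instead of under the guard is a common cause of pool failures on platforms that use the "spawn" start method; on Linux, where "fork" has been the default, the guard is still good practice.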

raseidi commented 1 year ago

I am using the inductive miner.

  1. Yes.
  2. I removed it from my environment and I got a new error:
    Traceback (most recent call last):
      File "<stdin>", line 145, in <module>
      File "/home/seidi/miniconda3/envs/py38/lib/python3.8/site-packages/pm4py/conformance.py", line 246, in fitness_alignments
        return replay_fitness.apply(log, petri_net, initial_marking, final_marking,
      File "/home/seidi/miniconda3/envs/py38/lib/python3.8/site-packages/pm4py/algo/evaluation/replay_fitness/algorithm.py", line 94, in apply
        return exec_utils.get_variant(variant).apply(log,
      File "/home/seidi/miniconda3/envs/py38/lib/python3.8/site-packages/pm4py/algo/evaluation/replay_fitness/variants/alignment_based.py", line 121, in apply
        alignment_result = alignments.apply_multiprocessing(log, petri_net, initial_marking, final_marking, variant=align_variant,
      File "/home/seidi/miniconda3/envs/py38/lib/python3.8/site-packages/pm4py/algo/conformance/alignments/petri_net/algorithm.py", line 257, in apply_multiprocessing
        best_worst_cost = __get_best_worst_cost(petri_net, initial_marking, final_marking, variant, parameters)
      File "/home/seidi/miniconda3/envs/py38/lib/python3.8/site-packages/pm4py/algo/conformance/alignments/petri_net/algorithm.py", line 291, in __get_best_worst_cost
        best_worst_cost = exec_utils.get_variant(variant).get_best_worst_cost(petri_net, initial_marking, final_marking,
      File "/home/seidi/miniconda3/envs/py38/lib/python3.8/site-packages/pm4py/algo/conformance/alignments/petri_net/variants/state_equation_a_star.py", line 100, in get_best_worst_cost
        best_worst = apply(trace, petri_net, initial_marking, final_marking, parameters=parameters)
      File "/home/seidi/miniconda3/envs/py38/lib/python3.8/site-packages/pm4py/algo/conformance/alignments/petri_net/variants/state_equation_a_star.py", line 169, in apply
        alignment = apply_trace_net(petri_net, initial_marking, final_marking, trace_net, trace_im, trace_fm, parameters)
      File "/home/seidi/miniconda3/envs/py38/lib/python3.8/site-packages/pm4py/algo/conformance/alignments/petri_net/variants/state_equation_a_star.py", line 380, in apply_trace_net
        alignment = apply_sync_prod(sync_prod, sync_initial_marking, sync_final_marking, cost_function,
      File "/home/seidi/miniconda3/envs/py38/lib/python3.8/site-packages/pm4py/algo/conformance/alignments/petri_net/variants/state_equation_a_star.py", line 410, in apply_sync_prod
        return __search(sync_prod, initial_marking, final_marking, cost_function, skip,
      File "/home/seidi/miniconda3/envs/py38/lib/python3.8/site-packages/pm4py/algo/conformance/alignments/petri_net/variants/state_equation_a_star.py", line 444, in __search
        h, x = utils.__compute_exact_heuristic_new_version(sync_net, a_matrix, h_cvx, g_matrix, cost_vec, incidence_matrix,
      File "/home/seidi/miniconda3/envs/py38/lib/python3.8/site-packages/pm4py/objects/petri_net/utils/align_utils.py", line 267, in __compute_exact_heuristic_new_version
        sol = lp_solver.apply(cost_vec, g_matrix, h_cvx, a_matrix, b_term, parameters=parameters_solving,
      File "/home/seidi/miniconda3/envs/py38/lib/python3.8/site-packages/pm4py/util/lp/solver.py", line 115, in apply
        return VERSIONS_APPLY[variant](c, Aub, bub, Aeq, beq, parameters=parameters)
      File "/home/seidi/miniconda3/envs/py38/lib/python3.8/site-packages/pm4py/util/lp/variants/scipy_solver.py", line 37, in apply
        sol = linprog(c, A_ub=Aub, b_ub=bub, A_eq=Aeq, b_eq=beq, method=method, integrality=integrality)
    TypeError: linprog() got an unexpected keyword argument 'integrality'

fit-alessandro-berti commented 1 year ago

Dear @raseidi

Please update scipy: pip install -U scipy
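To check whether an environment needs this update, a small version test can help. The sketch below assumes that the integrality keyword of scipy.optimize.linprog first appeared in SciPy 1.9.0, which is my understanding and worth verifying against the SciPy release notes. The helper names supports_integrality and installed_scipy_version are made up for this illustration.

```python
from importlib import metadata

# Assumption: linprog() gained the "integrality" keyword in SciPy 1.9.0.
MIN_SCIPY = (1, 9)


def supports_integrality(version):
    # Compare only the major and minor components, e.g. "1.10.1" -> (1, 10).
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    return major_minor >= MIN_SCIPY


def installed_scipy_version():
    # Return the installed SciPy version string, or None if SciPy is absent.
    try:
        return metadata.version("scipy")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_scipy_version()
    if version is None:
        print("SciPy is not installed")
    elif supports_integrality(version):
        print("SciPy", version, "should accept the integrality keyword")
    else:
        print("SciPy", version, "is too old; run: pip install -U scipy")
```

Comparing integer tuples rather than raw strings matters here, because a string comparison would rank "1.10" below "1.9".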

raseidi commented 1 year ago

@fit-alessandro-berti the original problem persists.

fit-alessandro-berti commented 1 year ago

After investigation, I think the problem is memory consumption caused by the large number of states visited (yes, it can reach 32 GB).

We added the "variant_str" parameter to "pm4py.conformance_diagnostics_alignments"; it accepts the name of the variant to be used in multiprocessing. You could try our least memory-intensive variant, "Variant.DIJKSTRA_LESS_MEMORY".
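A call using this parameter might look like the sketch below. It assumes a pm4py release recent enough to include "variant_str", and that "multi_processing" is the keyword that turns on the process pool in that release. The function names align_low_memory and main, and any file path passed to main, are hypothetical. pm4py is imported inside the functions so the module itself can be loaded without it.

```python
# Variant name exactly as quoted in the comment above.
LOW_MEMORY_VARIANT = "Variant.DIJKSTRA_LESS_MEMORY"


def align_low_memory(log, net, initial_marking, final_marking):
    # Lazy import: pm4py is only needed when alignments are computed.
    import pm4py

    # Assumption: this pm4py version accepts multi_processing and variant_str.
    return pm4py.conformance_diagnostics_alignments(
        log,
        net,
        initial_marking,
        final_marking,
        multi_processing=True,
        variant_str=LOW_MEMORY_VARIANT,
    )


def main(xes_path):
    import pm4py

    # xes_path is a placeholder for the location of the event log.
    log = pm4py.read_xes(xes_path)
    net, initial_marking, final_marking = pm4py.discover_petri_net_inductive(log)
    return align_low_memory(log, net, initial_marking, final_marking)
```

As with any multiprocessing code, main should be invoked from inside an if __name__ == "__main__": block. Whether the lower memory footprint is enough for a log of this size would still need to be confirmed by running it.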

Cheers