pm4py / pm4py-core

Public repository for the PM4Py (Process Mining for Python) project.
https://pm4py.fit.fraunhofer.de
GNU General Public License v3.0

IMf determinism #443

Closed ppfeiff closed 1 year ago

ppfeiff commented 1 year ago

Hi, we have found that the inductive miner implemented in pm4py discovers different models for the very same input, depending on the Python version. Furthermore, it does not seem to be completely deterministic: using the same Python version, one can get different results from different runs.

We always used the MobIS event log. The pm4py command used to mine the model is the following:

from pm4py import read_xes, discover_petri_net_inductive

log = read_xes("disc_ppa.xes")

pn, im, fm = discover_petri_net_inductive(log, activity_key="concept:name", case_id_key="case:concept:name", timestamp_key="time:timestamp", noise_threshold=0.2)

For example, using Python 3.9 one gets the following Petri net: [image: mobis_inductive_2]

and with Python 3.11 the following one: [image: MobIS_pn_1]

On 3.11 (as well as 3.9), running the discovery multiple times can yield two models that differ slightly in their silent transitions, e.g. this one in addition to the previous one: [image: MobIS_pn_2]
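The run-to-run comparison described above can be sketched as a small helper that calls a discovery function several times and counts the distinct results by their string form. This is only a sketch: `count_distinct_results` and `is_deterministic` are hypothetical names, not part of pm4py, and the helper only sees differences that occur within a single interpreter process. Differences that appear only between separate script launches would require running the script repeatedly and comparing the saved output.

```python
from collections import Counter
from typing import Any, Callable


def count_distinct_results(discover: Callable[[], Any], runs: int = 10) -> Counter:
    """Call `discover` `runs` times and count the distinct string forms returned."""
    return Counter(str(discover()) for _ in range(runs))


def is_deterministic(discover: Callable[[], Any], runs: int = 10) -> bool:
    """Return True if every call produced the same string form."""
    return len(count_distinct_results(discover, runs)) == 1


if __name__ == "__main__":
    # Stand-in for a real discovery call; with pm4py one might pass something like
    #   lambda: pm4py.discover_process_tree_inductive(log, noise_threshold=0.2)
    # (process trees are assumed here to have a stable string form).
    print(is_deterministic(lambda: "->( 'a', 'b' )", runs=5))
```

Comparing process trees rather than Petri nets is a deliberate choice in this sketch, since the string form of a tree is a compact fingerprint of the discovered model.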

Since we were not able to find an explanation in the original paper, we suspect that this behaviour is not intended. Some more information would be helpful.

Attached the log: disc_ppa.zip

Thank you!

fit-alessandro-berti commented 1 year ago

Dear @ppfeiff

There is indeed nondeterminism in the "activity concurrent" fallthrough. We will provide the possibility to disable such fallthroughs in a future release.
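One common source of run-to-run differences in Python programs in general is iteration over a set of strings, whose order depends on a hash seed that changes per interpreter process unless `PYTHONHASHSEED` is fixed. Whether this is what happens inside the fallthrough is an assumption and is not confirmed anywhere in this thread. The sketch below only illustrates the general effect, using made-up activity names, and shows that sorting removes the dependence on the seed.

```python
import os
import subprocess
import sys

# Illustrative activity names; they are not taken from the MobIS log.
SNIPPET = "print(','.join({'register', 'check', 'approve', 'pay'}))"


def iteration_order(seed: int) -> list:
    """Return the order in which a fresh interpreter iterates over the set."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip().split(",")


if __name__ == "__main__":
    for seed in (1, 2, 3):
        order = iteration_order(seed)
        # The raw order may change with the seed; the sorted order never does.
        print(seed, order, sorted(order))
```

If the cause were of this kind, launching the script with a fixed `PYTHONHASHSEED` would be one way to make separate runs comparable, although it would not make the result independent of the Python version.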

ppfeiff commented 1 year ago

Hi @fit-alessandro-berti, thanks! Is this also the explanation for the differences between Python versions?

fit-alessandro-berti commented 1 year ago

That, or pm4py version differences could also play a role (make sure that everything is running under 2.7.6).

ppfeiff commented 1 year ago

We were testing with pm4py version 2.7.4

fit-alessandro-berti commented 1 year ago

Dear @ppfeiff

This indeed seems to be a problem related to the fallthroughs. You can find how to disable fallthrough detection in the inductive miner in the following example:

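import pm4py
from pm4py.algo.discovery.inductive import algorithm as inductive_miner

if __name__ == "__main__":
    # Load the event log as a legacy log object
    log = pm4py.read_xes("disc_ppa.xes", return_legacy_log_object=True)
    # Discover a process tree with the fallthroughs disabled
    process_tree = inductive_miner.apply(log, parameters={"disable_fallthroughs": True})
    print(process_tree)
    # The length of the string form is a quick fingerprint for comparing runs
    print(len(str(process_tree)))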

ppfeiff commented 1 year ago

Hi,

thanks for implementing! Unfortunately, we still have the same issue. There are still two differences in the Petri nets on pm4py 2.7.7 and Python 3.11. One is shown below, where we once have an AND and once an XOR between t1 and t2.

[images: 1, 2]

In the other case, later in the net, there are two different silent-transition constructs.

fit-alessandro-berti commented 1 year ago

Dear @ppfeiff

It is always going to be nondeterministic with the fallthroughs. If you would like a more stable model, you should disable the fallthroughs of the inductive miner as in the example.

ppfeiff commented 1 year ago

Hi,

we just double-checked it. We definitely have set disable_fallthroughs to True; the Petri net as well as the process tree look a bit different now. But the differences mentioned in the previous post are still there.
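Because the option is passed as a plain string key in a parameters dictionary, a misspelled key is easy to overlook. The following sketch flags keys that are not in an expected list and suggests the closest known spelling. It assumes, without having verified this for pm4py, that unknown keys are ignored rather than rejected. `KNOWN_KEYS` and `find_suspicious_keys` are hypothetical names introduced for illustration, and the key list is not taken from the pm4py source.

```python
import difflib

# Illustrative list of the keys the caller intends to use (an assumption,
# not an authoritative list of inductive miner parameters).
KNOWN_KEYS = {"disable_fallthroughs", "noise_threshold"}


def find_suspicious_keys(parameters: dict, known_keys=KNOWN_KEYS) -> dict:
    """Map each key not in `known_keys` to its closest known key, or None."""
    suspicious = {}
    for key in parameters:
        if key in known_keys:
            continue
        matches = difflib.get_close_matches(
            str(key), sorted(known_keys), n=1, cutoff=0.8
        )
        suspicious[key] = matches[0] if matches else None
    return suspicious


if __name__ == "__main__":
    # A key with a missing letter is reported together with the likely intent.
    print(find_suspicious_keys({"disable_fallthoughs": True}))
```

Running such a check before calling the miner rules out one trivial explanation for an option that appears to have no effect.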

ppfeiff commented 11 months ago

Hi @fit-alessandro-berti, I cannot reopen this issue, but the problem still exists (see the previous post with figures), although the fallthroughs have been disabled.