pm4py / pm4py-core

Public repository for the PM4Py (Process Mining for Python) project.
https://pm4py.fit.fraunhofer.de
GNU General Public License v3.0

alpha miner factory not compatible with data frame #86

Closed s-j-v-zelst closed 5 years ago

s-j-v-zelst commented 5 years ago

The Alpha Miner factory does not handle data frames correctly. When a data frame is given, the variant (plus versus classical) is ignored, i.e., the classical version is always called.

Javert899 commented 5 years ago

Thanks for signaling!

Javert899 commented 5 years ago

Seemingly debunked too.

I placed a print of "AAAA" inside the Alpha Miner factory, in the branch where the dataframe is checked, and ran the following script:

from pm4py.objects.log.adapters.pandas import csv_import_adapter
from pm4py.algo.discovery.alpha import factory as alpha_miner

df = csv_import_adapter.import_dataframe_from_path("C:\\running-example.csv")
print(type(df))
alpha_miner.apply(df)

I got:

<class 'pandas.core.frame.DataFrame'>
AAAA

s-j-v-zelst commented 5 years ago

That is not the problem I was referring to. The problem is the following:

if isinstance(log, pandas.core.frame.DataFrame):
    dfg = df_statistics.get_dfg_graph(log, case_id_glue=parameters[pmutil.constants.PARAMETER_CONSTANT_CASEID_KEY],
                                      activity_key=parameters[pmutil.constants.PARAMETER_CONSTANT_ACTIVITY_KEY],
                                      timestamp_key=parameters[pmutil.constants.PARAMETER_CONSTANT_TIMESTAMP_KEY])
    return VERSIONS_DFG[variant](dfg, parameters=parameters)

When a dataframe is given, the factory uses the DFG-only version, for which the PLUS variant is undefined.

If this is for performance reasons, I propose the following: when we have a dataframe and the version is classic, use the DFG-only version; if the version is PLUS, use a converter and then route to the PLUS version.
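
A minimal sketch of that routing, under loud assumptions: it does not use the pm4py API. `FakeDataFrame`, `dataframe_to_dfg`, `dataframe_to_log`, and the three `alpha_*` functions are hypothetical stand-ins (the miners only return a label plus their input), and the converter is assumed to turn the dataframe into a list of traces. Only the dispatch logic in `apply` is the point.

```python
# Sketch only: every name below is hypothetical and not part of pm4py.
CLASSIC = "classic"
PLUS = "plus"


class FakeDataFrame:
    """Stand-in for pandas.DataFrame so the sketch runs without pandas."""

    def __init__(self, rows):
        # rows: list of (case_id, activity) tuples, assumed time-ordered per case
        self.rows = rows


def dataframe_to_dfg(df):
    """Count directly-follows pairs within each case (cheap path, no log built)."""
    last_activity = {}
    dfg = {}
    for case_id, activity in df.rows:
        if case_id in last_activity:
            pair = (last_activity[case_id], activity)
            dfg[pair] = dfg.get(pair, 0) + 1
        last_activity[case_id] = activity
    return dfg


def dataframe_to_log(df):
    """Hypothetical converter: group rows into one trace per case."""
    traces = {}
    for case_id, activity in df.rows:
        traces.setdefault(case_id, []).append(activity)
    return list(traces.values())


# Placeholder miners: they return a label and their input instead of a Petri net.
def alpha_classic_from_dfg(dfg):
    return ("classic-dfg", dfg)


def alpha_classic_from_log(log):
    return ("classic-log", log)


def alpha_plus_from_log(log):
    return ("plus-log", log)


VERSIONS_LOG = {CLASSIC: alpha_classic_from_log, PLUS: alpha_plus_from_log}
VERSIONS_DFG = {CLASSIC: alpha_classic_from_dfg}  # no DFG-only PLUS variant


def apply(log, variant=CLASSIC):
    """Route a dataframe to the DFG-only miner when one exists for the variant;
    otherwise convert it to a log and use the log-based miner."""
    if isinstance(log, FakeDataFrame):
        if variant in VERSIONS_DFG:
            return VERSIONS_DFG[variant](dataframe_to_dfg(log))
        log = dataframe_to_log(log)
    return VERSIONS_LOG[variant](log)
```

With this shape the classic variant keeps the fast DFG path for dataframes, while PLUS pays the conversion cost only when it is actually requested.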