pm4py / pm4py-core

Public repository for the PM4Py (Process Mining for Python) project.
https://pm4py.fit.fraunhofer.de
GNU General Public License v3.0

convert_to_event_log: "the case ID column should be of type string" #370

Closed excubo-jg closed 1 year ago

excubo-jg commented 1 year ago

I am getting the above error after upgrading to 2.3.4 from 2.2.29 when calling pm4py.convert_to_event_log(dataframe)

The first column in the dataframe is of type 'int', which was not a problem with 2.2.29. Changing the type to 'str' with .astype(str) does not fix the problem.

fit-alessandro-berti commented 1 year ago

Dear @excubo-jg

Since the Pandas operations are not in-place, you should operate as in the following example to ensure the string type:

    import pandas as pd
    import pm4py

    dataframe = pd.read_csv("tests/input_data/running-example.csv")
    dataframe["time:timestamp"] = pd.to_datetime(dataframe["time:timestamp"])
    dataframe["case:concept:name"] = dataframe["case:concept:name"].astype(str)
    log = pm4py.convert_to_event_log(dataframe)
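The point about operations not being in-place can be checked with pandas alone. The following is a minimal sketch, assuming a small made-up dataframe (the values are illustrative and not taken from running-example.csv); it shows that calling .astype(str) without assigning the result leaves the column unchanged.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype, is_string_dtype

# Hypothetical toy data; only the column name follows the convention used in this thread.
df = pd.DataFrame({"case:concept:name": [1, 1, 2]})

# astype returns a new Series; without an assignment the dataframe is unchanged.
df["case:concept:name"].astype(str)
still_int = is_integer_dtype(df["case:concept:name"])

# Assigning the result back is what actually changes the column.
df["case:concept:name"] = df["case:concept:name"].astype(str)
now_str = is_string_dtype(df["case:concept:name"])
first_value = df["case:concept:name"].iloc[0]
```

After the assignment, the column holds strings and should satisfy the check that raised the error.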

excubo-jg commented 1 year ago

Dear Alessandro, many thanks for coming back so swiftly. The error message threw me a curveball: I have another column labelled ID and converted that one to str instead. Ouch. I'd appreciate it if the error message also referred to the column name "case:concept:name".

PS: For the past couple of days I have not been able to reach pm4py.fit.fraunhofer.de; the site seems to be down.
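Until the error message names the column, a check on the user's side can make the same mistake easier to spot. The following is only a sketch: check_case_id_is_string is a hypothetical helper, not part of pm4py, and it assumes the case ID lives in the "case:concept:name" column as in the example earlier in this thread.

```python
import pandas as pd

CASE_ID_COLUMN = "case:concept:name"  # column name taken from this thread


def check_case_id_is_string(df, case_id_column=CASE_ID_COLUMN):
    """Hypothetical pre-check (not a pm4py function) that names the offending column."""
    if case_id_column not in df.columns:
        raise KeyError(f"column '{case_id_column}' is missing from the dataframe")
    # Every value must be a Python str, regardless of the reported dtype.
    if not all(isinstance(value, str) for value in df[case_id_column]):
        raise TypeError(
            f"column '{case_id_column}' must contain strings, "
            f"found dtype {df[case_id_column].dtype}; "
            f"assign the result of .astype(str) back to this column"
        )


# Made-up data reproducing the mix-up: "ID" is a string column, the case ID is not.
example = pd.DataFrame({"ID": ["a", "b"], "case:concept:name": [7, 8]})
try:
    check_case_id_is_string(example)
    message = ""
except TypeError as error:
    message = str(error)
```

Calling the helper before pm4py.convert_to_event_log would point at "case:concept:name" rather than at an unrelated ID column.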

fit-alessandro-berti commented 1 year ago

Dear @excubo-jg, the website is working again. Thanks for reporting it.