pm4py / pm4py-core

Public repository for the PM4Py (Process Mining for Python) project.
https://pm4py.fit.fraunhofer.de
GNU General Public License v3.0

dataset example data #207

Closed (joseberlines closed 3 years ago)

joseberlines commented 3 years ago

Hi, I am just amazed by this library and I am starting to learn it thoroughly. The problem is that there are no substantial datasets to work with as examples.

I looked on Kaggle for event log datasets but found nothing there.

Wouldn't it be a good idea to include in the repo a couple of data frames ready for experimenting with PM4Py?

I am talking about real-size data, i.e. a dataset of a couple of million rows or so.

Or could you add some information about where to download them? Thanks.

Erinaceida commented 3 years ago

You can use the MIMIC database. You need to go through an application process, and the data requires a bit of cleaning, but it is an option. https://mimic.physionet.org/

s-j-v-zelst commented 3 years ago

Some real data sets can be found here: https://data.4tu.nl/search?q=BPI

s-j-v-zelst commented 3 years ago

Some other example data sets: https://github.com/pm4py/pm4py-core/tree/release/tests/input_data
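
For anyone landing here later, this is a minimal sketch of how a log downloaded from one of the sources above might be loaded for experimenting. It assumes a recent PM4Py release that provides `pm4py.read_xes` and `pm4py.format_dataframe`, plus pandas for CSV files. The helpers `detect_format` and `load_event_log` are hypothetical names made up for this example, and the default column names are only the standard XES attribute names; a given CSV may well use different ones.

```python
from pathlib import Path


def detect_format(path):
    """Return "xes" or "csv" based on the file extension (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".xes":
        return "xes"
    if suffix == ".csv":
        return "csv"
    raise ValueError(f"Unsupported event log format: {suffix!r}")


def load_event_log(path,
                   case_id="case:concept:name",
                   activity_key="concept:name",
                   timestamp_key="time:timestamp"):
    """Load an event log from an XES or CSV file (hypothetical helper).

    The default column names are assumptions; pass the names that
    actually appear in the CSV being loaded.
    """
    # Imported lazily so detect_format can be used without pm4py installed.
    import pandas as pd
    import pm4py

    if detect_format(path) == "xes":
        return pm4py.read_xes(str(path))

    # CSV: read with pandas, then let PM4Py map the case, activity and
    # timestamp columns onto the names its algorithms expect.
    df = pd.read_csv(path)
    return pm4py.format_dataframe(
        df,
        case_id=case_id,
        activity_key=activity_key,
        timestamp_key=timestamp_key,
    )
```

Calling `load_event_log("some-log.xes")` on a downloaded file (the file name is a placeholder) should then give an object that can be passed on to the discovery functions.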