pm4py / pm4py-core

Public repository for the PM4Py (Process Mining for Python) project.
https://pm4py.fit.fraunhofer.de
GNU General Public License v3.0

[Feature] XES-YAML - XES format extension #472

Open ondrisvu opened 6 months ago

ondrisvu commented 6 months ago

Description: This PR introduces XES-YAML functionality, an extension of the supported XES log formats. Backward compatibility with pm4py’s existing XES importer/exporter is preserved. The implementation was developed as part of a Bachelor's thesis titled "More Efficient eXtensible Event Stream-XES-YAML-Format" at TUM.



Key Additions:

  1. XES-YAML Import/Export
    • read_yaml() function to parse XES-YAML files into pm4py’s native XES structure
    • write_yaml() function to serialize data into standards-compliant XES-YAML
  2. Illustrative Example
    • a sample event log in the XES-YAML log format, demonstrating:
      • representation of log, trace, event, and attribute structures
      • the sample consists of snippets from real-life .xes.yaml event logs
  3. Comprehensive Test Suite
    • conversion tests (correctness assurance), covering:
      • XES-XML to XES-YAML
      • XES-YAML to XES-XML
      • These conversion tests check data integrity and format preservation in both directions.
    • roundtrip tests
      • XES-YAML to XES-YAML
      • Roundtrip tests check that a log is consistent before and after export.
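
For readers unfamiliar with the roundtrip pattern listed above, here is a minimal sketch of how such a check can be structured. It is not the PR's test code: the signatures of read_yaml() and write_yaml() are not shown in this thread, so the sketch takes the writer and reader as parameters and uses hypothetical JSON-based stand-ins (write_stub, read_stub) so that it runs with the standard library only. The dictionary layout of sample_log is likewise an illustrative assumption, not the XES-YAML schema.

```python
import json
import tempfile
from pathlib import Path

# Illustrative log as plain Python data: log -> traces -> events -> attributes.
# This layout is an assumption for the sketch, not the PR's actual schema.
sample_log = {
    "attributes": {"concept:name": "demo-log"},
    "traces": [
        {
            "attributes": {"concept:name": "case-1"},
            "events": [
                {"concept:name": "register",
                 "time:timestamp": "2024-01-01T09:00:00+00:00"},
                {"concept:name": "approve",
                 "time:timestamp": "2024-01-01T10:30:00+00:00"},
            ],
        }
    ],
}


def roundtrip(log, write_fn, read_fn, suffix):
    """Write `log` to a temporary file with write_fn, then read it back with read_fn."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / f"log{suffix}"
        write_fn(log, path)
        return read_fn(path)


# Hypothetical stand-ins for the PR's writer/reader. JSON is used here only
# so the sketch has no third-party dependencies.
def write_stub(log, path):
    path.write_text(json.dumps(log), encoding="utf-8")


def read_stub(path):
    return json.loads(path.read_text(encoding="utf-8"))


# The roundtrip property: the log read back equals the log that was written.
restored = roundtrip(sample_log, write_stub, read_stub, ".json")
assert restored == sample_log
```

In an actual test, the stubs would be replaced by the PR's writer and reader (adapted to whatever arguments they really take), and the comparison would be done on pm4py's log objects rather than on plain dictionaries.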

Backward Compatibility Focus:

Used libraries:

fit-alessandro-berti commented 6 months ago

Thanks. From what I understood, this contribution is going to be part of a larger set of PRs.

We would need to finalize a "contribution level agreement" before merging, as the contribution is non-trivial.