filter_variants_top_k relies on the event_log dataframe being sorted by case and timestamp (see pm4py.objects.log.util.pandas_numpy_variants.apply(), which is called to retrieve variants to filter) but this is not documented anywhere.
I am not sure if this affects other filters. I am also not sure if this behaviour is intended (should be documented) or not (should be fixed).
Yes. To compute variants more efficiently, we need the events of the same case to occupy consecutive rows in the dataframe, in timestamp order. So the behavior is intended.
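For anyone hitting this with an unsorted dataframe, a minimal workaround sketch is to sort by case and timestamp before filtering. The column names below are the usual XES-style defaults and are an assumption, as is the helper name sort_event_log, which is hypothetical and not part of pm4py. The pm4py call is left as a comment so the snippet runs with pandas alone.

```python
import pandas as pd

# Assumed column names (XES-style defaults); adjust them to your dataframe.
CASE_COL = "case:concept:name"
ACTIVITY_COL = "concept:name"
TIMESTAMP_COL = "time:timestamp"


def sort_event_log(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a copy with each case's events in
    consecutive rows, ordered by timestamp within the case."""
    return df.sort_values([CASE_COL, TIMESTAMP_COL]).reset_index(drop=True)


# Toy log in which the events of cases "A" and "B" are interleaved.
event_log = pd.DataFrame({
    CASE_COL: ["A", "B", "A", "B"],
    ACTIVITY_COL: ["register", "register", "approve", "reject"],
    TIMESTAMP_COL: pd.to_datetime([
        "2024-01-01 09:00", "2024-01-01 09:05",
        "2024-01-01 10:00", "2024-01-01 10:30",
    ]),
})

sorted_log = sort_event_log(event_log)

# The sorted dataframe can then be passed to the filter, for example:
# import pm4py
# filtered = pm4py.filter_variants_top_k(sorted_log, 1)
```

Sorting returns a new dataframe, so the original event_log is left unchanged.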