Closed rvandewater closed 1 year ago
@mlondschien it seems after testing that the order is preserved at this step (also owing to the sort=False). Let me know if your experience is different.
IIUC, sort=False only stops pandas from sorting the groups by the groupby keys; it does not affect the row order inside each group. My concern was about the order within the groups, since you are extracting the "last" element via .last(). Is the "last" row within a group always the row with maximal charttime?
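To make the question concrete, here is a minimal sketch with made-up data. The column names stay_id and hr are hypothetical; only charttime comes from the discussion above, and this is not the YAIB code itself. It illustrates that .last() follows the row order inside each group, so it matches the maximal charttime only when the frame is already sorted by time:

```python
import pandas as pd

# Toy frame; "stay_id" and "hr" are hypothetical names, rows are deliberately out of time order.
df = pd.DataFrame(
    {
        "stay_id": [1, 1, 1, 2, 2],
        "charttime": [0, 2, 1, 5, 3],
        "hr": [80, 90, 85, 70, 75],
    }
)

# sort=False keeps groups in order of first appearance; rows inside a group keep their input order.
last_unsorted = df.groupby("stay_id", sort=False).last()

# Sorting by time first makes the last row of each group the one with maximal charttime.
last_sorted = (
    df.sort_values(["stay_id", "charttime"])
    .groupby("stay_id", sort=False)
    .last()
)

print(last_unsorted)
print(last_sorted)
```

Note also that .last() takes the last non-null value per column by default, so with missing values the result may mix values from different rows.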
The features_df should be ordered by time, yes. There might be a better way of doing it, but with the following command, it extracts the last row, as you can see when using count historical feature generation:
train -d demo_data/mortality24/mimic_demo -t BinaryClassification --log-dir ../yaib_logs/mortality --tune -m LGBMClassifier -s 1111 --checkpoint test
Perhaps I could put a check in there to ensure that time is increasing within each group before the time column is removed.
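Such a check could look roughly like the sketch below. The helper name assert_time_increasing and the column names in the example are hypothetical, not existing YAIB code; it only assumes a pandas DataFrame with one group column and one time column:

```python
import pandas as pd


def assert_time_increasing(df: pd.DataFrame, group_col: str, time_col: str) -> None:
    """Raise ValueError if time_col is not non-decreasing within every group (hypothetical helper)."""
    is_increasing = df.groupby(group_col, sort=False)[time_col].apply(
        lambda s: s.is_monotonic_increasing
    )
    offending = is_increasing[~is_increasing].index.tolist()
    if offending:
        raise ValueError(f"{time_col} is not increasing within groups: {offending}")


# Made-up example data; "stay_id" is an assumed group column name.
ordered = pd.DataFrame({"stay_id": [1, 1, 2, 2], "charttime": [0, 1, 0, 3]})
shuffled = pd.DataFrame({"stay_id": [1, 1, 2, 2], "charttime": [1, 0, 0, 3]})

assert_time_increasing(ordered, "stay_id", "charttime")  # passes silently
```

Calling it on the shuffled frame raises a ValueError naming group 1. is_monotonic_increasing allows ties, so repeated timestamps within a group would still pass.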
Time order should be preserved for traditional ML methods: https://github.com/rvandewater/YAIB/blob/8a56504b8be2ce05da4fedac8116084779cabd56/icu_benchmarks/data/loader.py#L147