Open BSchilperoort opened 2 years ago
Based on the discussion in issue #71, we will only provide an iterator that lets the user walk through all the splits. This gives them the flexibility to perform RGDR (or even a complete ML workflow) per split. We can discuss further whether we need a function for "grouping over splits", but at the least we can provide a notebook that shows this as a use case.
Due to computational limits (applying DBSCAN to every individual train/test split might not be viable), we want to allow users to group splits in RGDR before calculating the DBSCAN clusters.
To do this we need to go through the following steps:
np.any
cluster_labels[~split_mask] = 0.0
This way we end up with clusters for each split, with aligned split labels.
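For the notebook, a minimal sketch of the grouping idea could look like the code below. It is only an illustration under assumptions, not the RGDR implementation: the function name `cluster_grouped_splits`, the array layout (one boolean significance mask per split plus a shared coordinate array) and the `eps`/`min_samples` values are all hypothetical. The only parts taken from the steps above are combining the masks with `np.any` and zeroing the labels outside each split's own mask.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def cluster_grouped_splits(coords, split_masks, eps=1.5, min_samples=2):
    """Run DBSCAN once on the union of all split masks, then mask per split.

    Hypothetical helper, not part of any existing API.
    coords:      (n_points, n_dims) array of point coordinates.
    split_masks: (n_splits, n_points) boolean array, True where a point is
                 significant in that split.
    Returns a (n_splits, n_points) float array of cluster labels, where 0
    means "not in a cluster for this split".
    """
    split_masks = np.asarray(split_masks, dtype=bool)

    # Group over splits: a point is kept if it is significant in any split.
    combined_mask = np.any(split_masks, axis=0)

    # Cluster only once, on the grouped points.
    all_labels = np.zeros(coords.shape[0], dtype=float)
    if combined_mask.any():
        labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(
            coords[combined_mask]
        )
        # DBSCAN marks noise as -1; shift so that noise becomes 0.
        all_labels[combined_mask] = labels + 1

    # Per split, drop the points that are not significant in that split.
    per_split = []
    for split_mask in split_masks:
        cluster_labels = all_labels.copy()
        cluster_labels[~split_mask] = 0.0
        per_split.append(cluster_labels)
    return np.stack(per_split)


# Toy data: two well-separated groups of three points each.
coords = np.array(
    [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [10.0, 0.0], [11.0, 0.0], [12.0, 0.0]]
)
split_masks = np.array(
    [
        [True, True, False, True, True, False],
        [False, True, True, False, True, True],
    ]
)
result = cluster_grouped_splits(coords, split_masks)
print(result)
```

Because the clustering runs only once on the grouped mask, the same physical cluster gets the same label in every split, which is what makes the labels comparable across splits. The trade-off is that a cluster's extent is defined by the union of the splits rather than by each split's own data.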