Closed BSchilperoort closed 1 year ago
SonarCloud Quality Gate failed.
CI seems borked.
https://www.urbandictionary.com/define.php?term=borked
Cool, didn't know that word ^^
This PR implements label alignment over splits into RGDR. The label alignment is aimed at giving similar clusters over different splits the same name, while not changing the actual data (avoiding any train-test leakage).
An example of the final result can be visualized using the following plot:
Label alignment is performed by the user as follows:
`s2spy.rgdr.label_alignment.rename_labels(rgdrs, clustered_data)`
Where `rgdrs` is a list of RGDR objects, and `clustered_data` is data that has been clustered using those RGDR objects. This can be both the train data as well as the test data.

A new notebook has been added, `example_label_alignment.ipynb`, which walks through the train-test splitting as well as the label alignment steps.
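
For readers who want a feel for what "giving similar clusters over different splits the same name" could look like, here is a minimal, self-contained sketch. It is not the code in this PR and does not use s2spy: `align_labels` is a hypothetical helper, the cluster maps are plain NumPy integer arrays with 0 meaning "no cluster", and matching by largest cell overlap against the first split is an assumption made for illustration only. Only the label names change; which cells belong to a cluster stays untouched.

```python
import numpy as np


def align_labels(label_maps):
    """Rename cluster labels so overlapping clusters share a name across splits.

    label_maps: list of integer arrays of equal shape, one per split;
    0 means "no cluster". Hypothetical helper, not part of s2spy.
    """
    reference = label_maps[0]
    # Labels above this value are guaranteed to be unused in every split.
    next_free = max(int(m.max()) for m in label_maps) + 1
    aligned = [reference.copy()]

    for labels in label_maps[1:]:
        # Count overlapping cells for every (split label, reference label) pair.
        pairs = []
        for lab in np.unique(labels[labels != 0]):
            for ref in np.unique(reference[reference != 0]):
                overlap = int(np.sum((labels == lab) & (reference == ref)))
                if overlap > 0:
                    pairs.append((overlap, int(lab), int(ref)))

        # Greedy one-to-one matching, largest overlap first.
        mapping, used_refs = {}, set()
        for overlap, lab, ref in sorted(pairs, key=lambda p: -p[0]):
            if lab not in mapping and ref not in used_refs:
                mapping[lab] = ref
                used_refs.add(ref)

        # Clusters without a match get a fresh label, so names never collide.
        out = np.zeros_like(labels)
        for lab in np.unique(labels[labels != 0]):
            lab = int(lab)
            if lab not in mapping:
                mapping[lab] = next_free
                next_free += 1
            out[labels == lab] = mapping[lab]
        aligned.append(out)

    return aligned


# Two splits that found roughly the same regions but named them differently.
split_a = np.array([[1, 1, 0, 0],
                    [1, 1, 0, 2],
                    [0, 0, 2, 2]])
split_b = np.array([[2, 2, 0, 0],
                    [2, 0, 0, 1],
                    [0, 0, 1, 1]])

aligned = align_labels([split_a, split_b])
print(aligned[1])
```

In this toy case the second split keeps its cluster shapes, but its labels 1 and 2 are swapped so that they match the names used in the first split.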