Closed iliya-b closed 1 month ago
@vs9h I've fixed the architectural issues with the Tane and PFDTane algorithms, as you suggested in PR #300.
@vs9h I've fixed the issues with this PR. You mentioned another PR, #396, but that PR is still a draft; it introduces a few performance enhancements to the algorithm and does not affect the architecture. The current PR blocks some other PRs, which is why I've kept only the changes related to this PR (refactoring) for now. What do you think?
Also, split the commits into at least two (tests in a separate commit).
@vs9h I've fixed these issues.
Generalize Tane and PFDTane, add additional tests.
To check whether the refactoring caused any performance loss, the following experiments were performed. The discovery task was run as
cli.py --task=afd --algo=tane --error=0.05 --table=...
with both the new and the original TANE implementations. The following heavy datasets were used: EpicMeds.csv, adult.csv, EpicVitals.csv. The following list shows the measured running times of the old and new algorithms, respectively (95% confidence intervals over 10 iterations):