Closed FelixNeutatz closed 5 years ago
Hi Felix,
Thanks for your good comments.
Again, thank you for the feedback.
Kind regards, Mohammad
Sounds great :)
I had another idea: It would also be great to provide optimal parameter configurations for well-known datasets, such as Hospital, Flights, ..., for each supported tool.
E.g., the corresponding functional dependencies for Nadeef on Hospital, or the best-performing parameters for each dBoost algorithm ...
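A minimal sketch of how such a registry could look (dataset keys, tool names, and all parameter values below are placeholders I made up for illustration, not verified best settings or the library's actual API):

```python
# Hypothetical registry mapping (dataset, tool) pairs to stored
# configurations. Every entry here is an illustrative placeholder;
# real "best" settings would have to be determined experimentally.
BEST_CONFIGS = {
    ("hospital", "nadeef"): {
        # Functional dependencies would be listed here as
        # (determinant, dependent) column pairs, e.g.:
        "functional_dependencies": [("zip", "city")],
    },
    ("flights", "dboost"): {
        "algorithm": "histogram",
        "parameters": ["0.9", "0.01"],
    },
}

def get_config(dataset, tool):
    """Look up a stored configuration, or return None if unknown."""
    return BEST_CONFIGS.get((dataset.lower(), tool.lower()))
```

That way a user could run each tool on a known dataset without re-tuning it from scratch.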
As we discussed, the calculation of effectiveness measures has been added.
Hi,
I really like the idea of this project, and I also have a couple of ideas on how to extend this library.
While working on my thesis, I realized that calculating the F1-score / precision / recall given the ground truth is not always trivial, and it would be great if you could add this functionality to the library.
So, the idea is that instead of providing the library with only the dirty dataset, we additionally provide it with the clean ground truth. If ground truth is available, the library would return all requested metrics, such as F1-score / precision / recall ...
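To make the idea concrete, here is a minimal sketch of a cell-level evaluation (function names and the `(row, column)` cell representation are my own illustration, not the library's actual API):

```python
# Cell-level evaluation sketch: the actual errors are the cells where
# the dirty dataset disagrees with the clean ground truth; detected
# cells are then scored against that set.
# All names are illustrative; the library's real API may differ.

def actual_error_cells(dirty_rows, clean_rows):
    """Return the set of (row, column) cells where dirty and clean differ."""
    errors = set()
    for i, (dirty, clean) in enumerate(zip(dirty_rows, clean_rows)):
        for j, (d, c) in enumerate(zip(dirty, clean)):
            if d != c:
                errors.add((i, j))
    return errors

def evaluate(detected_cells, dirty_rows, clean_rows):
    """Return (precision, recall, f1) for a set of detected error cells."""
    actual = actual_error_cells(dirty_rows, clean_rows)
    detected = set(detected_cells)
    tp = len(detected & actual)  # correctly detected error cells
    precision = tp / len(detected) if detected else 0.0
    recall = tp / len(actual) if actual else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall > 0 else 0.0)
    return precision, recall, f1
```

The non-trivial part is agreeing on the comparison granularity (cell vs. tuple) and on how ties like empty cells are handled, which is exactly why having it in the library once would help.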
Another idea I had is to think about whether to provide a process for hyperparameter optimization for quantitative methods, such as dBoost.
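For example, given ground truth, a simple exhaustive grid search over a detector's parameter space could already go a long way. A sketch (the `detector(params, rows)` and `score_fn(detected, dirty, clean)` interfaces are assumptions for illustration, not existing APIs):

```python
import itertools

def grid_search(detector, param_grid, dirty_rows, clean_rows, score_fn):
    """Try every parameter combination and keep the one with the best F1.

    detector(params, dirty_rows) is assumed to return a set of detected
    (row, column) cells; score_fn(detected, dirty_rows, clean_rows) is
    assumed to return (precision, recall, f1).
    """
    best_params, best_f1 = None, -1.0
    keys = sorted(param_grid)
    for values in itertools.product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        detected = detector(params, dirty_rows)
        _, _, f1 = score_fn(detected, dirty_rows, clean_rows)
        if f1 > best_f1:
            best_params, best_f1 = params, f1
    return best_params, best_f1
```

For dBoost this would sweep things like the algorithm choice and its thresholds; smarter strategies (random search, Bayesian optimization) could be swapped in later behind the same interface.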
It would also be great to have a short tutorial on how to call the library's Python API for each of the supported tools. For example, I do not fully understand how the OpenRefine tool is implemented.
Again, thanks for this great project.
Best regards, Felix