Open SamGG opened 3 years ago
Thanks for pointing this out, Samuel! Your help is greatly appreciated. I'll use your commit in the next update of the package. Please do continue to share and suggest any other improvements you spot!
Thanks for your feedback. I don't have time right now, and I only wanted to try the PPS idea on a dataset. I like this approach and I will come back to it later.
From my point of view, the factor transformation handles the levels of y and yhat independently, which is incorrect: the same class can end up with a different level index in each vector. Could you check the F1 calculation and my commit https://github.com/paulvanderlaken/ppsr/commit/37c96920883138fcb60cf9ca3afe1f3c7ee469f2?
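To illustrate the concern, here is a minimal sketch in plain Python rather than the package's R code. The helpers `encode_independently` and `encode_jointly` are hypothetical names; they only mimic what happens when y and yhat are each converted to integer codes from their own levels versus from a shared set of levels.

```python
def encode_independently(values):
    # Mimics converting a vector to a factor on its own:
    # the codes depend only on the values present in this vector.
    levels = sorted(set(values))
    return [levels.index(v) for v in values]


def encode_jointly(values, levels):
    # Codes are looked up in a level set shared by y and yhat.
    return [levels.index(v) for v in values]


def agreement(a, b):
    # Fraction of positions where the two code vectors match.
    return sum(x == z for x, z in zip(a, b)) / len(a)


y = ["a", "b", "c", "c"]
yhat = ["b", "b", "c", "c"]  # the model never predicts "a"

# Independent encoding: "b" is code 1 in y but code 0 in yhat.
y_ind = encode_independently(y)
yhat_ind = encode_independently(yhat)

# Joint encoding: both vectors share the levels ["a", "b", "c"].
shared_levels = sorted(set(y) | set(yhat))
y_joint = encode_jointly(y, shared_levels)
yhat_joint = encode_jointly(yhat, shared_levels)

true_agreement = agreement(y, yhat)
print(agreement(y_ind, yhat_ind), agreement(y_joint, yhat_joint), true_agreement)
```

With independent codes, the only "match" is the first observation, which is actually a wrong prediction, while the three correct predictions are counted as mismatches. Any metric computed on those codes, F1 included, is then misleading.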
I think there should be a test case for the F1 calculation.
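As a rough idea of what such a test could check, here is a small reference computation in plain Python, not the package's R code. The function names `f1_per_label` and `weighted_f1` are hypothetical, and the support-weighted average is an assumption on my side; I have not confirmed which averaging the package uses. The key point is that the labels are taken from the union of y and yhat, so a class missing from the predictions still counts.

```python
def f1_per_label(y, yhat):
    # Labels come from the union of observed and predicted classes.
    labels = sorted(set(y) | set(yhat))
    pairs = list(zip(y, yhat))
    scores = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is never zero
        # because every label occurs in y or in yhat.
        scores[label] = 2 * tp / (2 * tp + fp + fn)
    return scores


def weighted_f1(y, yhat):
    # Average of per-label F1, weighted by each label's count in y.
    scores = f1_per_label(y, yhat)
    return sum(score * y.count(label) for label, score in scores.items()) / len(y)


y = ["a", "b", "c", "c"]
yhat = ["b", "b", "c", "c"]  # "a" is never predicted

scores = f1_per_label(y, yhat)
overall = weighted_f1(y, yhat)
print(scores, overall)
```

Worked by hand, the per-class scores are 0 for "a", 2/3 for "b", and 1 for "c", which gives a weighted F1 of 2/3. A test in the package could compare its result on the same small vectors against values computed by hand like this.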
The Titanic dataset is interesting for tracking various combinations of variable types. I have no time to work on it now, but it might be included in the package as a demo file. I think there might be a problem with the TicketID variable as it has many levels, but I didn't check how this is handled in the Python code.
Best.