lacava closed 4 years ago
instance still appears in the pandas profiling report
You can see the updated report currently on gh-pages here. Even though everything was deployed correctly, the site is not being built. I think we have reached the repo size limits.
I'll troubleshoot this, but I think we need a long-term plan for reducing the repo size, such as tracking large files with Git LFS.
The target description is wrong; 0 corresponds to promoters. These labels might need to be flipped to match the source.
Did you mean this? Isn't this correct?
description: Positive class indicates a promoter. code: -:1, +:0 (promoter)
In PMLB, 0 indicates a promoter. In the original data, class labels were (+,-), with + indicating promoter. It seems more in line with that encoding to have "1" indicate a promoter. In either case the description in the metadata is wrong: the positive class in PMLB indicates NOT a promoter.
Hmm not sure if I'm missing something completely here...
Currently, in PMLB, 0 = + = promoter (both metadata and data).
In the original data, + = promoter.
I agree it's more conventional to encode 1 as promoter. We can flip that. But I don't think what we have currently is wrong.
The description in metadata.yaml reads "Positive class indicates promoter". While this is true for the source data, for the PMLB data, 0 indicates promoter. So I'm suggesting we update the description.
Ah, I read this "positive" as literal "+". Let's go ahead and recode the target then.
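For reference, the recode could look roughly like the sketch below. This is only an illustration under some assumptions: the label column is taken to be named `target`, the labels are taken to be strictly 0/1, and `flip_binary_target` is a hypothetical helper, not something that exists in the repo. The toy frame stands in for the real dataset.

```python
import pandas as pd


def flip_binary_target(df: pd.DataFrame, target: str = "target") -> pd.DataFrame:
    """Return a copy of df with the 0/1 labels in `target` swapped.

    Hypothetical helper for illustration; the column name is an assumption.
    """
    values = set(df[target].unique())
    # Refuse to touch anything that is not a plain 0/1 encoding.
    if not values <= {0, 1}:
        raise ValueError(f"expected only 0/1 labels, got {sorted(values)}")
    flipped = df.copy()
    # 0 -> 1 and 1 -> 0, so that 1 ends up meaning "promoter".
    flipped[target] = 1 - flipped[target]
    return flipped


# Toy frame with made-up values; before the flip, 0 stands for promoter.
toy = pd.DataFrame({"feature": [10, 20, 30], "target": [0, 1, 0]})
recoded = flip_binary_target(toy)
```

After a flip like this, 1 would indicate a promoter, so the existing "Positive class indicates promoter" description would match the data, and only the code mapping in the metadata would need to change.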