Open ueliwechsler opened 1 year ago
This is an intentional design choice in order support passing tuples of values.
# From https://github.com/invenia/Impute.jl/blob/master/test/imputors/replace.jl#L23
imp = Replace(; values=(DateTime(0), -9999, NaN, ""))
df_table = DataFrame(
:time => [missing, DateTime(2020, 02, 02), DateTime(2121, 12, 12)],
:loc => [12, -5, missing],
:val => [1.5, missing, 3.0],
:desc => ["foo", "bar", missing],
)
df_expected = DataFrame(
:time => [DateTime(0), DateTime(2020, 02, 02), DateTime(2121, 12, 12)],
:loc => [12, -5, -9999],
:val => [1.5, NaN, 3.0],
:desc => ["foo", "bar", ""],
)
Simply adding a type conversion would break this functionality and adding a warning would be very noisy. My only thought is that you could probably use typejoin
and either iteratively/recursively call supertype
to identify the closest match. Though I don't think the added complexity and overhead would be worth it.
Thank you for the explanation. My example was only considering replacing the values for one column. In the more general context of replacing values for more than one column, the design makes sense to me.
Would you mind me creating a PR to update the documentation such that the example shows a case where the values are not replaced and maybe add a caveat box to highlight the behavior?
I was struggle a bit with the
Impute.replace
functionality, as at first I did not realize that replacing0
is not the same as replacing0.0
.See the following minimal example:
I know it is written in the docs
But it does not feel intuitive to me to not replace the value (and not provide a warning), especially if it could be directly converted into the type of column values.
Is there something that could be done to not "silently fail" if the value is not replaced?