Closed mtreske closed 3 years ago
Could you provide a minimal repro of sample frame
or series
?
The following script works.
let sr = series ["A", 0.; "B", nan]
val it : Series<string,float> =
A -> 0
B -> <missing>
sr.FillMissing(0.)
val it : Series<string,float> =
A -> 0
B -> 0
what happens if you call sr.FillMissing(nan) like we do sr.FillMissing(double.NaN) or sr.FillMissing(0d/0d)?
The following is the expected behavior. What is your sample series that does not behave?
series [1, 1.; 2, nan]
val it : Series<int,float> =
1 -> 1
2 -> <missing>
(series [1, 1.; 2, nan]).FillMissing(nan)
val it : Series<int,float> =
1 -> 1
2 -> <missing>
we would expect a behaviour like this. this is like it behaved until Version 2.0.0.
val it : Series<int,float> =
1 -> 1
2 -> <missing>
it.FillMissing(nan)
val it : Series<int,float> =
1 -> 1
2 -> nan
Since Version 2.0.1 it behaves like this
it.FillMissing(nan)
val it : Series<int,float> =
1 -> 1
2 -> <missing>
We are using FillMissing(nan) extensively. So at the moment this looks like a showstopper for us.
i found this https://github.com/fslaborg/Deedle/issues/446 issue where you mentioned that the FillMissing((x) => double.NaN) is the correct way to replace the missing values with NaN. Is this the way it should work?
That was indeed a bug in #446 that was fixed then. According to the original design notes https://fslab.org/Deedle/design.html, nan
should be treated as missing
value.
The current version treats certain values as "missing" values, including Double.NaN (for numeric values) and null (for Nullable<'T> types and reference types). This means that when you create a series from Double.NaN, this is turned into a missing value and the value is skipped when doing aggregation such as Series.sum
I believe there must be a way to solve your problem. Could you share what are you trying to achieve after filling missing with nan
?
for example : our BL apps try to read the series and series[n] throws an MissingValueException when the value is equal to <missing>
.
If you are writing C#, read the docs here https://fslab.org/Deedle/csharpseries.html#Lookup-by-key-and-index. You need to call TryGet
or TryGetAt
for series that has missing values.
this is to much effort. We are not able to do all the changes at the moment. We are counting on FillMissing(nan).
Sorry that your production code relies on an old buggy behavior. As I said before, it's by design that nan
shall be treated as missing value. Using TryGet
or TryGetAt
shall solve your problem.
A minimal change is to have a wrapper in your code to convert all the missing OptionalValue
to nan
. For instance, something like this
let tryGetConvertNan (n:'K) (sr:Series<'K, float>) =
let value = sr.TryGet(n)
if value.HasValue then value.Value else nan
let sr = series [1, 1.; 2, nan]
sr |> tryGetConvertNan 2
i had a look at the design notes. As far as i understand the handling of nan as missing value is based in the BL in some aggregate functionslike series.sum. It seems that handling nan as a missing value was implemented to simplify the BL of these functions. For me there is a difference between nan and <missing>
. The alternative mentioned in the design notes is to support both nan, and <missing>
in the series.sum method without a clear solution...and i would prefer this solution.
The original design notes did offer an alternative opinion about treating nan
differently from missing
. But the current implementation of treating nan
as missing
was a conscious choice if you look at all the unit test cases involving nan
such as these https://github.com/fslaborg/Deedle/blob/master/tests/Deedle.Tests/Series.fs#L40
Please take some time to modify your code to use TryGet
or TryGetAt
to avoid exceptions. That's a common cost of adopting an open source library when meeting breaking changes (in this case, a bug fixed) during upgrading versions. We've all been through that experience a lot. I can hear your pain points.
I'll close this issue as it's by design.
We are actually using Deedle 1.2.5 where we use the FillMissing() series extension method do fill the missing values with double.NaN. Since we move our libs to NetStandard we need to use version 2.x. I tried 2.03 up to 2.3.0 but the missing value will never be replaced with double.NaN.
any ideas?
thx and Regards Manuel