fslaborg / Deedle

Easy to use .NET library for data and time series manipulation and for scientific programming
http://fslab.org/Deedle/
BSD 2-Clause "Simplified" License

Frame.ofRecords does not unwrap Option types #522

Closed 1ndy closed 3 years ago

1ndy commented 3 years ago

I am trying to create a data frame from a sequence of records. The records are created from data in an Excel spreadsheet that may have missing values, so every field in the record is an option type. None values show up as <missing>, as expected.

type observation = {
        name: Option<string>
        age: Option<int>
        weight: Option<float>
        married: Option<bool>
    }

let createDataFrame (sheet: ISheet) =
        seq { for i in 1..sheet.LastRowNum -> sheet.GetRow(i) }
        |> Seq.map createObservation
        |> Frame.ofRecords

When I print out the result, I get rows like this:

[ name => Some(alice); age => Some(32); weight => <missing>; married => Some(False)]
[ name => Some(bob); age => Some(13); weight => Some(110); married => Some(False)]

Fields that do have values show up in the data frame still wrapped in Some, i.e. as option values rather than plain strings, ints, floats, or bools. This makes it impossible to use statistical functions on the data and seems pretty useless in general. Is there a reason someone would want to keep values as option types in the data frame? Why not unwrap these when processing the records and store them as usable types?

1ndy commented 3 years ago

I found the (|Missing|Present|) active pattern which solves my problem.
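
For readers who land here with the same problem, one alternative (not necessarily what was done above) is to unwrap the option fields before the records ever reach Frame.ofRecords. The sketch below is plain F# and does not call Deedle. The plainObservation type and the toPlain helper are hypothetical names introduced for illustration, and the sketch leans on the assumption that Deedle treats nan and null as missing values, which is worth verifying against the Deedle version in use. The married field is left out because a bool has no natural placeholder for "missing".

```fsharp
// Record from the question: every field is optional.
type observation =
    { name: Option<string>
      age: Option<int>
      weight: Option<float>
      married: Option<bool> }

// Hypothetical unwrapped shape (not part of the original post).
type plainObservation =
    { name: string     // null when the source value was None
      age: float       // nan when the source value was None
      weight: float }  // nan when the source value was None

// Replace each option with a plain value or a "missing" placeholder.
let toPlain (o: observation) : plainObservation =
    { name = Option.toObj o.name
      age = o.age |> Option.map float |> Option.defaultValue nan
      weight = o.weight |> Option.defaultValue nan }

// Sample row with a missing weight.
let alice : observation =
    { name = Some "alice"; age = Some 32; weight = None; married = Some false }

let plainAlice = toPlain alice
```

In createDataFrame, this would mean adding `Seq.map toPlain` between `Seq.map createObservation` and `Frame.ofRecords`. The trade-off is that age becomes a float column, which is the price of using nan as the placeholder.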