fslaborg / Deedle

Easy to use .NET library for data and time series manipulation and for scientific programming
http://fslab.org/Deedle/
BSD 2-Clause "Simplified" License
mapValues(f) after pivotTable doesn't yield the same result as using f as operation of pivotTable #352

Closed (Kimserey closed this 6 years ago)

Kimserey commented 8 years ago

Passing id as the op of Frame.pivotTable and then applying Frame.mapValues with an aggregate function (Frame.countRows in my example) does not yield the same result as passing that aggregate function directly as the op of Frame.pivotTable.

This code:

[ "r1" => series [ "c1" => "a"; "c2" => "a"; "label" => "good" ]
  "r2" => series [ "c1" => "a"; "c2" => "b"; "label" => "bad" ] ]
|> Frame.ofRows
|> Frame.pivotTable
    (fun _ c -> c.GetAs<string>("c1"))
    (fun _ c -> c.GetAs<string>("label"))
    id
|> Frame.mapValues Frame.countRows

Gives the unexpected result:

     good                                        bad
a -> Deedle.Frame`2[System.String,System.String] Deedle.Frame`2[System.String,System.String]

While this code:

[ "r1" => series [ "c1" => "a"; "c2" => "a"; "label" => "good" ]
  "r2" => series [ "c1" => "a"; "c2" => "b"; "label" => "bad" ] ]
|> Frame.ofRows
|> Frame.pivotTable
    (fun _ c -> c.GetAs<string>("c1"))
    (fun _ c -> c.GetAs<string>("label"))
    Frame.countRows

Gives the expected result:

     good bad
a -> 1    1

Am I missing anything? Why do the results differ?

Kimserey commented 8 years ago

Explicitly annotating the argument type of the function passed to mapValues fixes it as well:

[ "r1" => series [ "c1" => "a"; "c2" => "a"; "label" => "good" ]
  "r2" => series [ "c1" => "a"; "c2" => "b"; "label" => "bad" ] ]
|> Frame.ofRows
|> Frame.pivotTable
    (fun _ c -> c.GetAs<string>("c1"))
    (fun _ c -> c.GetAs<string>("label"))
    id
|> Frame.mapValues (fun (frame: Frame<string, string>) -> Frame.countRows frame)

This gives the expected result:

     good bad
a -> 1    1
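
For anyone who wants to reproduce this, here is a minimal self-contained script version of the workaround. It is a sketch under some assumptions: it is meant to be run in F# Interactive, the `#r "nuget: Deedle"` reference is an assumption about how the package gets loaded, and `countRowsOfStringFrame` is a hypothetical helper name that does not exist in Deedle. A likely explanation for the difference, not confirmed in this thread, is that Frame.mapValues only transforms values it can convert to the input type of the mapping function. Frame.countRows on its own is generic, so that input type is not pinned to Frame<string, string> and the nested frames may be passed through unchanged.

```fsharp
// Assumption: running in F# Interactive with NuGet access.
#r "nuget: Deedle"
open Deedle

// Same data and pivot as in the snippets above; `id` keeps each
// group as a nested Frame<string, string> value.
let pivoted =
    [ "r1" => series [ "c1" => "a"; "c2" => "a"; "label" => "good" ]
      "r2" => series [ "c1" => "a"; "c2" => "b"; "label" => "bad" ] ]
    |> Frame.ofRows
    |> Frame.pivotTable
        (fun _ c -> c.GetAs<string>("c1"))
        (fun _ c -> c.GetAs<string>("label"))
        id

// Hypothetical helper: the explicit annotation fixes the input type,
// so mapValues knows the values should be treated as Frame<string, string>.
let countRowsOfStringFrame (frame: Frame<string, string>) : int =
    Frame.countRows frame

let counts = pivoted |> Frame.mapValues countRowsOfStringFrame
```

This is equivalent to the annotated lambda above, just with the annotation moved into a named function so it can be reused after other pivots on string-keyed frames.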