bedatadriven / activityinfo-R

ActivityInfo R Language Client
https://www.activityinfo.org/support/docs/R/

arrange() results in a different order of records if done as a data frame or as a remote records object #90

Open nickdickinson opened 1 year ago

nickdickinson commented 1 year ago

`arrange()` on a remote records object is limited to a single field, and it can return records in a different order on the server than `dplyr::arrange()` does on a collected data.frame.

records_df <- getRecords("ceam1x8kq6ikcujg") |> select(ends_with("Name")) |> collect()
records_df %>% filter(`Sector Name` == "Nutrition") %>% arrange(`Organization Name`, `Admin 1 Name`, `Admin 2 Name`) %>% slice_head(n = 2)
# ActivityInfo tibble: Remote form: Projects (ceam1x8kq6ikcujg)
# A tibble:            2 x 5
  `Organization Name`          `Sector Name` `Sub-sector Name`   `Admin 1 Name` `Admin 2 Name`
  <chr>                        <chr>         <chr>               <chr>          <chr>         
1 Save the Children in Myanmar Nutrition     IEC on Infant and ~ Rakhine        Sittwe        
2 World Concern Myanmar        Nutrition     Monitoring Breast ~ Kayin          Hpa-An    

remote_records <- getRecords("ceam1x8kq6ikcujg") |> select(ends_with("Name"))
remote_records %>% arrange(`Organization Name`) %>% filter(`Sector Name` == "Nutrition") %>% slice_head(n = 2)
Adding filter: (c3g7i69kq6jst8k3z.caxmhjxkq6jqe373c == "Nutrition")
# ActivityInfo tibble: Remote form: Projects (ceam1x8kq6ikcujg)
# A tibble:            2 x 5
  `Organization Name`          `Sector Name` `Sub-sector Name`   `Admin 1 Name` `Admin 2 Name`
  <chr>                        <chr>         <chr>               <chr>          <chr>         
1 Save the Children in Myanmar Nutrition     IEC on Infant and ~ Rakhine        Sittwe        
2 World Concern Myanmar        Nutrition     Nutrition Assessme~ Shan (North)   Lashio 
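
One way the two results above can diverge is tie-breaking: sorting on a single field leaves the order of records that share that field's value up to whatever executes the sort. The sketch below reproduces this locally with plain dplyr on a made-up tibble. The column names `org`, `sector` and `admin1` and all values are hypothetical, and the code does not call the ActivityInfo server, so it illustrates the mechanism rather than the server's actual behaviour.

```r
library(dplyr)

# Hypothetical stand-in for the collected form; values are illustrative only.
projects <- tibble(
  org    = c("B", "A", "A", "A"),
  sector = c("Nutrition", "Nutrition", "Health", "Nutrition"),
  admin1 = c("Kayin", "Shan", "Kachin", "Kayin")
)

# Multi-field sort: ties on `org` are broken by `admin1`.
multi <- projects |>
  filter(sector == "Nutrition") |>
  arrange(org, admin1) |>
  slice_head(n = 2)

# Single-field sort: ties on `org` keep their incoming order in local dplyr.
# A remote backend is free to return tied records in any order.
single <- projects |>
  filter(sector == "Nutrition") |>
  arrange(org) |>
  slice_head(n = 2)

print(multi$admin1)
print(single$admin1)
```

Both pipelines return the same two organizations, but the rows within the tied `org` value come back in a different order, which is the kind of difference seen in the second row of the two outputs above.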

nickdickinson commented 1 month ago

Double-check that the documentation clearly explains the order of operations on a lazy data frame when they are executed on the server. The difference comes from how the chained operations are sequenced, which can differ from what the user expects.
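
To make the point about sequencing concrete, here is a minimal local sketch in plain dplyr showing that the same three verbs give different results depending on where `filter()` sits relative to `slice_head()`. The tibble and its columns `org` and `sector` are hypothetical, and nothing here describes how the ActivityInfo server actually reorders or combines the steps. It only shows why the sequence matters and needs to be documented.

```r
library(dplyr)

# Hypothetical records; values are illustrative only.
records <- tibble(
  org    = c("A", "B", "C", "D"),
  sector = c("Health", "Nutrition", "Health", "Nutrition")
)

# Filter first, then sort and take the first two matching records.
filter_first <- records |>
  filter(sector == "Nutrition") |>
  arrange(org) |>
  slice_head(n = 2)

# Sort and take the first two records, then filter what is left.
slice_first <- records |>
  arrange(org) |>
  slice_head(n = 2) |>
  filter(sector == "Nutrition")

print(filter_first$org)
print(slice_first$org)
```

If a lazy backend applies accumulated filters, sorts and row limits in a fixed order rather than in the order written, a user who wrote one of these pipelines could get the result of the other.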