IQSS / dataverse-client-r

R Client for Dataverse Repositories
https://iqss.github.io/dataverse-client-r

Support `vars` option in get_file/get_dataframe #79

Open kuriwaki opened 3 years ago

kuriwaki commented 3 years ago

`vars` should be an argument that subsets the columns of the dataset to pull. However, it currently has no effect: the whole dataset is returned.

library(dataverse)

df_tab_all <-
  get_file_by_name(
    filename = "roster-bulls-1996.tab",
    dataset  = "doi:10.70122/FK2/HXJVJU",
    server   = "demo.dataverse.org"
  )

df_tab_vars <-
  get_file_by_name(
    filename = "roster-bulls-1996.tab",
    dataset  = "doi:10.70122/FK2/HXJVJU",
    server   = "demo.dataverse.org",
    vars = c("number", "player") # only two columns
  )

# the full download should be larger than the two-column subset
stopifnot(object.size(df_tab_all) > object.size(df_tab_vars))
#> Error: object.size(df_tab_all) > object.size(df_tab_vars) is not TRUE

# does it work on get_dataframe?
df_tab_vars <-
  get_dataframe_by_name(
    filename = "roster-bulls-1996.tab",
    dataset  = "doi:10.70122/FK2/HXJVJU",
    server   = "demo.dataverse.org",
    vars = c("number", "player") # only two columns
  )
#> Downloading ingested version of data with readr::read_tsv. To download the original version and remove this message, set original = TRUE.
#> Rows: 15 Columns: 9
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "\t"
#> chr (6): player, position, height, dob, country_birth, college
#> dbl (3): number, weight, experience_years
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
ncol(df_tab_vars)
#> [1] 9

Created on 2022-01-12 by the reprex package (v2.0.1)

EDITED 2022-01-12: reran the reprex with the new version of dataverse, which no longer throws errors.
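
Until `vars` takes effect, one possible workaround is to subset the columns after the download. The following is only a sketch: `select_vars()` is a hypothetical helper, not part of the dataverse package, and the `toy` data frame is a stand-in with placeholder values that borrows three column names from the roster file above.

```r
# Hypothetical helper (not part of the dataverse package): keep only the
# requested columns of a data frame that has already been downloaded.
select_vars <- function(df, vars) {
  # Fail early if a requested column does not exist
  missing_vars <- setdiff(vars, names(df))
  if (length(missing_vars) > 0) {
    stop("Columns not found: ", paste(missing_vars, collapse = ", "))
  }
  # drop = FALSE keeps a data frame even when a single column is requested
  df[, vars, drop = FALSE]
}

# Toy stand-in for the downloaded roster (placeholder values, not real data)
toy <- data.frame(
  number   = c(1, 2, 3),
  player   = c("A", "B", "C"),
  position = c("X", "Y", "Z")
)

toy_vars <- select_vars(toy, c("number", "player"))
ncol(toy_vars)
#> [1] 2
```

This does not reduce what is transferred from the server, so it only addresses the shape of the result, not the download size that a working `vars` would presumably save.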