ropensci / git2rdata

An R package for storing and retrieving data.frames in git repositories.
https://ropensci.github.io/git2rdata/
GNU General Public License v3.0

Problem with special characters in variable names #67

Closed by florisvdh 2 years ago

florisvdh commented 2 years ago

AFAIK this type of variable name is not supported by data.frame(), but it is supported by tibble(). Does git2rdata intend to support it as well?

``` r
library(tibble)
library(git2rdata)
df <- tibble(stat = 1:10,
             `(stat≤5)` = stat <= 5)
df
#> # A tibble: 10 × 2
#>     stat `(stat≤5)`
#>    <int> <lgl>
#>  1     1 TRUE
#>  2     2 TRUE
#>  3     3 TRUE
#>  4     4 TRUE
#>  5     5 TRUE
#>  6     6 FALSE
#>  7     7 FALSE
#>  8     8 FALSE
#>  9     9 FALSE
#> 10    10 FALSE
write_vc(df, "df", sorting = "stat")
#> 6bd44e3ff5dcc579436e7eb1dce3139f7e2c8981 
#>                                 "df.tsv" 
#> 64985ae84c4de890de75969a886e1c1172a87b2f 
#>                                 "df.yml"
read_vc("df")
#> Error: Corrupt data, incorrect header. Expecting: stat   (stat≤5)
```

Created on 2021-09-29 by the reprex package (v2.0.1)

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.1.1 (2021-08-10)
#>  os       Linux Mint 20
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language nl_BE:nl
#>  collate  nl_BE.UTF-8
#>  ctype    nl_BE.UTF-8
#>  tz       Europe/Brussels
#>  date     2021-09-29
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  ! package     * version date       lib source
#>  P assertthat    0.2.1   2019-03-21 [?] CRAN (R 4.1.0)
#>  P cli           3.0.1   2021-07-17 [?] CRAN (R 4.1.1)
#>  P crayon        1.4.1   2021-02-08 [?] CRAN (R 4.1.0)
#>  P digest        0.6.27  2020-10-24 [?] CRAN (R 4.1.0)
#>  P ellipsis      0.3.2   2021-04-29 [?] CRAN (R 4.1.0)
#>  P evaluate      0.14    2019-05-28 [?] CRAN (R 4.1.0)
#>  P fansi         0.5.0   2021-05-25 [?] CRAN (R 4.1.0)
#>  P fastmap       1.1.0   2021-01-25 [?] CRAN (R 4.1.0)
#>  P fs            1.5.0   2020-07-31 [?] CRAN (R 4.1.0)
#>  P git2r         0.28.0  2021-01-10 [?] CRAN (R 4.1.0)
#>  P git2rdata   * 0.3.1   2021-01-21 [?] CRAN (R 4.1.0)
#>  P glue          1.4.2   2020-08-27 [?] CRAN (R 4.1.0)
#>  P highr         0.9     2021-04-16 [?] CRAN (R 4.1.0)
#>  P htmltools     0.5.2   2021-08-25 [?] CRAN (R 4.1.1)
#>  P knitr         1.33    2021-04-24 [?] CRAN (R 4.1.0)
#>  P lifecycle     1.0.0   2021-02-15 [?] CRAN (R 4.1.0)
#>  P magrittr      2.0.1   2020-11-17 [?] CRAN (R 4.1.0)
#>  P pillar        1.6.2   2021-07-29 [?] CRAN (R 4.1.1)
#>  P pkgconfig     2.0.3   2019-09-22 [?] CRAN (R 4.1.0)
#>  P reprex        2.0.1   2021-08-05 [?] CRAN (R 4.1.1)
#>  P rlang         0.4.11  2021-04-30 [?] CRAN (R 4.1.0)
#>  P rmarkdown     2.10    2021-08-06 [?] CRAN (R 4.1.1)
#>  P rstudioapi    0.13    2020-11-12 [?] CRAN (R 4.1.0)
#>  P sessioninfo   1.1.1   2018-11-05 [?] CRAN (R 4.1.0)
#>  P stringi       1.7.4   2021-08-25 [?] CRAN (R 4.1.1)
#>  P stringr       1.4.0   2019-02-10 [?] CRAN (R 4.1.0)
#>  P tibble      * 3.1.4   2021-08-25 [?] CRAN (R 4.1.1)
#>  P utf8          1.2.2   2021-07-24 [?] CRAN (R 4.1.1)
#>  P vctrs         0.3.8   2021-04-29 [?] CRAN (R 4.1.0)
#>  P withr         2.4.2   2021-04-18 [?] CRAN (R 4.1.0)
#>  P xfun          0.25    2021-08-06 [?] CRAN (R 4.1.1)
#>  P yaml          2.2.1   2020-02-01 [?] CRAN (R 4.1.0)
#>
#> [1] /media/floris/DATA/PROJECTS/09685_NatuurlijkMilieu/160 Bewerkingen en resultaat/Repos_en_data/n2khab-mne-design_withref_groundwater/110_design_groundwater/020_design_elaborate/renv/library/R-4.1/x86_64-pc-linux-gnu
#> [2] /tmp/RtmpiADGZi/renv-system-library
#> [3] /usr/lib/R/library
#>
#>  P ── Loaded and on-disk path mismatch.
```

ThierryO commented 2 years ago

read_vc() returns a data.frame. How can we return a data.frame with variable names that are not supported by data.frame?
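
For reference, the name handling in base R's data.frame() depends on its check.names argument: by default the names are passed through make.names(), while check.names = FALSE keeps them as given. The sketch below uses base R only and the column name from the report above; it says nothing about how read_vc() builds its result, and whether git2rdata could or should rely on this is an open assumption.

``` r
# Base R only; not git2rdata code. Column name taken from the report above.
stat <- 1:10

# Default (check.names = TRUE): names are rewritten to syntactic ones
df_default <- data.frame(stat = stat, `(stat≤5)` = stat <= 5)
names(df_default)

# check.names = FALSE: the non-syntactic name is kept as-is
df_raw <- data.frame(stat = stat, `(stat≤5)` = stat <= 5, check.names = FALSE)
names(df_raw)

# Such a column has to be accessed with backticks or [[ ]]
head(df_raw[["(stat≤5)"]])
```

Names kept this way are legal but awkward in base R code, because formulas and `$` access need backticks.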

florisvdh commented 2 years ago

Right, I see. I'll close the issue since it's outside the current scope. It would actually boil down to adding support for tibbles in read_vc(), perhaps by using readr::read_table() instead of read.table(), but that's an extra dependency. Also, such a migration may be rather involved. Probably easier for now not to use special variable names...
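
To illustrate the read.table() side of this, here is a minimal sketch that reads a small tab-separated file with and without check.names = FALSE. It uses a hypothetical temporary file rather than the df.tsv written by write_vc(), and it is not how read_vc() is implemented; it only shows how base read.table() treats a header like the one in the error message.

``` r
# Hypothetical stand-in for a TSV file with a non-syntactic header
tsv <- tempfile(fileext = ".tsv")
writeLines(c("stat\t(stat≤5)", "1\tTRUE", "6\tFALSE"), tsv)

# Default: header names are made syntactic, so they no longer match the file
mangled <- read.table(tsv, header = TRUE, sep = "\t")
names(mangled)

# check.names = FALSE: header names are kept exactly as written
kept <- read.table(tsv, header = TRUE, sep = "\t", check.names = FALSE)
names(kept)

unlink(tsv)
```

Whether the result should then be a data.frame or a tibble is the separate design question raised in the comment above.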

ThierryO commented 2 years ago

For a discussion on using tibble, see https://github.com/ropensci/software-review/issues/263