tidyverse / haven

Read SPSS, Stata and SAS files from R
https://haven.tidyverse.org

Length, Type and other properties are missing when using haven::read_sas #718

Open dgyurko opened 1 year ago

dgyurko commented 1 year ago

When reading any sas7bdat file via haven::read_sas, the only metadata property returned is the "Format".

pyreadstat, a Python library that is also based on ReadStat, has this feature.

For example, the "Length" property is returned as variable_storage_width, a dict whose keys are variable names and whose values are the storage widths.
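For comparison, here is a minimal sketch of how the per-column properties could be collected on the pyreadstat side. It assumes the metadata object exposes column_names, column_names_to_labels, variable_storage_width, readstat_variable_types and original_variable_types, which is my understanding of pyreadstat's metadata container. The helper names collect_column_metadata and read_sas_column_metadata are hypothetical and only for illustration. "Informat" is not covered by this sketch.

```python
def collect_column_metadata(meta):
    """Build one dict of SAS-style properties per column.

    `meta` is assumed to be the metadata object returned by pyreadstat
    (attribute names as understood from its documentation).
    """
    result = {}
    for name in meta.column_names:
        result[name] = {
            # Variable label, or None when the column has no label
            "label": meta.column_names_to_labels.get(name),
            # Storage width, i.e. the SAS "Length" property
            "length": meta.variable_storage_width.get(name),
            # Storage type as reported by ReadStat (e.g. "double", "string")
            "type": meta.readstat_variable_types.get(name),
            # Original format string (e.g. "BEST")
            "format": meta.original_variable_types.get(name),
        }
    return result


def read_sas_column_metadata(path):
    """Read only the metadata of a sas7bdat file and summarise it per column."""
    import pyreadstat  # third-party; imported lazily so the helper above stays standalone

    # metadataonly=True skips loading the data rows
    _, meta = pyreadstat.read_sas7bdat(path, metadataonly=True)
    return collect_column_metadata(meta)
```

Calling read_sas_column_metadata("iris.sas7bdat") on the example file should then give one entry per column with these four properties, which is roughly what this issue asks haven::read_sas to attach as attributes.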

# Download https://github.com/tidyverse/haven/blob/main/inst/examples/iris.sas7bdat

sas <- haven::read_sas(data_file = "iris.sas7bdat")
lapply(sas, attributes)

# Output
# $Sepal_Length
# $Sepal_Length$format.sas
# [1] "BEST"
# 
# 
# $Sepal_Width
# $Sepal_Width$format.sas
# [1] "BEST"
# 
# 
# $Petal_Length
# $Petal_Length$format.sas
# [1] "BEST"
# 
# 
# $Petal_Width
# $Petal_Width$format.sas
# [1] "BEST"
# 
# 
# $Species
# $Species$format.sas
# [1] "$"

Expected result: The SAS metadata properties "Label", "Length", "Type", "Format" and "Informat" are returned as attributes for each column.

Actual result: Only "Format" is returned as an attribute.

kaz462 commented 1 year ago

Thanks for initiating this issue, @dgyurko. The same applies to haven::read_xpt: the metadata properties class, format.sas and label are kept, but length and type are not.

bms63 commented 1 year ago

This would be a very nice update!! We use this in the xportr package, as the pharma industry has to deliver XPTs to Health Authorities. :(