tidyverse / haven

Read SPSS, Stata and SAS files from R
https://haven.tidyverse.org

Unable to import custom missing data values from .dta #763

Closed vas235 closed 1 month ago

vas235 commented 1 month ago

The package currently cannot import the custom missing-data values in the NSCH datasets, which are available in .dta and .sas7bdat formats. The missing-data values are [.l, .n, .m, .d]. They are currently imported as NA, which prevents us from recoding specific categories of missing data. The current workaround is to take the data to another language or software package and re-export it with numeric missing-data values.

gorcha commented 1 month ago

Hi @vas235, thanks for the bug report.

Can you please provide a minimal reprex (reproducible example)? The goal of a reprex is to make it as easy as possible for me to recreate your problem so that I can fix it: please help me help you! If you've never heard of a reprex before, start by reading about the reprex package, including the advice further down the page. Please make sure your reprex is created with the reprex package as it gives nicely formatted output and avoids a number of common pitfalls.

Also, please see the tagged_na() documentation: user-defined missing values from SAS and Stata should be read in using this mechanism. It can be easy to miss if you're not familiar with it.

Thanks!

vas235 commented 1 month ago

Thank you for your help. I completely misunderstood the actual use of the tagged_na() function, and I was also trying to search for the codes as they appear in the census codebooks: [ ".l", ".n", ".m", ".d" ]

Using the function correctly, there is no issue, and I can identify the character versions of the codes: [ "l", "m", "n", "d" ]
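
For anyone who lands here with the same confusion, the usage described above can be sketched roughly as follows. This is a minimal illustration, not taken from the NSCH data: the vector `x` is a hypothetical stand-in for a column imported from a .dta file, built by hand with tagged_na(), and `reason` is an illustrative name. The tags are matched without the leading dot.

```r
library(haven)

# Hypothetical stand-in for an imported Stata column holding the values
# 1, .l, 2, .n, .m, .d (assumption: built by hand, not read from a file).
x <- c(1, tagged_na("l"), 2, tagged_na("n"), tagged_na("m"), tagged_na("d"))

# Base R sees every tagged value as an ordinary missing value.
missing_flags <- is.na(x)

# na_tag() recovers the single-letter tag (no leading dot);
# non-missing values get NA_character_.
tags <- na_tag(x)

# is_tagged_na() tests for one specific category of missing value.
is_l <- is_tagged_na(x, "l")

# Recode each category of missingness into its own label.
reason <- ifelse(is.na(x), na_tag(x), "observed")

print(tags)
print(reason)
```

Searching for ".l" finds nothing because the tag is stored as the bare letter "l"; is.na() alone cannot tell the categories apart, which is why na_tag() or is_tagged_na() is needed for recoding.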