Open kaz462 opened 10 months ago
Dear developer,
I found that the label length limit of <= 40 applies only to XPT version 5; if version = 8,
the label can be up to 256.
Hi both, may I ask a question about data conversion? I am trying to convert an .rda file to .sas7bdat, and it seems that "write_xpt" doesn't work as expected. The created sas7bdat file cannot be opened; SAS always shows "file ... is not a SAS data set".
I saw some discussion about this issue but didn't find a good solution.
What is the recommended method for converting an .rda file to a .sas7bdat file?
It seems that "write_xpt" works well when converting to an .xpt file. Should I first convert the file to xpt format and then change it to a .sas7bdat file using SAS? Are there any potential risks associated with this approach?
Looking forward to learning from your experience. Many thanks!
Hi @botsp, it seems that write_xpt() may only support xpt creation.
write_sas() creates sas7bdat files. Unfortunately the SAS file format is complex and undocumented, so write_sas() is unreliable and in most cases SAS will not read files that it produces. write_xpt() writes files in the open SAS transport format, which has limitations but will be reliably read by SAS.
For sas7bdat, I use the same approach you mentioned: creating the xpt file first in R, then converting it to sas7bdat in SAS.
After converting, I compared the results from write_xpt() with SAS datasets created directly by SAS, and there was no difference except the variable length.
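The R half of that workflow might look like the sketch below. It assumes the haven package is installed; the data frame `adsl`, its columns, and the temporary file paths are made up for illustration and stand in for whatever the real .rda file contains. The SAS step (reading the xpt file and saving it as sas7bdat) is not shown.

```r
library(haven)

# Hypothetical example data standing in for the contents of a real .rda file
rda_path <- file.path(tempdir(), "example.rda")
adsl <- data.frame(USUBJID = c("01-001", "01-002"), AGE = c(34, 58))
save(adsl, file = rda_path)

# load() restores objects under their saved names, so load into a fresh
# environment and pick the object up by the name that load() returns
env <- new.env()
obj_names <- load(rda_path, envir = env)
df <- get(obj_names[1], envir = env)

# Write a version 5 transport file; this is the file handed over to SAS
xpt_path <- file.path(tempdir(), "adsl.xpt")
write_xpt(df, xpt_path, version = 5)

# Read the file back in R as a quick sanity check before moving to SAS
check <- read_xpt(xpt_path)
```

Comparing `check` against `df` before the SAS step catches problems on the R side early; differences such as variable length only show up after the SAS conversion.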
Thanks for your explanation; it gives me a good idea of how to approach SAS data conversion. Thank you!
Hi @kaz462 and @ynsec37,
Thanks for the feedback! This is an issue with our dataset label validation code, and the documentation could be clearer - the dataset label for XPT files is a maximum of 40 bytes rather than characters. Our validation code is currently checking with the default type = "chars" and should be updated to type = "bytes".
@ynsec37 note that the XPT documentation shared above refers to the variable label length. Although variable labels can be longer in version 8, the maximum dataset label length is still 40 bytes.
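The difference between the two checks can be seen with base R's nchar(). This is a minimal sketch, not the package's actual validation code: the label is a made-up string of 40 repeated CJK characters, and `label_fits_xpt()` is a hypothetical helper written only for this example.

```r
# Hypothetical dataset label: one CJK character repeated 40 times
label <- enc2utf8(strrep("\u6570", 40))

nchar(label, type = "chars")  # 40
nchar(label, type = "bytes")  # 120 in UTF-8 (3 bytes per character)

# Hypothetical helper: TRUE when the label fits in the 40-byte limit
label_fits_xpt <- function(label, max_bytes = 40L) {
  nchar(enc2utf8(label), type = "bytes") <= max_bytes
}

label_fits_xpt(label)             # FALSE: passes a chars check, fails on bytes
label_fits_xpt("Adverse Events")  # TRUE: ASCII uses 1 byte per character
```

A check with the default type = "chars" would accept this label, even though it needs three times the available space once encoded.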
From the write_xpt documentation:
The following dataset label in Chinese has 40 characters and was truncated after write_xpt (thanks @siye6 for the original example in https://github.com/atorus-research/xportr/pull/194).
Created on 2023-12-13 with reprex v2.0.2