In our research group, we store a significant amount of data in SPSS's SAV format. The format is advantageous due to its comprehensive metadata handling capabilities, which include variable labels, value labels, missing value definitions, and multiple response sets, among others. However, this format poses significant challenges when we attempt to utilize other programming languages or tools for data analysis and manipulation.
To increase interoperability and efficiency within our group, we're exploring open-source, platform/language-agnostic formats similar to the SAV format, specifically those capable of storing complex metadata.
We have been testing Frictionless, but we find the standard somewhat lacking in support for the complex metadata available in SPSS's SAV format (and in SAS's sas7bdat/sas7bcat formats as well).
Currently, we manually extract all this information and store it in CSV files, but this seems like a task that a framework like Frictionless should handle seamlessly.
We would greatly appreciate it if Frictionless could enhance support for reading all the metadata available in the SAV format. In particular, it would be beneficial if it could apply the formatting options specified in the metadata to the data as it is read.
This includes, but is not limited to:
- Reading value labels in addition to variable labels (field descriptions); value labels are important for categorical variables, e.g., questionnaire responses
- Formatting numeric values according to their specified formatting options when reading (e.g., reading numeric values with no decimals as integers and not floats)
- Ensuring dates are returned as dates and not as epoch time
- Being able to seamlessly "apply" value labels when displaying data
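To make the requested behaviour concrete, here is a minimal sketch in plain Python of what we do by hand today. It is not Frictionless API: the `FIELD_META` layout and the `convert_value` / `convert_row` helpers are hypothetical names made up for illustration. The only SPSS-specific facts it relies on are that SPSS stores dates as seconds since 1582-10-14 and that formats such as `F3.0` declare zero decimals.

```python
from datetime import datetime, timedelta

# SPSS stores date values as seconds since midnight, 14 October 1582.
SPSS_EPOCH = datetime(1582, 10, 14)

# Hypothetical per-field metadata, as we might extract it from a SAV file.
FIELD_META = {
    "age": {"format": "F3.0"},
    "q1": {"format": "F1.0", "value_labels": {1: "Agree", 2: "Disagree"}},
    "visit": {"format": "DATE11"},
}

DATE_PREFIXES = ("DATE", "ADATE", "EDATE", "SDATE")


def convert_value(value, meta, apply_labels=False):
    """Convert one raw value using its (assumed) metadata entry."""
    if value is None:
        return None
    fmt = meta.get("format", "")
    # Date formats: turn the raw second count into a calendar date.
    # DATETIME formats are deliberately left out of this sketch.
    if fmt.startswith(DATE_PREFIXES) and not fmt.startswith("DATETIME"):
        return (SPSS_EPOCH + timedelta(seconds=value)).date()
    # Numeric formats with zero decimals: return an int, not a float.
    if fmt.startswith("F") and fmt.endswith(".0"):
        value = int(value)
    # Optionally replace the code with its value label for display.
    if apply_labels:
        return meta.get("value_labels", {}).get(value, value)
    return value


def convert_row(row, field_meta, apply_labels=False):
    """Apply convert_value to every field of a row (a dict)."""
    return {
        name: convert_value(value, field_meta.get(name, {}), apply_labels)
        for name, value in row.items()
    }


# A raw row as a typical reader returns it: floats everywhere.
raw = {
    "age": 42.0,
    "q1": 2.0,
    "visit": (datetime(2020, 1, 1) - SPSS_EPOCH).total_seconds(),
}

print(convert_row(raw, FIELD_META))
print(convert_row(raw, FIELD_META, apply_labels=True))
```

Keeping label application behind a flag reflects how we use the data: the codes are needed for analysis, while the labels are only wanted for display.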
This link provides a comprehensive guide to the different types of metadata possible to specify in the "Variable view" in SPSS.
We believe that such enhancements would not only benefit our research group but also other users who work with similar data formats. We look forward to seeing these improvements in future versions of Frictionless.