Open elimillera opened 8 months ago
I am adding an example for more clarification.
The XPT requirements and those from regulatory agencies can differ. For instance, let's examine the distinct requirements for dataset and variable labels:
| XPT | FDA | NMPA |
|---|---|---|
| No restriction on characters; maximum length is 40 bytes. | Variable names, as well as variable and dataset labels, should include American Standard Code for Information Interchange (ASCII) text codes only. Maximum length in characters = 40. | For eSubmission in China, one of the requirements is to translate the foreign-language data package (e.g., English) to Chinese. Variable labels, dataset labels, MedDRA, WHO Drug terms, primary endpoint-related code lists, etc., need to be translated from English to Chinese. |
Currently, in `df_label.R`, the function fails if the label does not meet the following requirements:
label_len <- nchar(label)
if (label_len > 40) {
  abort("Length of dataset label must be 40 characters or less.")
}
if (stringr::str_detect(label, "[^[:ascii:]]")) {
  abort("`label` cannot contain any non-ASCII, symbol, or special characters.")
}
The first check represents an XPT requirement, while the second one aligns with FDA specifications. I suggest moving agency-specific checks to `xpt_validate` so that they can be ignored if necessary.
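To illustrate the split, here is a minimal sketch in base R. The helper names `check_label_xpt()` and `check_label_fda()` are hypothetical and not part of the package. The sketch also assumes the XPT limit is measured in bytes, as stated in the table above, whereas the current code counts characters.

```r
# Hypothetical helpers for illustration only; not the package's actual API.

# XPT format rule (always enforced): a label may occupy at most 40 bytes.
check_label_xpt <- function(label) {
  if (nchar(label, type = "bytes") > 40) {
    stop("Label must be 40 bytes or less.", call. = FALSE)
  }
  invisible(label)
}

# FDA rule (optional): labels must contain ASCII characters only.
# Returns a character vector of findings instead of stopping,
# so the caller decides whether to enforce or ignore them.
check_label_fda <- function(label) {
  code_points <- utf8ToInt(enc2utf8(label))
  if (any(code_points > 127L)) {
    "Label contains non-ASCII characters (FDA requirement)."
  } else {
    character(0)
  }
}

# A translated (Chinese) label of 4 characters is 12 bytes in UTF-8:
# it satisfies the XPT rule but is flagged by the FDA rule.
zh_label <- "\u4e0d\u826f\u4e8b\u4ef6"
check_label_xpt(zh_label)
fda_findings <- check_label_fda(zh_label)
```

With this separation, a submission intended for NMPA could skip the FDA findings while still being held to the XPT length limit.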
Feature Idea
This was brought up in the Dec 12, 2023 meeting. There are different rules for different agencies. For example, FDA doesn't allow underscores or non-ASCII characters in filenames. We could add a flag to `strict_checks` in `write_xpt` to check agency-specific rules. @cpiraux Feel free to add anything I missed or misstated.
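As a rough sketch of how such a flag could behave, the base R example below applies the FDA filename rules mentioned above only when that agency is selected. The function names and the `agency` argument are assumptions made for illustration and do not reflect the package's actual interface.

```r
# Hypothetical sketch; function and argument names are assumptions.

# Collect FDA filename findings: no underscores, ASCII characters only.
fda_filename_findings <- function(path) {
  name <- tools::file_path_sans_ext(basename(path))
  findings <- character(0)
  if (grepl("_", name, fixed = TRUE)) {
    findings <- c(findings, "File name contains an underscore.")
  }
  if (any(utf8ToInt(enc2utf8(name)) > 127L)) {
    findings <- c(findings, "File name contains non-ASCII characters.")
  }
  findings
}

# Run agency-specific checks. With strict_checks = TRUE findings are
# errors; otherwise they are only warnings and writing could proceed.
run_agency_checks <- function(path, agency = c("none", "FDA"),
                              strict_checks = FALSE) {
  agency <- match.arg(agency)
  findings <- if (agency == "FDA") fda_filename_findings(path) else character(0)
  if (length(findings) > 0) {
    msg <- paste(findings, collapse = " ")
    if (strict_checks) stop(msg, call. = FALSE) else warning(msg, call. = FALSE)
  }
  invisible(findings)
}
```

Calling `run_agency_checks("ad_sl.xpt", agency = "FDA", strict_checks = TRUE)` would stop with an error, while the default `agency = "none"` skips agency rules entirely.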