Split data checks based on agency

Here is an example to clarify the request.

The requirements of the XPT format and those of regulatory agencies can differ. For instance, compare the requirements for dataset and variable labels:

XPT: No restriction on characters; maximum length is 40 bytes.

FDA: Variable names, as well as variable and dataset labels, should include American Standard Code for Information Interchange (ASCII) text codes only. Maximum length in characters is 40.

NMPA: For eSubmission in China, one of the requirements is to translate the foreign language data package (e.g., English) to Chinese. Variable labels, dataset labels, MedDRA, WHO Drug terms, primary endpoint-related code lists, etc., need to be translated from English to Chinese.

Currently, in df_label.R, the function fails if the label does not meet the following requirements:

label_len <- nchar(label)

if (label_len > 40) {
  abort("Length of dataset label must be 40 characters or less.")
}

if (stringr::str_detect(label, "[^[:ascii:]]")) {
  abort("`label` cannot contain any non-ASCII, symbol, or special characters.")
}

The first check enforces an XPT requirement, while the second enforces an FDA specification. I suggest moving the agency-specific checks to xpt_validate so that they can be skipped when a submission targets a different agency.
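
To make the proposal concrete, here is a rough sketch of how the two kinds of checks could be separated. The function names check_label_xpt and check_label_agency and the agency argument are hypothetical and are not part of xportr's current API. The sketch also assumes that no extra label rule applies for NMPA, which is a simplification and not a statement of NMPA policy. It uses base R only and counts bytes for the XPT limit, as in the comparison above.

```r
# Hypothetical helpers for illustration only; not xportr's actual API.

# Format-level check: always enforced, because the XPT format itself
# limits labels to 40 bytes.
check_label_xpt <- function(label) {
  if (nchar(label, type = "bytes") > 40) {
    stop("Length of label must be 40 bytes or less.", call. = FALSE)
  }
  invisible(label)
}

# Agency-level check: returns a character vector of findings instead of
# failing, so the caller can decide whether to report or ignore them.
check_label_agency <- function(label, agency = c("FDA", "NMPA", "none")) {
  agency <- match.arg(agency)
  findings <- character(0)

  if (agency == "FDA") {
    # Any code point above 127 is outside the ASCII range.
    if (any(utf8ToInt(enc2utf8(label)) > 127L)) {
      findings <- c(findings, "Label contains non-ASCII characters.")
    }
  }
  # Assumption: no additional label rule is applied for "NMPA" or "none".

  findings
}

# Usage: a translated (Chinese) label passes the XPT check and is only
# flagged when the FDA rules are requested.
label <- "\u4e2d\u6587"
check_label_xpt(label)
check_label_agency(label, agency = "FDA")
check_label_agency(label, agency = "NMPA")
```

With this split, df_label.R would keep only the format-level check, and the validation step would call the agency-level check for whichever agency the user selects.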

atorus-research / xportr

Split data checks based on agency #201
