RConsortium / submissions-wg

R Submissions Working Group
https://rconsortium.github.io/submissions-wg

Suggest Rewording to Agency of Section 4.1.2.10 #3

Closed mstackhouse closed 3 years ago

mstackhouse commented 3 years ago

This section reads:

4.1.2.10 Software Programs

Sponsors should provide the software programs used to create all ADaM datasets and generate tables and figures associated with primary and secondary efficacy analyses. Furthermore, sponsors should submit software programs used to generate additional information included in Section 14 CLINICAL STUDIES of the Prescribing Information [30], if applicable. The specific software utilized should be specified in the ADRG. Refer to FDA Statistical Software Clarifying Statement for more information [31]. The main purpose of requesting the submission of these programs is to understand the process by which the variables for the respective analyses were created and to confirm the analysis algorithms and results. Sponsors should submit software programs in ASCII text format. Executable file extensions should not be used.

Industry has taken this to mean that the files should be renamed from .sas or .r to .txt. That's not the intention of the guidance; the intention is that .exe or other binary files should not be submitted.

This group has the opportunity to suggest a rewording of the section to remove this ambiguity.

dgkf commented 3 years ago

Use of Unicode?

Many languages permit a wider range of Unicode characters. For example, idiomatic Julia has Unicode aliases for set arithmetic (such as ∪ and ∩), Greek characters for math, and accents used to denote derivations. In these cases, Unicode is used to write code that resembles mathematical notation. As far as I know, there isn't a language that requires non-ASCII characters (at least not one that would realistically be used for a submission).

I would be fine with the ASCII restriction, but want to make sure this is a conscious decision, given that some languages permit a wider character set that is used to make code easier to interpret.

Suggest using "Binary file formats" instead of "Executable file extensions"

Whether a file is executable is dependent on the host system and any software that is installed to recognize a specific extension. For example, .R files are executable if the R GUI has the "open action" set to "source as input file". Instead, consider saying "Binary file formats should not be used."
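
To illustrate why file content is a more reliable test than the extension, here is a minimal sketch that classifies a file by inspecting its bytes. The helper names `looks_binary` and `is_ascii_text` and the NUL-byte heuristic are assumptions made for illustration; nothing in the guidance prescribes this check.

```python
from pathlib import Path


def looks_binary(path, sample_size=8192):
    """Heuristic (an assumption, not a rule from the guidance):
    treat a file as binary if its first bytes contain a NUL byte."""
    with open(path, "rb") as handle:
        sample = handle.read(sample_size)
    return b"\x00" in sample


def is_ascii_text(path):
    """Return True when every byte in the file is 7-bit ASCII."""
    return Path(path).read_bytes().isascii()
```

Under this sketch, an `adsl.R` script containing plain text passes both checks with its native extension, while a compiled file renamed to `.txt` would still be flagged by `looks_binary`.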

dgkf commented 3 years ago

A quick follow-up: I reached out to the Julia community and asked about Unicode best practices. They pointed me to this software, which has gone through FDA review (I think as a device?). It uses Unicode characters extensively so that the code maps more closely to the symbols in the underlying formulas:

https://tutorials.pumas.ai/html/introduction/simulating_populations.html

mstackhouse commented 3 years ago

@dgkf I believe the ASCII character restriction comes down to compliance requirements with the eCTD. Languages like SAS can currently include Unicode characters as well (as is common in things like laboratory units), but it's common practice to scrub these characters out and reference them by byte code if you're programming around those issues. The key here is that we're dealing with the eCTD and not necessarily the software being used to execute the program.
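
The scrubbing step described above could be sketched roughly as follows. This is only an illustration in Python: the function names `find_non_ascii` and `escape_non_ascii` are hypothetical, and the escape syntax produced is Python's own, so a SAS or R program would need the equivalent byte or code point notation for that language.

```python
def find_non_ascii(source):
    """Return (line_number, column, character) for each non-ASCII character."""
    hits = []
    for line_number, line in enumerate(source.split("\n"), start=1):
        for column, char in enumerate(line, start=1):
            if ord(char) > 127:
                hits.append((line_number, column, char))
    return hits


def escape_non_ascii(source):
    """Replace each non-ASCII character with a Python backslash escape,
    leaving a string that contains only ASCII characters."""
    return source.encode("ascii", errors="backslashreplace").decode("ascii")
```

Running `find_non_ascii` over a program before packaging it would list where, for example, a micro sign in a laboratory unit appears, so it can be replaced deliberately instead of being silently mangled.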

I agree with the "binary file formats" comment. Should there be any language around compiled code? So something like:

Submission of programs with their native file extensions is acceptable, as long as binary file formats and pre-compiled code are not delivered.

lengning commented 3 years ago

Additional discussion in #38.