atorus-research / xportr

Tools to build CDISC compliant data sets and check for CDISC compliance.
https://atorus-research.github.io/xportr/
Other
42 stars 9 forks source link

Function Documentation Review #113

Closed bms63 closed 1 year ago

bms63 commented 1 year ago

@atorus-research/xportr-development-team I'm working my way through reviewing each of the functions. Trying to identify parts to update and larger questions that can help simplify this work. Try to complete review tomorrow and get this into a more readable state, feel free to comment on anything :)

Big Questions

xportr_type()

image

xportr_length

Update Title: Assigns a SAS Length to a Variable Attribute Update Details: Variable Level Metadata has a certain connotation in clinical reporting - It should read Assigns a SAS length from the Length Column from an appropriate metatdata source Change Metacore Argument: The argument should be metadata and it can take in dataframe object built from a spec file or a metacore object (link to metacore) Domain: What is a Domain? We should say this refers to the Dataset Column in the appropripate metadata file Verbose: We should highlight that the user can receive feedback around the use of the verbose argument printed out to the console. The use of stop will cause the function to not work, etc. Update Value - Each variable that has a Length specified in the appropriate metadata object will have it applied as a SASlength attribute`

Examples: Can we showcase examples with printed out messages for warn and stop image

xportr_label()

We should avoid the phrase variable level metadata.

Examples: Can we showcase examples with printed out messages for warn and stop image

bms63 commented 1 year ago

I started to put together some conventions for roxygen headers: https://github.com/atorus-research/xportr/wiki/Style-Guide-for-Roxygen-Headers

cpiraux commented 1 year ago

With the new function xportr_metadata, we could explain in detail the metadata required for all functions. For each @param metadata in other functions, have a subset of the details description from xportr_metadata and maybe mention to look at xportr_metadata for more details

In xportrmetadata: @param metadata : short description of columns required for all xportr function. We could also mention that only variable level metadata is needed for xportr

@ details Give more details for each columns: short description, required for which xportr_function, the values allowed,…

cpiraux commented 1 year ago

I was thinking to add in conventions for roxygen headers for example to try to follow define xml specification for metadata example. When we arrive at the stage of xpt, normally the specs should be closed to define xml specification.

The Type should be: text, integer, float, datetime, date, time, (partialdate, partialtime, partialdatetime, incompletedatetime, durationdatetime, intervaldatetime: not very common so could skip them in example) Character type are: text, datetime, date, time Numeric type are: integer, float

For the length, if the type is datetime, date or time then length should be left blank.

What do you think?

bms63 commented 1 year ago

With the new function xportr_metadata, we could explain in detail the metadata required for all functions. For each @param metadata in other functions, have a subset of the details description from xportr_metadata and maybe mention to look at xportr_metadata for more details

In xportrmetadata: @param metadata : short description of columns required for all xportr function. We could also mention that only variable level metadata is needed for xportr

@ details Give more details for each columns: short description, required for which xportr_function, the values allowed,…

  • Order (Display order of the variables within the dataset) - xportr_order
  • dataset (The name of the dataset) – all functions: Used to subset metadata
  • variable (Dataset variable name) – all functions
  • label (Variable label) - xportr_label
  • type (Variable’s data type) - xportr_type xportr.character_types = c("character", "char", "text", "date", "posixct", "posixt"), xportr.numeric_types = c("integer", "numeric")
  • length (Maximum expected variable length.) - xportr_length
  • format (Format supports data visualization of numeric float and date values.) - xportr_format : SAS format allowed

@EeethB something to consider.

I think this should definitely be listed out. I think we should be careful with the phrase value level metadata as I always think of VLMs, but I think when we say it we are referring to a metadata object,

bms63 commented 1 year ago

@cpiraux also is there other gaps in functions or additional functions needed to help us build a complete xpt. I think you and @kaz462 have the most experience in our group for submissions.

cpiraux commented 1 year ago

It would be interesting to have a function that sort the records by key variables (column in dataset metadata) and check the uniqueness of the key.

Define-XML Version 2.0 Completion Guidelines pg 11 image

bms63 commented 1 year ago

It would be interesting to have a function that sort the records by key variables (column in dataset metadata) and check the uniqueness of the key.

Define-XML Version 2.0 Completion Guidelines pg 11 image

So like an xportr_key or should we allow xportr_order to have a key argument that you can turn on/off?

cpiraux commented 1 year ago

I find less confusing to have a function to order the columns and an other function to order the rows by key