kylebaron commented 4 years ago

Summary

As a user, I want to automatically extract information from the spec to populate $INPUT in a NONMEM control stream.

- Extract column names in order
- Add =DROP to columns that are character
- Append short name
- Include unit for continuous columns
- Add values and decodes for columns where this information is available in the spec object
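
The first four steps can be sketched in base R as follows. This is only an illustration, not the yspec implementation: the spec is stood in for by a plain named list, and the field names `type`, `short`, and `unit`, as well as the function name `sketch_nm_input`, are assumptions made for this example.

```r
# Hypothetical stand-in for a spec object: a named list of columns.
# The field names (type, short, unit) are assumptions for this sketch.
spec <- list(
  ID   = list(type = "numeric",   short = "subject identifier"),
  TIME = list(type = "numeric",   short = "time after first dose", unit = "hour"),
  WT   = list(type = "numeric",   short = "weight", unit = "kg"),
  SUBJ = list(type = "character", short = "subject identifier")
)

# Build $INPUT lines: the column name (with =DROP for character columns),
# followed by a comment holding the short name and, when present, the unit.
sketch_nm_input <- function(spec) {
  cols <- names(spec)
  is_chr <- vapply(spec, function(x) identical(x$type, "character"), logical(1))
  lhs <- ifelse(is_chr, paste0(cols, "=DROP"), cols)
  lhs <- format(lhs)  # left-justify and pad to a common width
  rhs <- vapply(spec, function(x) {
    if (is.null(x$unit)) x$short else paste0(x$short, " (", x$unit, ")")
  }, character(1))
  c("$INPUT", paste0(lhs, " ; ", rhs))
}

cat(sketch_nm_input(spec), sep = "\n")
```

Returning a character vector rather than printing directly keeps the result easy to test and lets the caller decide how to write it into a control stream.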

Tests

tests/testthat/test-nm_input.R
- input text is formed from spec

kylebaron commented 4 years ago

Option 1

> nm_input(spec)
$INPUT
C         ; comment character 
NUM       ; record number 
ID        ; numeric subject identifier 
TIME      ; time after first dose (hour)
SEQ       ; data type 
CMT       ; compartment number 
EVID      ; event ID 
AMT       ; dose amount (mg)
DV        ; DV (ng/ml)
AGE       ; age (years)
WT        ; weight (kg)
CRCL      ; creatinine clearance (ml/min)
ALB       ; Albumin (g/dL)
BMI       ; BMI (m2/kg)
AAG       ; alpha-1-acid glycoprotein (mg/dL)
SCR       ; serum creatinine (mg/dL)
AST       ; aspartate aminotransferase 
ALT       ; alanine aminotransferase 
HT        ; height (cm)
CP        ; Child-Pugh score 
TAFD      ; time after first dose (hours)
TAD       ; time after dose (hours)
LDOS      ; last dose amount (mg)
MDV       ; MDV 
BLQ       ; BLQ 
PHASE     ; study phase indicator 
STUDY     ; study number 
SUBJ=DROP ; subject identifier 
RF=DROP   ; renal function stage 

; DECODE ---------------- 
; SEQ     : 0 = observation, 1 = dose
; EVID    : 0, 1
; CP      : 0 = normal, 1 = Pugh1, 2 = Pugh2, 3 = Pugh3
; MDV     : 0 = not missing, 1 = missing
; BLQ     : 1, 0
; PHASE   : 1
; STUDY   : 1 = 211-EXAMP-001, 2 = 211-EXAMP-011, 3 = 211-EXAMP-801, 4 = 211-EXAMP-802

kylebaron commented 4 years ago

Option 2

> nm_input2(spec)
$INPUT
C         ; comment character 
NUM       ; record number 
ID        ; numeric subject identifier 
TIME      ; time after first dose (hour)
SEQ       ; data type 
          ; 0 = observation, 1 = dose
CMT       ; compartment number 
EVID      ; event ID 
          ; values: 0, 1
AMT       ; dose amount (mg)
DV        ; DV (ng/ml)
AGE       ; age (years)
WT        ; weight (kg)
CRCL      ; creatinine clearance (ml/min)
ALB       ; Albumin (g/dL)
BMI       ; BMI (m2/kg)
AAG       ; alpha-1-acid glycoprotein (mg/dL)
SCR       ; serum creatinine (mg/dL)
AST       ; aspartate aminotransferase 
ALT       ; alanine aminotransferase 
HT        ; height (cm)
CP        ; Child-Pugh score 
          ; 0 = normal, 1 = Pugh1, 2 = Pugh2, 3 = Pugh3
TAFD      ; time after first dose (hours)
TAD       ; time after dose (hours)
LDOS      ; last dose amount (mg)
MDV       ; MDV 
          ; 0 = not missing, 1 = missing
BLQ       ; BLQ 
          ; values: 1, 0
PHASE     ; study phase indicator 
          ; values: 1
STUDY     ; study number 
          ; 1 = 211-EXAMP-001, 2 = 211-EXAMP-011, 3 = 211-EXAMP-801, 4 = 211-EXAMP-802
SUBJ=DROP ; subject identifier 
RF=DROP   ; renal function stage
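
The Option 2 layout above could be produced one column at a time, roughly as in this sketch. It is a hedged illustration rather than the package code: `sketch_decode`, `sketch_column`, and their arguments (`values`, `decode`, `width`) are hypothetical names introduced here.

```r
# Format the values/decode text for one column.
# With a decode, pair each value with its label; without one, list the values.
sketch_decode <- function(values, decode = NULL) {
  if (is.null(values)) return(NULL)
  if (is.null(decode)) {
    return(paste0("values: ", paste(values, collapse = ", ")))
  }
  paste(paste(values, decode, sep = " = "), collapse = ", ")
}

# Option 2 layout: the decode goes on its own comment line under the column,
# aligned with the descriptor.
sketch_column <- function(name, short, values = NULL, decode = NULL, width = 9) {
  pad <- function(x) formatC(x, width = width, flag = "-")  # left-justify
  out <- paste0(pad(name), " ; ", short)
  dec <- sketch_decode(values, decode)
  if (!is.null(dec)) out <- c(out, paste0(pad(""), " ; ", dec))
  out
}

cat(sketch_column("SEQ", "data type", 0:1, c("observation", "dose")), sep = "\n")
cat(sketch_column("EVID", "event ID", 0:1), sep = "\n")
cat(sketch_column("CMT", "compartment number"), sep = "\n")
```

For the Option 1 layout, the same `sketch_decode()` output could instead be collected and written as a separate block after the column list.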

curtisKJ commented 4 years ago

@kylebaron I like having the decode with the descriptor; it seems easier to parse visually. So option 2 would be my preference, but maybe something like in yspec where there's a sub-descriptor for the decode would be nicer?

KatherineKayMRG commented 4 years ago

@kylebaron Something like this would be great. I find the first option a little cleaner, but if the column name were unfamiliar to me, I would find the decode with the descriptor in option 2 quicker and simpler to interpret.

kylebaron commented 4 years ago

@KatherineKayMRG Yeah; I find option 1 easier to digest, but I agree with @curtisKJ that option 2 seems more intuitive overall.

kylebaron commented 4 years ago

library(tidyverse)

library(yspec)

sp <- ys_help$spec()

yspec:::nm_input(sp) %>% cat(sep = "\n")
#> $INPUT
#> C         ; comment character 
#>           ; [. = analysis row, C = commented row]
#> NUM       ; record number
#> ID        ; subject identifier
#> SUBJ=DROP ; subject identifier
#> TIME      ; TIME
#> SEQ       ; SEQ 
#>           ; [0 = observation, 1 = dose]
#> CMT       ; compartment number
#> EVID      ; event ID 
#>           ; [values: 0, 1]
#> AMT       ; dose amount
#> DV        ; dependent variable
#> AGE       ; age
#> WT        ; weight
#> CRCL      ; CRCL
#> ALB       ; albumin
#> BMI       ; BMI
#> AAG       ; alpha-1-acid glycoprotein
#> SCR       ; serum creatinine
#> AST       ; aspartate aminotransferase
#> ALT       ; alanine aminotransferase
#> HT        ; height
#> CP        ; Child-Pugh score 
#>           ; [0 = normal, 1 = Pugh1, 2 = Pugh2, 3 = Pugh3]
#> TAFD      ; time after first dose
#> TAD       ; time after dose
#> LDOS      ; last dose amount
#> MDV       ; MDV 
#>           ; [values: 0, 1]
#> BLQ       ; below limit of quantification 
#>           ; [1 = above QL, 0 = below Q]
#> PHASE     ; study phase indicator 
#>           ; [values: 1]
#> STUDY     ; study number 
#>           ; [1 = SAD, 2 = MAD, 3 = Renal, 4 = Hepatic]
#> RF=DROP   ; renal function stage 
#>           ; [norm = Normal, mild = Mild, mod = Moderate,
#>           ;  sev = Severe]

Created on 2020-07-02 by the reprex package (v0.3.0)

metrumresearchgroup / yspec

Extract column names and information for NONMEM $INPUT #8
