metrumresearchgroup / bbi

Next generation modeling platform

Use shrinkage file to get shrinkage data #54

Open david-lyder opened 4 years ago

david-lyder commented 4 years ago

Summary

Currently shrinkage data is sourced from the lst file. This change will require reformatting the shrinkage data type used to create the JSON output as well as parsing the shk file for shrinkage data when the shk file is available.

The shk file supports up to 10 types of shrinkage.

The shrinkage types are defined as:

// Type 1=etabar
// Type 2=Etabar SE
// Type 3=P val
// Type 4=%Eta shrinkage SD version
// Type 5=%EPS shrinkage SD version
// Type 6=%Eta shrinkage based on empirical Bayes Variance (SD version)
// Type 7=number of subjects used.
// Type 8=%Eta shrinkage variance version
// Type 9=%EPS shrinkage variance version
// Type 10=%Eta shrinkage based on empirical Bayes Variance (variance version)

UPDATE: It appears the doc is incorrect, and types 9 and 10 are swapped:

// Type 9=%Eta shrinkage based on empirical Bayes Variance (variance version)
// Type 10=%EPS shrinkage variance version

TODO: follow up with Icon about doc error.
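The type numbering above could be captured as named constants, so the parser does not depend on bare integers. This is only a sketch: `ShrinkageType`, the constant names, and `FieldName` are hypothetical and not taken from the bbi code base. Types 9 and 10 use the corrected (swapped) order from the UPDATE, not the documented one.

```go
package main

import "fmt"

// ShrinkageType is a hypothetical name for the value in the shk TYPE column.
type ShrinkageType int

// Numbering follows the list in this issue. Types 9 and 10 use the
// corrected order from the UPDATE, not the order given in the NONMEM doc.
const (
	TypeEtaBar      ShrinkageType = iota + 1 // 1: etabar
	TypeEtaBarSE                             // 2: etabar SE
	TypePval                                 // 3: P val
	TypeEtaSD                                // 4: %Eta shrinkage, SD version
	TypeEpsSD                                // 5: %EPS shrinkage, SD version
	TypeEbvSD                                // 6: %Eta shrinkage from EBV, SD version
	TypeNumSubjects                          // 7: number of subjects used
	TypeEtaVR                                // 8: %Eta shrinkage, variance version
	TypeEbvVR                                // 9: %Eta shrinkage from EBV, variance version
	TypeEpsVR                                // 10: %EPS shrinkage, variance version
)

// fieldNames maps each type to the ShrinkageDetails member proposed in this issue.
var fieldNames = map[ShrinkageType]string{
	TypeEtaBar:      "EtaBar",
	TypeEtaBarSE:    "EtaBarSE",
	TypePval:        "Pval",
	TypeEtaSD:       "EtaSD",
	TypeEpsSD:       "EpsSD",
	TypeEbvSD:       "EbvSD",
	TypeNumSubjects: "NumSubjects",
	TypeEtaVR:       "EtaVR",
	TypeEbvVR:       "EbvVR",
	TypeEpsVR:       "EpsVR",
}

// FieldName reports the struct member for a type, or false if the type is unknown.
func FieldName(t ShrinkageType) (string, bool) {
	name, ok := fieldNames[t]
	return name, ok
}

func main() {
	for t := TypeEtaBar; t <= TypeEpsVR; t++ {
		name, _ := FieldName(t)
		fmt.Printf("type %2d -> %s\n", t, name)
	}
}
```

If the doc turns out to be right after the follow-up with Icon, only the last two constants would need to trade places.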

Types 4-6 and 8-10 are currently sourced from the lst file. Types 1-3 and 7 are new data items. To support the additional shrinkage data, the parser output structures (which are reflected in the JSON output) must be changed:

Remove:

type ShrinkageDetails struct {
    Eta Shrinkage
    Ebv Shrinkage
    Eps Shrinkage
}

type Shrinkage struct {
    SD []float64
    VR []float64
}

Add:

type ShrinkageDetails struct {
    EtaBar      []float64
    EtaBarSE    []float64
    Pval        []float64
    EtaSD       []float64
    EpsSD       []float64
    EbvSD       []float64
    NumSubjects []float64
    EtaVR       []float64
    EpsVR       []float64
    EbvVR       []float64
}

Currently ShrinkageDetails is included as a slice in the CompletionDetails structure, where each element within the slice holds the data for one NONMEM method. This remains the same; however, the definition of the ShrinkageDetails type changes as shown above.

To maintain a consistent JSON output, any members of the ShrinkageDetails structure that cannot be accurately parsed from the data will be populated with a default value of the appropriate dimension. The default value is zero.

Default Dimensions

The default dimension of every member of the ShrinkageDetails structure is the number of ETAs defined in ParameterStructures.Omega, or, if available, the number of ETA columns in the shk file.
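The default-value rule can be sketched as a small constructor. The struct is the one proposed in this issue; `NewDefaultShrinkageDetails` and its `numEtas` parameter are hypothetical names, and the caller is assumed to have already picked the dimension (ETA count from ParameterStructures.Omega, or ETA columns in the shk file).

```go
package main

import "fmt"

// ShrinkageDetails mirrors the structure proposed in this issue.
type ShrinkageDetails struct {
	EtaBar      []float64
	EtaBarSE    []float64
	Pval        []float64
	EtaSD       []float64
	EpsSD       []float64
	EbvSD       []float64
	NumSubjects []float64
	EtaVR       []float64
	EpsVR       []float64
	EbvVR       []float64
}

// NewDefaultShrinkageDetails (hypothetical helper) returns a value whose
// members all have length numEtas and are filled with the default value, zero.
func NewDefaultShrinkageDetails(numEtas int) ShrinkageDetails {
	// make zero-fills the slice and never returns nil, even for length 0.
	zeros := func() []float64 { return make([]float64, numEtas) }
	return ShrinkageDetails{
		EtaBar:      zeros(),
		EtaBarSE:    zeros(),
		Pval:        zeros(),
		EtaSD:       zeros(),
		EpsSD:       zeros(),
		EbvSD:       zeros(),
		NumSubjects: zeros(),
		EtaVR:       zeros(),
		EpsVR:       zeros(),
		EbvVR:       zeros(),
	}
}

func main() {
	d := NewDefaultShrinkageDetails(3)
	fmt.Printf("%+v\n", d)
}
```

Because the slices are allocated rather than left nil, `encoding/json` would emit arrays instead of `null` for members that could not be parsed, which is what keeps the JSON shape consistent.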

Scenarios

Nonmem 74: shk and lst files available

Parse all shrinkage data from the shk file.
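A per-method shk parser might look like the sketch below. The file layout used here is an assumption, not something stated in this issue: each method starts with a `TABLE NO.` line, followed by a header line beginning with `TYPE`, followed by rows of `type subpop value...`. The sample text is made up for illustration (the numbers are the eta_bar and num_subjects values quoted later in this thread), and `ShkRow` and `ParseShkTables` are hypothetical names.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// ShkRow is one data row of an shk table (hypothetical type for this sketch).
type ShkRow struct {
	Type   int       // shrinkage type, 1-10
	SubPop int       // subpopulation index
	Values []float64 // one value per ETA column
}

// sampleShk is an illustrative fragment in the ASSUMED layout, not a real file.
const sampleShk = `TABLE NO.     1: example estimation method
 TYPE          SUBPOP       ETA(1)       ETA(2)       ETA(3)
            1            1  8.23632E-04 -2.13715E-04 -3.08885E-04
            7            1  1.93000E+02  1.93000E+02  1.93000E+02
`

// ParseShkTables returns one slice of rows per TABLE block (one per method).
func ParseShkTables(text string) ([][]ShkRow, error) {
	var tables [][]ShkRow
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 0 || fields[0] == "TYPE" {
			continue // blank line or column header
		}
		if fields[0] == "TABLE" {
			tables = append(tables, nil) // a new method starts here
			continue
		}
		if len(tables) == 0 || len(fields) < 3 {
			return nil, fmt.Errorf("unexpected line: %q", sc.Text())
		}
		nums := make([]float64, len(fields))
		for i, f := range fields {
			// ParseFloat accepts both "1" and "1.0000E+00" for the leading columns.
			v, err := strconv.ParseFloat(f, 64)
			if err != nil {
				return nil, fmt.Errorf("bad number %q: %w", f, err)
			}
			nums[i] = v
		}
		row := ShkRow{Type: int(nums[0]), SubPop: int(nums[1]), Values: nums[2:]}
		last := len(tables) - 1
		tables[last] = append(tables[last], row)
	}
	return tables, sc.Err()
}

func main() {
	tables, err := ParseShkTables(sampleShk)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, rows := range tables {
		for _, r := range rows {
			fmt.Printf("method %d type %d subpop %d: %v\n", i+1, r.Type, r.SubPop, r.Values)
		}
	}
}
```

Keeping the subpopulation column in the row leaves room for the mixture-model question raised under Outstanding Questions.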

Nonmem 74: lst file populated with shrinkage data, no shk file present

Parse available shrinkage data from the lst file.

This data is currently parsed from the lst file: EtaSD, EpsSD, EbvSD, EtaVR, EpsVR, EbvVR.

This data is available from the lst file: EtaBar, EtaBarSE, Pval, NumSubjects.

If present, the data has this format:

ETABAR: 8.2363E-04 -2.1372E-04 -3.0888E-04
SE: 2.2680E-02 1.3341E-02 6.1861E-03
N: 193 193 193

where EtaBar is ETABAR, EtaBarSE is SE, and NumSubjects is N.
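Reading the ETABAR, SE, and N values from lst text could be sketched as follows. It assumes each label sits on its own line, as in the format shown above; `EtaBarBlock` and `ParseEtaBarBlocks` are hypothetical names. Only the three labels shown in this issue are handled: the P value is not parsed because its lst format is not given here, and wrapped continuation lines for models with many ETAs are not handled either.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// EtaBarBlock holds one ETABAR/SE/N group (hypothetical type for this sketch).
type EtaBarBlock struct {
	EtaBar      []float64
	EtaBarSE    []float64
	NumSubjects []float64
}

// sampleLst uses the values quoted in this issue, one label per line (assumed layout).
const sampleLst = ` ETABAR:  8.2363E-04 -2.1372E-04 -3.0888E-04
 SE:      2.2680E-02  1.3341E-02  6.1861E-03
 N:       193 193 193
`

// parseFloats converts a whitespace-separated list of numbers.
func parseFloats(s string) ([]float64, error) {
	fields := strings.Fields(s)
	out := make([]float64, 0, len(fields))
	for _, f := range fields {
		v, err := strconv.ParseFloat(f, 64)
		if err != nil {
			return nil, fmt.Errorf("bad number %q: %w", f, err)
		}
		out = append(out, v)
	}
	return out, nil
}

// ParseEtaBarBlocks starts a new block at every ETABAR line and attaches the
// SE and N lines that follow to the most recent block.
func ParseEtaBarBlocks(lst string) ([]EtaBarBlock, error) {
	var blocks []EtaBarBlock
	sc := bufio.NewScanner(strings.NewReader(lst))
	for sc.Scan() {
		label, rest, found := strings.Cut(strings.TrimSpace(sc.Text()), ":")
		if !found {
			continue
		}
		switch label {
		case "ETABAR":
			vals, err := parseFloats(rest)
			if err != nil {
				return nil, err
			}
			blocks = append(blocks, EtaBarBlock{EtaBar: vals})
		case "SE", "N":
			if len(blocks) == 0 {
				continue // no ETABAR line seen yet, so this is some other line
			}
			vals, err := parseFloats(rest)
			if err != nil {
				return nil, err
			}
			cur := &blocks[len(blocks)-1]
			if label == "SE" {
				cur.EtaBarSE = vals
			} else {
				cur.NumSubjects = vals
			}
		}
	}
	return blocks, sc.Err()
}

func main() {
	blocks, err := ParseEtaBarBlocks(sampleLst)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", blocks)
}
```

Returning one block per ETABAR line is a guess at how multiple estimation methods would appear; the issue itself notes that per-method parsing of the lst file is not straightforward.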

Nonmem 74: no shrinkage data in the lst file, no shk file present

To maintain a consistent output structure, the ShrinkageDetails structure must be populated with the appropriate number of elements (one for each method) as well as the appropriate dimension for each value. See "default dimensions" above.

Nonmem 73: shk and lst files available

Parse available shrinkage data from the shk file. Shrinkage Types 8-10 are not available in the shk file. If possible, parse these values from the lst file (TBD) or populate with default dimensions.

Nonmem 73: lst file populated with shrinkage data, no shk file present

This data is currently parsed from the lst file: EtaSD, EpsSD, EbvSD, EtaVR, EpsVR, EbvVR.

This data may be available from the lst file: EtaBar, EtaBarSE, Pval, NumSubjects.

If present, the data has this format:

ETABAR: 8.2363E-04 -2.1372E-04 -3.0888E-04
SE: 2.2680E-02 1.3341E-02 6.1861E-03
N: 193 193 193

where EtaBar is ETABAR, EtaBarSE is SE, and NumSubjects is N.

Nonmem 73: no shrinkage data in the lst file, no shk file present

To maintain a consistent output structure, the ShrinkageDetails structure must be populated with the appropriate number of elements (one for each method) as well as the appropriate dimension for each value. See "default dimensions" above.

This table summarizes the source of shrinkage data:

| Version | SHK and LST | LST only | No Shrinkage data |
| --- | --- | --- | --- |
| NM 74 | SHK | LST + Default values | Default values |
| NM 73 | SHK + LST + Default values | LST + Default values | Default values |
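The table can be read as a small decision function. The sketch below encodes it literally; `Source`, `ShrinkageSources`, and the parameter names are hypothetical, the "SHK and LST" column is taken to mean simply that the shk file is present, and rejecting versions other than 73 and 74 is an assumption since the table does not cover them.

```go
package main

import "fmt"

// Source names where a piece of shrinkage data comes from (hypothetical type).
type Source string

const (
	SourceSHK     Source = "SHK"
	SourceLST     Source = "LST"
	SourceDefault Source = "Default values"
)

// ShrinkageSources returns the sources from the table in this issue, in the
// order they are listed there. nmVersion is 73 or 74.
func ShrinkageSources(nmVersion int, hasShk, lstHasShrinkage bool) ([]Source, error) {
	if nmVersion != 73 && nmVersion != 74 {
		// Assumption: the table only covers NM 73 and NM 74.
		return nil, fmt.Errorf("unsupported NONMEM version %d", nmVersion)
	}
	switch {
	case hasShk && nmVersion == 74:
		return []Source{SourceSHK}, nil
	case hasShk: // NM 73: types 8-10 are missing from the shk file
		return []Source{SourceSHK, SourceLST, SourceDefault}, nil
	case lstHasShrinkage:
		return []Source{SourceLST, SourceDefault}, nil
	default:
		return []Source{SourceDefault}, nil
	}
}

func main() {
	for _, v := range []int{74, 73} {
		for _, c := range []struct{ shk, lst bool }{{true, true}, {false, true}, {false, false}} {
			s, _ := ShrinkageSources(v, c.shk, c.lst)
			fmt.Printf("NM %d shk=%v lst=%v -> %v\n", v, c.shk, c.lst, s)
		}
	}
}
```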
       

The format of the shk file allows shrinkage data to be parsed per method. The format of the lst file does not allow for easy parsing per method. When only the lst file is available, the shrinkage data provided per method is TBD.

For both NM73 and NM74 control streams, "bbi summary" will exit and display an error message if the shk file is not found.

The flag "no-shk-file" will allow processing to continue if the shk file is not found. When the "no-shk-file" flag is passed, processing continues as though the shk file is not present. The shrinkage data is sourced as shown above in the absence of an shk file.
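The interaction between the missing-file error and the flag could be sketched like this. `UseShkFile` and `ErrShkMissing` are hypothetical names, and the flag is modelled as a plain boolean because how bbi wires its command-line flags is not described in this issue. The sketch reads "as though the shk file is not present" literally: when the flag is set, the shk file is ignored even if it exists.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// ErrShkMissing is a hypothetical sentinel error for a missing shk file.
var ErrShkMissing = errors.New("shk file not found")

// UseShkFile reports whether shrinkage data should be read from the shk file
// at path. noShkFile mirrors the "no-shk-file" flag.
func UseShkFile(path string, noShkFile bool) (bool, error) {
	if noShkFile {
		// Flag passed: continue as though the shk file is not present.
		return false, nil
	}
	if _, err := os.Stat(path); err != nil {
		if errors.Is(err, os.ErrNotExist) {
			return false, fmt.Errorf("%w: %s", ErrShkMissing, path)
		}
		return false, err // some other problem, e.g. permissions
	}
	return true, nil
}

func main() {
	for _, noShk := range []bool{false, true} {
		use, err := UseShkFile("no-such-run.shk", noShk)
		fmt.Printf("no-shk-file=%v use=%v err=%v\n", noShk, use, err)
	}
}
```

A caller would exit with the returned error in the first case and fall back to the lst and default-value sources in the second.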

Outstanding Questions

  1. How to accurately capture ETABAR data from mixture models?
  2. Is shrinkage available in Bayesian runs? No. from the NM doc intro7.pdf, ~line 184: " Shrinkage is not reported after a BAYES or FO analysis."
  3. Is shrinkage available in SAEM runs? - YES from MikeH

Tests

| Test name | Test context |
| --- | --- |
| TestSummary | TestSummary* tests compare the output of the "summary" command and "summary --json" command to the output of known control streams. The test will fail if the output is not equal to the known output. The "summary --json" command outputs all available data, including the data noted in this issue. Representative control streams should be added to the test suite if any differences in their output are identified. |

david-lyder commented 4 years ago

Before change:

"shrinkage_details": [
  {
    "eta": {
      "sd": [ 0.40632, 2.0606, 18.484 ],
      "vr": [ 0.81099, 4.0787, 33.551 ]
    },
    "ebv": {
      "sd": [ 0.49256, 2.1487, 18.703 ],
      "vr": [ 0.98269, 4.2512, 33.908 ]
    },
    "eps": {
      "sd": [ 9.7026 ],
      "vr": [ 18.464 ]
    }
  }
],

Table:

bbi summary 2 LEM RUN# 2 - 2cmpt model - no BQLs Dataset: ../nobqldata.csv Records: 2895 Observations: 2702 Patients: 193 Estimation Method(s):

david-lyder commented 4 years ago

After change:

"shrinkage_details": [
  {
    "eta_bar": [ 0.000823632, -0.000213715, -0.000308885 ],
    "ebv_bar_se": [ 0.0226802, 0.0133412, 0.00618608 ],
    "pval": [ 0.971031, 0.987219, 0.960176 ],
    "eta_sd": [ 0.40632, 2.0606, 18.4836 ],
    "eps_sd": [ 9.70259, 0, 0 ],
    "ebv_sd": [ 0.49256, 2.1487, 18.7028 ],
    "num_subjects": [ 193, 193, 193 ],
    "eta_vr": [ 0.810989, 4.07874, 33.5507 ],
    "eps_vr": [ 0.982693, 4.25123, 33.9076 ],
    "ebv_vr": [ 18.4638, 0, 0 ]
  }
],

Table

bbi summary 2 LEM RUN# 2 - 2cmpt model - no BQLs Dataset: ../nobqldata.csv Records: 2895 Observations: 2702 Patients: 193 Estimation Method(s):