Open kaz462 opened 9 months ago
I really like this idea. The only thing I might change is to put the column of functions into the derivation table rather than into var_spec and value_spec, so that information isn't repeated. I would also probably put the derivation order only in var_spec rather than in both var_spec and value_spec.
Order of derivation can (and should) be dynamic and automated based on explicit dependencies
It would be good to figure out how to leverage/add to the existing MethodDef structure from CDISC ODMv2 and future Define versions so that it can evolve into a standard shared across the industry and potentially regulators too.
Order of derivation can (and should) be dynamic and automated based on explicit dependencies
@TeMeta, I'm not sure whether this would work or whether it is desirable. I think there are cases where the author of the specs needs or wants to specify the order of the derivations.
Consider for example two derivations. One flags the last record per subject and the other one adds LOCF records. They don't depend on each other. I.e., with respect to the dependencies it doesn't matter in which order they are executed. But the results differ depending on the order. Thus the author needs to specify the order to avoid ambiguity.
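The order dependence described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy records, the flag name LASTFL, the EXPECTED_VISITS list, and the choice that an LOCF record does not inherit the flag of the record it copies. It is not how any particular package implements these derivations.

```python
# Hypothetical toy data: one subject observed at visits 1 and 2; visit 3 is missing.
records = [
    {"USUBJID": "01", "AVISITN": 1, "AVAL": 10.0, "DTYPE": None},
    {"USUBJID": "01", "AVISITN": 2, "AVAL": 12.0, "DTYPE": None},
]
EXPECTED_VISITS = [1, 2, 3]  # assumed visit schedule


def flag_last(rows):
    """Set LASTFL = "Y" on the record with the highest AVISITN per subject."""
    out = [dict(r, LASTFL=None) for r in rows]
    for subject in {r["USUBJID"] for r in out}:
        last = max(
            (r for r in out if r["USUBJID"] == subject),
            key=lambda r: r["AVISITN"],
        )
        last["LASTFL"] = "Y"
    return out


def add_locf(rows):
    """Add a record for each missing expected visit, carrying the last earlier value forward."""
    out = [dict(r) for r in rows]
    for subject in sorted({r["USUBJID"] for r in rows}):
        observed = sorted(
            (r for r in rows if r["USUBJID"] == subject),
            key=lambda r: r["AVISITN"],
        )
        seen = {r["AVISITN"] for r in observed}
        for visit in EXPECTED_VISITS:
            earlier = [r for r in observed if r["AVISITN"] < visit]
            if visit in seen or not earlier:
                continue
            new = dict(earlier[-1], AVISITN=visit, DTYPE="LOCF")
            if "LASTFL" in new:
                new["LASTFL"] = None  # assumption: the copy does not inherit the flag
            out.append(new)
    return out


def flagged_visits(rows):
    """Return the visit numbers of all flagged records."""
    return sorted(r["AVISITN"] for r in rows if r.get("LASTFL") == "Y")


flag_then_locf = flagged_visits(add_locf(flag_last(records)))
locf_then_flag = flagged_visits(flag_last(add_locf(records)))
print(flag_then_locf, locf_then_flag)
```

Neither function reads a variable that the other one creates, yet flagging first marks the observed visit 2 while flagging last marks the imputed visit 3.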
For many derivations it doesn't matter in which order they are performed. For example, CHG, PCHG, APERIOD, and TRTP can be derived in any order. But for readability of the specs and the code it is preferable to keep the above order. Then the related ones are together. The order CHG, APERIOD, PCHG, and TRTP would be confusing in the specs and the code (although it would produce correct results).
Therefore I would use the dependencies to automatically generate an initial order of the derivations. The author would then review that order and adjust it if necessary. Finally, the adjusted order would be checked automatically to confirm that no dependencies are violated.
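A minimal sketch of that workflow, assuming the dependencies are available as a mapping from each derived variable to the variables it needs. The mapping below and the helper names initial_order and violations are hypothetical. The sketch uses Python's standard graphlib module (Python 3.9 or later), which raises CycleError if the dependencies are circular.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency metadata: derived variable -> variables it needs.
dependencies = {
    "ABLFL": {"AVAL"},
    "BASE": {"AVAL", "ABLFL"},
    "CHG": {"AVAL", "BASE"},
    "PCHG": {"AVAL", "BASE"},
}


def initial_order(deps):
    """Propose an order in which each variable follows the derived variables it needs."""
    derived = set(deps)
    # static_order() also yields pure inputs such as AVAL, so filter them out.
    return [v for v in TopologicalSorter(deps).static_order() if v in derived]


def violations(order, deps):
    """Return (variable, prerequisite) pairs where the prerequisite is derived too late."""
    position = {v: i for i, v in enumerate(order)}
    return [
        (var, pre)
        for var in order
        for pre in sorted(deps.get(var, ()))
        if pre in position and position[pre] > position[var]
    ]


proposed = initial_order(dependencies)           # step 1: generated automatically
adjusted = ["ABLFL", "BASE", "CHG", "PCHG"]      # step 2: order chosen by the author
broken = ["ABLFL", "CHG", "BASE", "PCHG"]        # an adjustment that breaks a dependency

print(violations(adjusted, dependencies))        # step 3: automatic check
print(violations(broken, dependencies))
```

The generated order is only one of several valid orders, so it serves as a starting point. The check in step 3 is what keeps the author's manual adjustments safe.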
Hi @bundfussr, agreed on all points.
We do want to specify order manually sometimes too. Dependencies are just a very explicit and repeatable way of achieving this.
LOCF is a good example of a missing dependency that needs to be articulated. LOCF that operates on flags does depend on the flag. That creates a dependency on the derivation of that flag, i.e. the flag derivation must run first. The dependency can either be declared up front (an explicit dependency on the flag) or added post hoc (by hard-coding the order).
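To illustrate how declaring the dependency removes the ambiguity, here is a small sketch that lists every order compatible with a set of dependencies. The step names LASTFL and LOCF and the helper valid_orders are hypothetical, and the brute-force enumeration is only practical for a handful of steps.

```python
from itertools import permutations


def valid_orders(deps):
    """List every ordering of the steps in which each step follows its prerequisites."""
    steps = sorted(deps)
    result = []
    for candidate in permutations(steps):
        position = {step: i for i, step in enumerate(candidate)}
        if all(
            position[pre] < position[step]
            for step in steps
            for pre in deps[step]
            if pre in position
        ):
            result.append(candidate)
    return result


# Without a declared dependency, both orders are allowed and the result is ambiguous.
implicit = {"LASTFL": set(), "LOCF": set()}
# Declaring that LOCF depends on the flag leaves exactly one valid order.
explicit = {"LASTFL": set(), "LOCF": {"LASTFL"}}

print(valid_orders(implicit))
print(valid_orders(explicit))
```

With the explicit dependency, the required order follows from the metadata itself and does not need to be hard-coded.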
It would be good to figure out how to leverage/add to the existing MethodDef structure from CDISC ODMv2 and future Define versions so that it can evolve into a standard shared across the industry and potentially regulators too.
I agree that we need a standard for ADaM specs for making progress in automation. The current standards like define-xml or the Roche-internal format for ADaM specs are not intended for automation. Where suitable the new standard could use elements from existing standards like ODMv2, Define, and ARS.
To enhance the usage of metadata, what do you think of adding the following information to the metacore object: the derivation order and the derivation function, each as a column in metacore$var_spec and metacore$value_spec?
Users can also add these two columns to the P21 Excel Spec (Variables, ValueLevel sheets), so that they can be read in through spec_to_metacore()
Thanks!