cdisc-org / cdisc-rules-engine

Open source offering of the cdisc rules engine
MIT License

Allow renaming of "merge key" for operations with `group` #706

Closed: ASL-rmarshall closed this issue 2 months ago

ASL-rmarshall commented 4 months ago

Both the `distinct` and `record_count` operations accept a `group` parameter, which produces grouped results that can be merged back onto a dataset containing the specified group variable(s). Many DDF rules would be simpler if the group variable(s) produced by the operation could be renamed, so that the grouped results could be merged back onto a dataset that stores the same grouping values in variables with different names.

For example, rule DDF99912 in the test environment uses the `record_count` operation with "parent_id" specified for `group`. This produces a record count for each parent of the records in the specified dataset (in this case, a count of child ScheduleTimeline records for each parent StudyDesign record). To give a single error result for each study design, it would be helpful to merge the `record_count` results onto the StudyDesign dataset by matching the parent_id values in the `record_count` result to the id values in the StudyDesign dataset (e.g., by renaming the grouping column in the `record_count` result dataframe from "parent_id" to "id").
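
To make the request concrete, here is a rough pandas sketch of the behaviour being asked for. It is not the engine's code or rule syntax: the sample rows, the `timeline_count` column name, and the variable names are all made up for illustration, and only the "parent_id" to "id" rename comes from the description above.

```python
import pandas as pd

# Hypothetical StudyDesign records (SD3 has no child timelines).
study_designs = pd.DataFrame({"id": ["SD1", "SD2", "SD3"]})

# Hypothetical ScheduleTimeline records, each pointing at its parent StudyDesign.
schedule_timelines = pd.DataFrame(
    {"id": ["TL1", "TL2", "TL3"], "parent_id": ["SD1", "SD1", "SD2"]}
)

# Roughly what a grouped record count yields: one row per parent_id.
counts = (
    schedule_timelines.groupby("parent_id")
    .size()
    .reset_index(name="timeline_count")  # illustrative column name
)

# The requested step: rename the grouping column so it matches the target dataset.
counts = counts.rename(columns={"parent_id": "id"})

# Merge onto StudyDesign; parents with no children get a count of 0.
merged = study_designs.merge(counts, on="id", how="left")
merged["timeline_count"] = merged["timeline_count"].fillna(0).astype(int)

print(merged)
```

With the rename in place, each StudyDesign row carries its own child count, so a rule could report one result per study design instead of one per grouped row.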