Issue #705:
- Changed `record_count.py` to re-merge the results of a filtered grouped count onto the full list of (unfiltered) groups, filling any blanks with zero.
- Updated documentation in `Operation.md` as requested:
  - Changed single quotes to backticks for parameter names
  - Corrected "values(s)" typos
  - Added examples
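
The zero-filling behaviour described for the filtered grouped count can be sketched roughly as below. This is only an illustration using pandas, not the code in `record_count.py`: the function name `filtered_grouped_count`, the `record_count` output column, and the sample columns `USUBJID` and `AESER` are all assumptions made for the example.

```python
import pandas as pd


def filtered_grouped_count(df, group_cols, mask):
    """Count rows matching `mask` per group, reporting 0 for unmatched groups.

    Hypothetical helper for illustration only.
    """
    # Every group present in the unfiltered data, in first-seen order.
    all_groups = df[group_cols].drop_duplicates()
    # Counts computed on the filtered rows only; groups with no
    # matching rows are simply absent from this frame.
    filtered_counts = (
        df[mask].groupby(group_cols).size().reset_index(name="record_count")
    )
    # Left join back onto the full list of groups, then fill the blanks.
    merged = all_groups.merge(filtered_counts, on=group_cols, how="left")
    merged["record_count"] = merged["record_count"].fillna(0).astype(int)
    return merged


# Sample data (assumed): subject B has no rows matching the filter.
data = pd.DataFrame(
    {
        "USUBJID": ["A", "A", "B", "C"],
        "AESER": ["Y", "N", "N", "Y"],
    }
)
result = filtered_grouped_count(data, ["USUBJID"], data["AESER"] == "Y")
print(result)
```

Without the left join onto the unfiltered groups, subject `B` would be missing from the result rather than reported with a count of zero.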
Issue #706:
- Updated `Operations.json` schema to add `group_aliases` as an optional property (as a copy of `group`).
- Added `grouping_aliases` as a new operation parameter in `operation_params.py`.
- Mapped `group_aliases` to `grouping_aliases` in `rule_processor.py`.
- Added new methods to `base_operation.py`:
  - `_rename_grouping_columns` to rename each grouping column in `grouping` with the column name in the corresponding position in `grouping_aliases`.
  - `_get_grouping_columns` to create a list of grouping column names depending on whether `grouping_aliases` is specified or not.
- Updated documentation for both `distinct` and `record_count` in `Operation.md` to describe use of `group_aliases`.
- Created new `_replace_nans_in_specified_cols_with_none` static method in `base_data_service.py`.
- Updated `_handle_grouped_results` in `base_operation.py` to:
  - Include a call to `_rename_grouping_columns` if the `grouping_aliases` operation parameter is used (in addition to `grouping`).
  - Handle renamed grouping columns when merging the grouped results back onto the evaluation dataset.
  - Include a call to `_replace_nans_in_specified_cols_with_none` so that the `NaN` values created by the left join (when there is no grouped result for a particular group) do not cause an invalid JSON error when returned to the rules editor.
- Updated `test_grouped_distinct` in `test_distinct.py` to:
  - Add a `grouping_aliases` parameter.
  - Include an additional set of parameters that specifies `grouping_aliases`.
  - Update logic to test for correct application of `grouping_aliases`.
- Updated all tests in `test_record_count.py`:
  - `test_filtered_record_count`:
    - Moved filter specification to a parameter
    - Added a parameter set to confirm expected reporting of zero records matching the filter
  - `test_grouped_record_count`:
    - Added `grouping_aliases` parameter
    - Included a parameter set with a single-column `grouping_aliases`
    - Updated logic to test for correct application of `grouping_aliases`
  - `test_multi_group_record_count`:
    - Added `grouping_aliases` parameter
    - Included parameter sets with a `grouping_aliases` containing:
      - The same number of columns as `grouping`
      - Fewer columns than `grouping`
      - More columns than `grouping`
    - Updated logic to test for correct application of `grouping_aliases`
  - `test_filtered_grouped_record_count`:
    - Added `grouping_aliases` parameter
    - Included a parameter set with:
      - A single-column `grouping_aliases`
      - Sets of values to test where:
        - There are no records matching the filter for an existing group (expecting zero to be reported as per the #705 update)
        - The `grouping_aliases` column contains a grouping value not in `grouping` (expecting `None` to be reported)
    - Updated logic to test for correct application of `grouping_aliases`
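
To illustrate how the alias renaming, the left join, and the `NaN` replacement described for Issue #706 fit together, here is a minimal pandas sketch. It is not the engine's implementation: the helper names `rename_grouping_columns` and `replace_nans_with_none`, the column names `USUBJID` and `RSUBJID`, and the `record_count` column are hypothetical, and pairing columns with `zip` (which ignores any surplus entries when the two lists differ in length) is an assumption about the mismatch cases rather than confirmed behaviour.

```python
import json

import pandas as pd


def rename_grouping_columns(grouped, grouping, grouping_aliases):
    # Pair each grouping column with the alias in the same position.
    # zip stops at the shorter list (assumed handling of mismatched lengths).
    return grouped.rename(columns=dict(zip(grouping, grouping_aliases)))


def replace_nans_with_none(df, columns):
    # Store the columns as object dtype so None survives instead of
    # being coerced back to NaN.
    out = df.copy()
    for col in columns:
        out[col] = pd.Series(
            [None if pd.isna(v) else v for v in out[col]],
            index=out.index,
            dtype=object,
        )
    return out


# Grouped result computed on one dataset, keyed by USUBJID (assumed names).
grouped = pd.DataFrame({"USUBJID": ["A", "B"], "record_count": [2, 1]})
# Evaluation dataset that refers to the same values under another name.
evaluation = pd.DataFrame({"RSUBJID": ["A", "B", "Z"]})

renamed = rename_grouping_columns(grouped, ["USUBJID"], ["RSUBJID"])
# "Z" has no grouped result, so the left join leaves NaN in record_count.
merged = evaluation.merge(renamed, on=["RSUBJID"], how="left")
cleaned = replace_nans_with_none(merged, ["record_count"])

# allow_nan=False makes json.dumps raise ValueError on NaN, so this call
# succeeds only because the NaN was replaced with None (serialised as null).
payload = json.dumps(cleaned.to_dict(orient="records"), allow_nan=False)
print(payload)
```

The value `Z` in the alias column has no counterpart in the grouped result, which corresponds to the test case above where `None` is expected to be reported.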
Additional changes:
- Corrected more typos in `Operations.md`:
  - Un-indented `Operations` in the example for `distinct`
  - Removed trailing backticks from lines in the YAML code blocks in the examples for `label_referenced_variable_metadata` and `required_variables`