cdisc-org / cdisc-rules-engine

Open source offering of the cdisc rules engine
MIT License

Rule blocked: CORERULES-9360 #740

Open ASL-rmarshall opened 3 months ago

ASL-rmarshall commented 3 months ago

JSON containing the full request Request.txt

Links to related JIRA Tickets

Rule Information

Describe the bug The rule uses the record_count operation with the group and filter parameters to count, for each id value, the records where parent_entity = 'StudyCell' and parent_rel = 'elementIds' and rel_type = 'reference'. It then tries to report any record with rel_type = 'definition' whose grouped record_count result is either empty or equal_to 0. However, although the (negative) test data contains a record that should be reported, the rule reports no errors. This happens because, when no records meet the filter criteria for a group, the merged record count column contains NaN, and:

  1. The record_count operation does not report 0 (zero) for groups that have no records meeting the filter criteria. See #705
  2. NaN is not recognized as being empty by the empty operator.
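
The mechanism described above can be reproduced outside the engine with plain pandas. This is only a minimal sketch: the sample rows, the `record_count` column name, and the filter-then-group-then-merge sequence are assumptions made for illustration, not the engine's actual implementation of record_count.

```python
import pandas as pd

# Hypothetical, simplified stand-in for the flattened data in this issue.
data = pd.DataFrame(
    {
        "id": ["StudyElement_1", "StudyElement_1", "StudyElement_4"],
        "parent_entity": ["StudyDesign", "StudyCell", "StudyDesign"],
        "parent_rel": ["elements", "elementIds", "elements"],
        "rel_type": ["definition", "reference", "definition"],
    }
)

# Filter first, then count per id: ids with no matching rows disappear
# from the grouped result instead of getting a count of 0.
referenced = data[
    (data["parent_entity"] == "StudyCell")
    & (data["parent_rel"] == "elementIds")
    & (data["rel_type"] == "reference")
]
counts = referenced.groupby("id").size().reset_index(name="record_count")

# Left-merging the counts back leaves NaN for ids that had no matches.
merged = data.merge(counts, on="id", how="left")

# NaN == 0 evaluates to False, so a zero check misses StudyElement_4.
missed = merged[(merged["rel_type"] == "definition") & (merged["record_count"] == 0)]
print(merged)
print("rows caught by the zero check:", len(missed))
```

In this sketch the StudyElement_4 row ends up with NaN in `record_count`, so neither a zero comparison nor an empty check that ignores NaN would flag it.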

"Error" returned from Rule Engine

{
    "STUDYELEMENT": [
        {
            "executionStatus": "success",
            "domain": "STUDYELEMENT",
            "variables": [],
            "message": null,
            "errors": []
        }
    ]
}

Expected behavior The record_count operation should return 0 (zero) for groups with no records meeting the specified filter criteria, so that all record_count results can be checked. Any remaining NaN values should be replaced with None, so that the rule failure is reported as expected:

{
    "STUDYELEMENT": [
        {
            "executionStatus": "success",
            "domain": "STUDYELEMENT",
            "variables": [
                "id",
                "name",
                "parent_entity",
                "parent_id",
                "parent_rel"
            ],
            "message": "The study element is not referenced by any study cell.",
            "errors": [
                {
                    "value": {
                        "parent_rel": "elements",
                        "name": "EL4",
                        "id": "StudyElement_4",
                        "parent_entity": "StudyDesign",
                        "parent_id": "StudyDesign_1"
                    },
                    "row": 4
                }
            ]
        }
    ]
}
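
As a rough illustration of the expected behavior, the sketch below fills missing counts with 0 after the merge so that a zero check finds the unreferenced element. The frame and the `record_count` column name are hypothetical, and `fillna(0)` is just one possible way to get the zero; it is not necessarily how the engine does or should do it.

```python
import numpy as np
import pandas as pd

# Hypothetical merged result: StudyElement_4 has no matching reference rows.
merged = pd.DataFrame(
    {
        "id": ["StudyElement_1", "StudyElement_4"],
        "rel_type": ["definition", "definition"],
        "record_count": [1.0, np.nan],
    }
)

# Treat a missing count as zero matching records.
merged["record_count"] = merged["record_count"].fillna(0).astype(int)

# The zero check now reports the element that no study cell references.
unreferenced = merged[
    (merged["rel_type"] == "definition") & (merged["record_count"] == 0)
]
print(unreferenced["id"].tolist())
```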
ASL-rmarshall commented 3 months ago

Implementing the changes for #705 should unblock this rule. However, as a more generic and robust fix, it would be worth considering updating the empty operator to recognize NaN (np.nan) as empty:

results = np.where(self.value[target].isin(["", None, {None}, np.nan]), True, False)
                                                            ^^^^^^^^
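
A NaN-aware empty check could also be sketched with `Series.isna()`, which flags both None and NaN. The `is_empty` helper below is hypothetical and is not the engine's operator; in particular, it does not cover the `{None}` case from the expression above.

```python
import numpy as np
import pandas as pd


def is_empty(column: pd.Series) -> np.ndarray:
    """Hypothetical helper: True where a value is "", None, or NaN."""
    # isna() is True for both None and NaN; the equality test catches "".
    return (column.isna() | (column == "")).to_numpy()


values = pd.Series(["", None, np.nan, 0, "x"], dtype=object)
flags = is_empty(values)
print(flags)
```

With this approach a count of 0 is still treated as non-empty, so an equal_to 0 check would remain necessary alongside the empty check.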
ASL-rmarshall commented 2 months ago

Changes for #705 have been implemented and this rule is now unblocked. However, I'm leaving this issue open for now so that the suggested fix in my previous comment can be considered.