cdisc-org / cdisc-rules-engine

Open source offering of the cdisc rules engine
MIT License

not_contains_all not working correctly with get_column_order_from_dataset #484

Open gerrycampion opened 1 year ago

gerrycampion commented 1 year ago

Links to related JIRA Tickets

Rule Information

Describe the bug: The not_contains_all operator does not work correctly with the get_column_order_from_dataset operation. Refer to CG0014 (A) in the rule editor and the corresponding negative test data.

Rule Type: Dataset Metadata Check
Operations:
  - id: $required_variables
    operator: required_variables
  - id: $dataset_variables
    operator: get_column_order_from_dataset
Check:
  all:
    - name: $dataset_variables
      operator: not_contains_all
      value:
        - $required_variables

get_column_order_from_dataset appears to return a list object, but not_contains_all expects self.value[target] to be a series object in order to apply unique().
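The failure mode can be reproduced outside the engine. The following is a minimal sketch in plain Python, not the engine's code: it assumes that unique() relies on hashing each cell, and it uses a hypothetical helper, unique_values, as a stand-in for that step. When each cell holds a scalar the de-duplication works; when each cell holds a whole list it raises the same message shown in the error below.

```python
def unique_values(values):
    """Order-preserving, hash-based de-duplication (hypothetical stand-in for unique())."""
    return list(dict.fromkeys(values))


# Illustrative column names only; not taken from the test data.
column_order = ["STUDYID", "DOMAIN", "USUBJID"]

# Case 1: one scalar per cell, which hashes without trouble.
scalar_cells = column_order
scalar_result = unique_values(scalar_cells)

# Case 2 (assumed shape of the operation result): the whole list in every cell.
list_cells = [column_order, column_order]
try:
    unique_values(list_cells)
    list_error = None
except TypeError as exc:
    # Lists are mutable and therefore unhashable.
    list_error = str(exc)

print(scalar_result)
print(list_error)
```

If this assumption about the cell shape holds, a fix would need to either flatten the list before de-duplicating or compare against it directly.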

Error returned from Rule Engine

{
  "error": "An unknown exception has occurred",
  "message": "unhashable type: 'list'"
}

Expected behavior: No errors. The rule should return no issues when all required variables exist, and return an issue when any required variable is missing.
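As a rough illustration of that expectation, here is a short Python sketch. The function name missing_required and the variable lists are hypothetical and are not part of the rules engine; the sketch only shows the intended outcome: an empty result when nothing is missing, and the missing names otherwise.

```python
def missing_required(dataset_variables, required_variables):
    """Return the required variables that are absent from the dataset (hypothetical helper)."""
    present = set(dataset_variables)
    return [name for name in required_variables if name not in present]


# Illustrative variable lists only.
required = ["STUDYID", "DOMAIN", "USUBJID"]

# All required variables exist, so no issue is expected.
complete = missing_required(["STUDYID", "DOMAIN", "USUBJID", "AETERM"], required)

# Two required variables are missing, so an issue is expected.
incomplete = missing_required(["STUDYID", "AETERM"], required)

print(complete)
print(incomplete)
```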

nhaydel commented 1 year ago

@gerrycampion why not just use variable_name as the target instead, with a rule type of Variable Metadata Check?

i.e.:

Check:
  all:
    - name: variable_name
      operator: not_contains_all
      value:
        - $required_variables

I get that we may want to fix this, but is the rule actually blocked by this bug?

gerrycampion commented 1 year ago

No, that should work, thanks. I updated the issue title.

ab3263266 commented 10 months ago

Can I ask whether variable_name gives the list of all the variables in the dataset? If so, that is very unclear.

gerrycampion commented 9 months ago

Can I ask whether variable_name gives the list of all the variables in the dataset? If so, that is very unclear.

variable_name is one of the metadata variables available when using the Variable Metadata Check rule type.

The Variable Metadata Check rule type produces a data frame containing one record for each variable in the current dataset. At the record level, variable_name gives the name of the variable that the current record describes. At the dataset level, the variable_name column therefore contains the names of all variables in the dataset, so you are correct on that point.
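A small Python sketch may make the two levels easier to see. It is an illustration only: the metadata frame is modelled as a list of dictionaries with a single variable_name field, and the column names are made up. The real frame is built by the engine and carries more metadata than this.

```python
# Illustrative columns of the dataset being checked.
dataset_columns = ["STUDYID", "DOMAIN", "USUBJID"]

# One metadata record per variable in the dataset (simplified model).
variable_metadata = [{"variable_name": name} for name in dataset_columns]

# Record level: variable_name is the single variable this record describes.
record_level = variable_metadata[1]["variable_name"]

# Dataset level: the variable_name column lists every variable in the dataset.
dataset_level = [record["variable_name"] for record in variable_metadata]

print(record_level)
print(dataset_level)
```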

@ab3263266