Closed ASL-rmarshall closed 1 week ago
@ASL-rmarshall I added the changes to the empty operator and updated the test for empty() and sent this ticket to review. There is no logic in Engine that I am aware of where we use {None} as a meaningful placeholder and would want to consider it as anything but empty, but there may be an edge case outside my domain knowledge that others might consider.
The update resolved the issue of {None}
not being recognized as empty - thanks. As mentioned above, the original intent for the rule was to rely on the updates for #706 to allow the results of the distinct
operation to be merged directly into the ScheduleTimeline
dataset. I tried to work around this by merging to specific records in the ScheduledActivityInstance
/ScheduledDecisionInstance
datasets, but ran into another issue related to #705 (a NaN
resulted from merging grouped results to a group without values) and the NaN
was not recognized as being empty by the empty
operator. Implementing the changes for #705 should unblock this rule. However, it would be worth considering updating the empty
operator to recognize NaN
(np.nan
) as being empty as a more generic and robust fix:
results = np.where(self.value[target].isin(["", None, {None}, np.nan]), True, False)
^^^^^^^^
Links to related JIRA Tickets
Rule Information
Describe the problem Background: The intention is to report a single error for each ScheduleTimeline (i.e., each record from the ScheduleTimeline dataset) where the
timelineExitId
column does not contain a value for any of the records in the ScheduledActivityInstance dataset that represent "children" of the ScheduleTimeline (i.e., where ScheduledActivityInstance.parent_id = ScheduleTimeline.id). Thedistinct
operation can be used to generate a unique list ofScheduledActivityInstance.timelineExitId
values grouped byparent_id
, and the new functionality proposed for #706 allows theparent_id
group key to be renamed toid
so that the resultant lists can be merged onto the ScheduleTimeline dataset.Problem: The problem occurs when trying to test if the result of the
distinct
operation contains no values. If there is no value in thetimelineExitId
column for any of the records in a group, thedistinct
operation returns a set containingNone
(i.e.,{None}
) and theempty
operator does not recognize this as being empty (because it only looks for an empty string orNone
).Describe the solution There are several ways in which this could be solved:
empty
operator to consider{None}
to be empty:is_empty_set
operator.Proposed rule logic (based on updating the
empty
operator). See DDF00037 rule in the dev environment.Note that for a full implementation that would work for converted USDM JSON data (i.e., running the rules via the CLI), additional changes are needed so that
domain: SCHEDULEDACTIVITYINSTANCE
in theOperations
section can be replaced with a correctly interpreted reference to the ScheduledActivityInstance "dataset". This is (or will be) addressed under a separate issue.Test data files