data-to-insight / csc-validator-be-cin


Bug: Rule 8565 - False positives in CINEpisodes where DateOfInitialCPC is null #369

Closed SLornieCYC closed 1 year ago

SLornieCYC commented 1 year ago

Describe the bug: Some CINEpisodes records that have a CINclosureDate recorded are being incorrectly flagged against this rule for having a null DateOfInitialCPC. DateOfInitialCPC is an optional field that is used for a specific subset of cases only (i.e. when a child transfers in from another LA and was on a CPP in that previous LA).


Expected behavior: The CINEpisode should not be flagged as in error because of a null DateOfInitialCPC.

Proposed fix (optional): I think condition8 here also needs to test df["DateOfInitialCPC_CIN"].notna().

https://github.com/data-to-insight/CIN-validator/blob/dc4559d2362e0efacfe47ebb9b4b3d538c7d0094/cin_validator/rules/cin2022_23/rule_8565.py#L193
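A minimal sketch of the kind of null guard proposed above. The body of condition8 is not quoted in this thread, so the "unguarded" expression below is only an assumption about one way a null date could end up flagged; the dates are made up for illustration, and only the column names come from the issue.

```python
import pandas as pd

# Illustrative data only: three episodes closed on the same (made-up) date.
df = pd.DataFrame(
    {
        "CINclosureDate": pd.to_datetime(["2022-09-30", "2022-09-30", "2022-09-30"]),
        # Row 0: CPC before closure, row 1: CPC after closure, row 2: no CPC recorded.
        "DateOfInitialCPC_CIN": pd.to_datetime(["2022-08-01", "2022-10-15", None]),
    }
)

# ASSUMPTION: a check written as "not on or before closure" with no null test.
# Comparisons against NaT evaluate to False, so negating them flags the null row.
unguarded = ~(df["DateOfInitialCPC_CIN"] <= df["CINclosureDate"])

# Guarded version: only dates that are present can fail the check.
guarded = df["DateOfInitialCPC_CIN"].notna() & (
    df["DateOfInitialCPC_CIN"] > df["CINclosureDate"]
)

print(unguarded.tolist())
print(guarded.tolist())
```

With the guard in place, the row with no DateOfInitialCPC is no longer flagged, while the row whose date genuinely falls after closure still is.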

SLornieCYC commented 1 year ago

Possibly all of the conditions should exclude null dates as the rule guidance says "...the following dates that are present" (emphasis mine).
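As a rough illustration of applying the same "dates that are present" reading to every condition, the sketch below uses a hypothetical helper, `dates_after_closure`, which is not part of the CIN validator code base. The column names are taken from the CIN XML element names; the real rule may use different (e.g. suffixed) column names and a different structure, and the dates are invented.

```python
import pandas as pd

# Date fields to compare against the closure date (names assumed from the XML).
DATE_COLS = [
    "AssessmentActualStartDate",
    "AssessmentAuthorisationDate",
    "S47ActualStartDate",
    "DateOfInitialCPC",
]


def dates_after_closure(df, date_cols=DATE_COLS, closure_col="CINclosureDate"):
    """Hypothetical helper: one boolean column per date field, True only where
    both dates are present and the date falls after the closure date."""
    closure = df[closure_col]
    flags = pd.DataFrame(index=df.index)
    for col in date_cols:
        flags[col] = df[col].notna() & closure.notna() & (df[col] > closure)
    return flags


# One made-up episode: authorisation after closure, no DateOfInitialCPC recorded.
episode = pd.DataFrame(
    {
        "CINclosureDate": pd.to_datetime(["2022-09-15"]),
        "AssessmentActualStartDate": pd.to_datetime(["2022-07-01"]),
        "AssessmentAuthorisationDate": pd.to_datetime(["2022-09-20"]),
        "S47ActualStartDate": pd.to_datetime(["2022-07-10"]),
        "DateOfInitialCPC": pd.to_datetime([None]),
    }
)

flags = dates_after_closure(episode)
print(flags.iloc[0].to_dict())
```

Under these assumptions only AssessmentAuthorisationDate is flagged for the episode; the missing DateOfInitialCPC is ignored rather than treated as a failure.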

WillLP-code commented 1 year ago

Hi @SLornieCYC, I'm struggling to recreate this in pytest. So far I have added this test case to the sample_CIN df:

            {
                "LAchildID": "child10",
                "CINdetailsID": "cinID1",
                "CINclosureDate": "01/01/2022",
                "DateOfInitialCPC": pd.NA
                # Pass
            },

This should fail the pytest, but it doesn't. I need to replicate the failure before I write a fix, so could you anonymise the block of XML for a record that gives a false positive, so that I can include it as a test case? I'm aware you're busy with stat returns, so there is no rush!

SLornieCYC commented 1 year ago

@willLPD2I

Here is some sample data that demonstrates the issue:

<Child>
<ChildIdentifiers>
<LAchildID>CHILD</LAchildID>
</ChildIdentifiers>
<CINdetails>
<CINclosureDate>September 2022</CINclosureDate>
<Assessments>
<AssessmentActualStartDate>July 2022</AssessmentActualStartDate>
<AssessmentAuthorisationDate>September 2022</AssessmentAuthorisationDate>
</Assessments>
<Section47>
<S47ActualStartDate>July 2022</S47ActualStartDate>
<ICPCnotRequired>1</ICPCnotRequired>
</Section47>
<ReferralNFA>0</ReferralNFA>
</CINdetails>
</Child>

In the DFE error output this returns as: (flagging the AssessmentAuthorisationDate being after CINclosureDate)

image

In CIN Validator this returns as: (no, I don't know why the rows are all duplicated)

image

Note that the CIN Validator output includes DateOfInitialCPC from CINdetails but the DFE portal output doesn't. It's definitely not the DateOfInitialCPC from the S47 because as per #370 that's not actually covered by the coding for this rule right now.

And this is how the CIN Validator front-end looks:

image