Closed SLornieCYC closed 1 year ago
@SLornieCYC So, england_working_days(45) returns an object that should include the information about how to subtract 45 business days from another date, using a set of holidays curated by, I think, the DOJ. It does actually do this, except, it sort of doesn't. Using an online business days calculator I get 30th Jan as the date 45 working days before the collection end, but the cin tool calculates it to be the 27th Jan, so something is clearly going wrong. If it turns out that the dates you have which fail are in that range, that'd be great news as it'll help explain what's going on.
Right, it depends if we include the date of collection end (we should!) I believe the current calculation excludes it, putting the date at the 27th, where if we include it, it becomes the 30th.
This can be fixed by changing the calculation to do 44 working days. I'll make a PR with that, request Tambe, and see if, instead of making this weird fix, we can get the fn. in utils to be inclusive instead.
@willLPD2I Actually the assessments not being flagged in my data all have start dates between November 2022 and 24 January 2023 so seemingly entirely unrelated to the collection end date fix you've identified above!
This is an example of one of the assessments as produced by the DFE XML Generator which is not flagged:
<Assessments>
<AssessmentActualStartDate>2023-01-24</AssessmentActualStartDate>
</Assessments>
@SLornieCYC Bizarre, I'll keep looking!
<Assessments>
<AssessmentActualStartDate>2023-01-24</AssessmentActualStartDate>
</Assessments>
Passes the pytest when added to the sample df, however, when I make a fake child xml and run it on that, it doesn't pass, which says that something is going wrong perhaps outside of the rule, maybe in calculating the reference period. It actually passes whenn using run-all regardless of what the reference date is set to.
If you pop a print statement in the assessments ingress then run-all on fake_CIN_data.xml, it prints empty dataframes, it looks like the ingress for assessments isn't working at all. If you run just 8670Q, it can't find the assessments table for tables affected either.
I have now determined that the assessments block doesn't build if there are no factors identified at assessment, which will always be the case in instances where an assessment hasn't been completed.
Okay, now replicated, the bug is produced because, in the ingress, the blocks are built for factor in assessment_factor, so if a block doesn't have an assessment factor... it doesn't build. This can be fixed by extending the if assessment_factors: block to include an else, which builds when they don't eixst.
Haha brilliant! Glad you managed to track it down to something that makes sense.
Looks like this one is mostly fixed - queries are now flagging in CIN validator when they weren't before. However there are a few included where start date is 27 Jan so the accompanying change to custombusinessday doesn't seem to have worked. These aren't flagged in the DFE portal export so I think Will's theory about collection end date is right at least.
Describe the bug Rule is returning no errors in CIN Validator whereas I get 74 errors in DFE portal from the same data.
Proposed fix (Optional) Not sure what exactly is causing the issue however from looking at the python I have one initial query:
pd.offsets.CustomBusinessDay(n=num_days, calendar=holiday_calendar)
but the function is being subtracted from a date here rather than being applied to a date - so what is the function actually returning?https://github.com/data-to-insight/CIN-validator/blob/dc4559d2362e0efacfe47ebb9b4b3d538c7d0094/cin_validator/rules/cin2022_23/rule_8670Q.py#L50-L51