ccao-data / data-architecture

Codebase for CCAO data infrastructure construction and management
https://ccao-data.github.io/data-architecture/
5 stars 3 forks source link

Add test to ensure omitted classes have `tot57` values for the same tax year #363

Open jeancochrane opened 3 months ago

jeancochrane commented 3 months ago

This is our current understanding of the new omitted class codes:

So from what I understand, the OA2, OA3, and OA5 codes should only exist in the various detail tables (COMDAT, DWELDAT, etc.). If they exist they can basically have any class code in pardat, but should match EX, 100, or their corresponding major class most of the time. Further, if something has an OA class code, it should have a corresponding asmt.tot57 value in the same tax year, so we should add a test for that.

Let's implement that a test specifying that rows in detail tables with OA class codes need to have a corresponding tot57 value in the same year.

jeancochrane commented 3 months ago

@dfsnow After some investigation by @Damonamajor, I think that we may be operating on an incorrect assumption about the data here. Try these two queries to confirm that tot57 is consistently 0 when dweldat and comdat have omitted classes:

select dweldat.class, dweldat.taxyr, asmt.procname, asmt.tot57
from iasworld.dweldat dweldat
left join iasworld.asmt_all asmt
  on dweldat.parid = asmt.parid
  and dweldat.taxyr = asmt.taxyr
  and asmt.deactivat is null
  and asmt.valclass is null
  and asmt.procname in ('CCAOVALUE', 'CCAOFINAL', 'BORVALUE')
  and asmt.rolltype != 'RR'
where dweldat.cur = 'Y'
  and dweldat.deactivat is null
  and dweldat.class like 'OA%'
select comdat.class, comdat.taxyr, asmt.procname, asmt.tot57
from iasworld.comdat comdat
left join iasworld.asmt_all asmt
  on comdat.parid = asmt.parid
  and comdat.taxyr = asmt.taxyr
  and asmt.deactivat is null
  and asmt.valclass is null
  and asmt.procname in ('CCAOVALUE', 'CCAOFINAL', 'BORVALUE')
  and asmt.rolltype != 'RR'
where comdat.cur = 'Y'
  and comdat.deactivat is null
  and comdat.class like 'OA%'

Is something wrong with these queries, or is tot57 not actually consistently 0 when the dweldat or comdat class is omitted?

dfsnow commented 3 months ago

@jeancochrane I don't think the typical filtering rules apply here unfortunately. The OA2 record for the first record returned by your first query has both a non-null valclass (equal to OA2) and a null procname.

jeancochrane commented 3 months ago

OK, I think we can do this, but we'll see incorrect failures until we can update our sqoop job to import past tax years with OA classes. Once that's done, the correct query to find the asmt_all record with a nonzero value for tot57 should look something like this (with dweldat as an example detail table):

select dweldat.parid, dweldat.taxyr, dweldat.class, asmt.valclass, asmt.cur, asmt.tot57
from iasworld.comdat dweldat
left join iasworld.asmt_all asmt
  on dweldat.parid = asmt.parid
  and dweldat.taxyr = asmt.taxyr
  and asmt.valclass like 'OA%'
  and asmt.cur = 'Y'
where dweldat.cur = 'Y'
  and dweldat.deactivat is null
  and dweldat.class like 'OA%'

Since this doesn't neatly map to any of our generic tests due to the complex join, I think we'll probably want to make a view in the qc schema that joins all of the detail tables (dweldat, comdat, etc.) to asmt_all with the filters listed above, and then add two tests on tot57:

  1. accepted_range with min_value: 0 and inclusive: false
  2. not_null
dfsnow commented 2 months ago

@jeancochrane @Damonamajor What's going on with this issue? Looks like the blocker reason is Awaiting updated sqoop job, but sqoop should run each night.

jeancochrane commented 2 months ago

@dfsnow If I recall correctly, the issue was that we weren't sqooping historical data that contained new omitted class codes, so the test was failing incorrectly due to our data warehouse mirror being out of date. Not sure if we've found a way around that problem yet, but I'll defer to your judgment.

dfsnow commented 2 months ago

Got it, in that case, we're blocked by https://github.com/ccao-data/service-sqoop-iasworld/issues/14.