jnanaswaroop / migration_final

Enlightening healthcare since 2012
0 stars 0 forks source link

Ph 21419: Improve labeling accuracy for local contracts - [merged] #35

Closed jnanaswaroop closed 2 weeks ago

jnanaswaroop commented 1 year ago

In GitLab by @onmay16 on May 10, 2023, 00:46

Merges ph-21419 -> master

Ticket: https://procured.myjetbrains.com/youtrack/issue/ph-21419/CCX-Improve-labeling-accuracy-for-local-contracts

Description & Changes

There are two changes for cleansing contract source type properly from raw_ccx to contracts table.

  1. New mapping pattern -- cleansing_contract_source_type_rule table got updated based on A/C.
  2. Check proper columns -- we've been checking source_contract_type from raw_ccx table that has values 'GPO' and 'Local' only which seems inappropriate to map the right label.
    you can double check this by running (this will run a while though) :
    select source_eid, source_contract_type, source_type from data_prep.raw_ccx group by source_eid, source_contract_type, source_type;

    This is fixed by updating checks from new_table.source_contract_type = clean.pattern to (new_table.source_type = clean.pattern OR new_table.source_eid = clean.pattern) in source.sql script.
    I also checked there is no overlapped value between source_type and source_eid.

I updated cleansing source contract type test to check if the changes work fine.

jnanaswaroop commented 1 year ago

In GitLab by @onmay16 on May 10, 2023, 00:46

requested review from @amehjabeen, @nkessler, and @mjaskiewicz1

jnanaswaroop commented 1 year ago

In GitLab by @mjaskiewicz1 on May 10, 2023, 01:01

approved this merge request

jnanaswaroop commented 1 year ago

In GitLab by @nkessler on May 11, 2023, 01:11

Commented on sql/functions/source.sql line 31

We're getting contract type values in the source EID? I'm unsure what's really going on here

jnanaswaroop commented 1 year ago

In GitLab by @onmay16 on May 11, 2023, 01:25

Yep so this is the pattern how we cleanse contract type from raw_ccx table to contracts table

data = {
        'GPO': ['HPG', 'Premier', 'Vizient', 'MedBuy', 'HealthPro', 'The Resource Group'],
        'Local': ['Premier Custom', 'Local (Vizient)', 'GPO Enhanced', 'GPO Uploaded', 'Government', 'Local', 'Local - Courtesy', 'Local - List price', 'Local - Regional', '3SO', 'EBO'],
    }

And some of the values sit under source_type and some of them are from the source_eid column. These are examples here.

organization source_contract_type source_eid source_type cleansed to
Affiliated Healthcare Systems GPO 671 EBO Local
Affiliated Healthcare Systems GPO HPG GPO GPO
Baptist Health GPO Local (Vizient) GPO Local
Children’s Hospital of Baton Rouge Local 438 Local - List Price Local

'EBO', 'HPG', 'Local (Vizient)', and 'Local - List Price' are used to map these types into either 'Local' or 'GPO'. Another reason we're not using the source_contract_type column is because it's not reliable. You can see that source_contract_type value and cleanse to column have discrepancy in the first case and third case.

jnanaswaroop commented 1 year ago

In GitLab by @nkessler on May 12, 2023, 22:28

approved this merge request

jnanaswaroop commented 1 year ago

In GitLab by @onmay16 on May 12, 2023, 23:35

added 1 commit

Compare with previous version

jnanaswaroop commented 1 year ago

In GitLab by @nkessler on May 15, 2023, 21:37

approved this merge request

jnanaswaroop commented 1 year ago

In GitLab by @onmay16 on May 16, 2023, 00:35

resolved all threads

jnanaswaroop commented 1 year ago

In GitLab by @mjaskiewicz1 on May 16, 2023, 00:53

mentioned in commit 6e9ae7ef9046275ade1ee7f918788119a1c18ffc