Closed pbchase closed 1 year ago
FYI, the org data appears to be out-of-date. @pbchase is working with @senrabc to get this data source to refresh faster. I say this not to discourage you from doing this, but to say press on regardless. Don't stop coding to tell me the data is bad 'cause I know it is bad. We will fix it.
There is some concern about whether we are reading the correct tables. See https://wiki.ctsi.ufl.edu/books/datavivoufledu/page/overview to know where the good data lives.
In summary, https://wiki.ctsi.ufl.edu/books/datavivoufledu/page/overview says "use VIVO_DB_NAME=researcher_index_mdm". It should also say "and don't use WH_PRIMARY_DEPTID. Instead use staff_departments"
Addressed by PR #177
Write the ETL write_uf_fiscal_orgs_to_person_org.R to populate a new rcc.billing table, person_org.
person_org design
These columns are typically non-blank in the UF person data:
Use these columns in person_org:
Cohort of interest
We will need to fetch and store that data for every redcap user and every project PI email address that does not have a matching email address amongst the primary email addresses in redcap_user_information.
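A rough sketch of how the unmatched PI email addresses might be picked out, in base R. The toy data frames stand in for the real tables, the column names user_email and project_pi_email are assumptions about the REDCap schema, and normalize_email is a hypothetical helper, so check all of them against the actual database before relying on this.

```r
# Toy stand-ins for the real tables; column names are assumptions.
redcap_user_information <- data.frame(
  username = c("alice", "bob"),
  user_email = c("alice@example.org", "Bob@example.org"),
  stringsAsFactors = FALSE
)
redcap_projects <- data.frame(
  project_id = 1:3,
  project_pi_email = c("bob@example.org", "carol@example.org", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: compare emails case-insensitively, ignoring stray spaces.
normalize_email <- function(x) tolower(trimws(x))

user_emails <- unique(normalize_email(redcap_user_information$user_email))
pi_emails <- unique(normalize_email(redcap_projects$project_pi_email))

# Drop missing and blank PI emails before comparing.
pi_emails <- pi_emails[!is.na(pi_emails) & pi_emails != ""]

# PI emails with no match among the users' primary email addresses.
unmatched_pi_emails <- setdiff(pi_emails, user_emails)
```

The cohort would then be every row of redcap_user_information plus the addresses in unmatched_pi_emails.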
Adding UF fiscal org data
Get org data from the staff_departments table in data.vivo like this:
staff_departments returns results very quickly. It might allow all 10k-ish UFIDs in one query using UFIDs from the previous query.
Adding the Department-level ID
Sandra wants to see the department-level ID in the data we send her. While we should store primary_uf_fiscal_org, we should also walk up the hierarchy to get the dept-level ID. I believe that is the 2nd-level ID, e.g. 29680240 becomes 29680000. It is unclear whether we can simply replace the last four digits with zeros and use that. Whatever course you choose, add this additional dept_id onto the person record as primary_uf_fiscal_org_2nd_level.
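If the zero-substitution shortcut turns out to be valid, it could look something like the base R sketch below. The function name to_second_level_org, the 8-digit check, and the sample rows are all hypothetical, and the rule itself is the unverified assumption discussed above. Walking the real hierarchy may still be needed.

```r
# Hypothetical helper: derive the 2nd-level (department-level) ID by
# replacing the last four digits of an 8-digit fiscal org ID with zeros.
# ASSUMPTION: this shortcut matches the real org hierarchy; not yet verified.
to_second_level_org <- function(org_id) {
  org_id <- as.character(org_id)
  ifelse(
    !is.na(org_id) & grepl("^[0-9]{8}$", org_id),
    paste0(substr(org_id, 1, 4), "0000"),
    NA_character_  # anything that is not an 8-digit ID is left missing
  )
}

# Made-up rows standing in for person_org.
person_org <- data.frame(
  ufid = c("11111111", "22222222"),
  primary_uf_fiscal_org = c("29680240", "29680000"),
  stringsAsFactors = FALSE
)

person_org$primary_uf_fiscal_org_2nd_level <-
  to_second_level_org(person_org$primary_uf_fiscal_org)
```

Returning NA for malformed IDs, rather than guessing, makes it easy to spot records that need a real hierarchy lookup.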
Frequency
Run this ETL weekly.