nz-social-investment-agency / social_investment_analytical_layer

Standardised tables for doing cross agency work in the IDI - NOTE: Currently being updated due to IDI SAS Grid and Windows 10 migration
GNU General Public License v3.0
12 stars 5 forks source link
analytical idi investment layer sial siu social unit

Social Investment Analytical Layer

GitHub version

Standardised tables for doing cross agency work in the IDI. Refer to CHANGELOG.md for summary of changes.

Overview:

The SIU has created a Social Investment Analytical Layer (SIAL): events-structured tables that arrange a selected subset of the data held in Statistics New Zealand's IDI into a consistent format, making it easier and faster for authorised IDI users (researchers and analysts) to use and understand.

The SIAL is designed to be reusable tool for the purposes of easy creation of variables for analysis and to add extra cost information that is not inherently available in the IDI. It reformats the commonly used social sector tables available in the IDI into tables based on events (events-structured). An event is defined by a start date and an end date during which the individual was engaged in a particular interaction as recorded by an agency (governmental or NGO). The event can be described as being of a particular event-type (like a particular benefit type or a hospitalisation purchase unit code), and the SIAL also attempts to attach a cost for this event. When such costs are not available directly from the IDI, the SIAL augments the IDI data with costing information obtained from specific agencies. For most of the commonly used tables available in the IDI, there is a corresponding events-structured table under the SIAL.

For example, a hospitalisation spell for an individual in SIAL event terms will have the snz_uid, start and end dates of the hospitalisation, the type of event described by the purchase unit code, the cost of that event and the agency and sub-agency information that the individual interacted with. Another example would be a school enrollment, again defined by a start and end date of enrollment, along with the cost of enrollment. If costs are available from the IDI tables, these are directly used. In the above examples, the costs are not readily available in the IDI, but the SIAL has additional cost-related info obtained from the respective govt. agencies that enables putting a price to each of those events. Wherever the costs cannot be quantified, the cost columns are left NULL, or are completely absent. Also, events of an instantaneous nature (for example- purchasing a pharmaceutical drug) will have the same start and end dates.

Folder Descriptions:

SIAL_dependencies: this folder contains the main_sial.sas script that builds all tables/views. It also contains cost data that isn't available in the IDI along with code to load these files in to SQL.
SIAL_bqa_complete: this folder contains scripts that have been reviewed by the corresponding agencies.
SIAL_bqa_incomplete: this folder contains scripts that are currently being reviewed by the corresponding agencies.
SIAL_log: This folder is used to store the output logs that SAS generates. Since SAS does not have easy to use error handling the logs are written to file and scanned for errors.
SIAL_docs: This folder contains the data dictionary for the SIAL and an overview of the datasets timeline.

Pre-requisites

Running the SIAL requires read privileges for the following schemas in the IDI_Clean (for any archive version)-

If you do not have access to any of the above schemas, then the SIAL installation will still create the tables/views pertaining to that schema but will return a “SELECT PERMISSION not allowed” error when trying to utilise the view. It will still create the components that for the schemas that you have access to. At the end of installation, the tool creates a log itemising which components were successfully created, and which ones failed, or have been created but with SELECT Permission denied. It also gives you the reason for failure. To get access to the schemas listed above, please contact Statistics New Zealand.

Installation:

  1. Download the zip file containing the SIAL scripts from github.

  2. Email the zipped file to access2microdata@stats.govt.nz and ask them to move it into your project folder.

  3. Unzip the files into your project.

  4. Within the unzipped folder, navigate to social_investment_analytical_layer\SIAL_dependencies folder. From this folder, open main_sial_sas_grid.sas in SAS EG

  5. Go to the main_sial_sas_grid.sas script- At the top of the script there are three macro variables called idi_refresh, targetschema & sial_code_path

  6. Change the targetschema to the location you wish to write your files to e.g. the SIU schema is DL-MAA2016-15

  7. Change the sial_code_path to the location where you stored the scripts on the network. E.g. for the SIU our location would be \wprdfs08\MAA2016-15 Supporting the Social Investment Unit\social_investment_analytical_layer

  8. Change idi_refresh to the required version of IDI_Clean database, e.g. IDI_CLEAN_20190420. 9. Run the main_sial_sas_grid.sas script.

The scripts runs in the following fashion-

Note that each time you run main_sial.sas it will uninstall all the SIAL tables you currently have before creating the SIAL tables. Once the script finishes running, the WORK.sialexecresults table will give you the errors for each SIAL component from the execution if there are any.

Most of the individual sas programs are simply creating SIAL views and so they run in a matter of seconds. However, the two SIAL tables that are still created as tables SIAL_MSD_T1_events and SIAL_MOE_school_events take a significantly longer time to run. End to end, main_sial.sas should run in around 3 hours.

Output:

The following tables and views will be created in the target schema that you specified under IDI_Sandpit-

SIAL Views (can be found in SQL Management Studio under IDI_UserCode -> target schema -> Views)-

SIAL tables (can be found in SQL Management Studio under IDI_Sandpit -> target schema -> Tables)-

SIAL Supporting tables(can be found in SQL Management Studio under IDI_Sandpit -> target schema -> Tables)-

Uninstallation:

  1. Open uninstall_sial.sas under social_investment_analytical_layer\SIAL_dependencies folder in SAS EG
  2. Go to the uninstall_sial.sas script- At the top of the script there are 2 macro variables called targetschema & sial_code_path
  3. Change the targetschema to the project schema from which the SIAL components are to be removed (for example, the SIU schema is DL-MAA2016-15).
  4. Change the sial_code_path to the location where you stored the scripts on the network
  5. Run the uninstall.sas script.

Known Issues:

  1. IDI_Clean.moh_clean.mortality_registrations table structure has changed since the latest IDI refresh. The latest SIAL script has been modified to accommodate for this change. What this means is that if the SIAL is repointed to IDI_Clean_20161020, then the script 'MIX_mortality_events.sql' is bound to fail. You will need to obtain the earlier version of this SIAL script from Github and replace the new script with the older version.

  2. If the SIAL is to be run pointing to the IDI_Clean_20160715 refresh version, please note that you will have failures for 'MOH_nnpac_events.sql', 'MIX_mortality_events.sql' and 'CYF_client_events.sql'. The underlying IDI tables have changed the strucuture since this refresh, and the latest SIAL tables have also changed to reflect these underlying table-level changes. To circumvent these errors, use an older version of these scripts from Github or make the required changes to these 3 scripts manually by renaming the appropriate columns to reflect what is available in this IDI refresh version. For more guidance/help, contact us at info@siu.govt.nz.

Getting Help:

For more help/guidance in running the SIAL, email info@sia.govt.nz

Tracking number: SIU-2017-0139