Repository for the code required to run the Social Housing analysis end-to-end.
Note the coding style of this test case has been critiqued. You can read the blog here.
This repository contains all the code required to run the analysis accompanying the Social Housing Report, published by the Social Investment Unit dated 02 June 2017.
Crown copyright ©. This copyright work is licensed under the Creative Commons Attribution 4.0 International licence. In essence, you are free to copy, distribute and adapt the work, as long as you attribute the work to the New Zealand Government and abide by the other licence terms.
To view a copy of this licence, visit https://creativecommons.org/licenses/by-sa/4.0/.
Please note that neither the New Zealand Government emblem nor the New Zealand Government logo may be used in any way which infringes any provision of the Flags, Emblems, and Names Protection Act 1981 or would infringe such provision if the relevant use occurred within New Zealand. Attribution to the New Zealand Government should be in written form and not by reproduction of any emblem or the New Zealand Government logo.
GNU GPLv3 License
Crown copyright (c) 2017, Social Investment Agency on behalf of the New Zealand Government.
See for more details.
The social_housing repository requires access to the Integrated Data Infrastructure (IDI). Within the IDI, you need a project folder to store the code, and a project schema in IDI_Sandpit to store the data produced by the social housing analysis.
You need access to the following IDI_Clean schemas-
The Social Investment Analytical Layer (SIAL) should be available in the project schema. Refer to the social_investment_analytical_layer repository for instructions on how to install it. This version of the social housing repository is compatible with social_investment_analytical_layer version 1.1.0.
The SI Data Foundation also needs to be downloaded and available in your project folder. You do not need to run this code explicitly, but the social housing analysis uses components from it. Download the si_data_foundation repository and keep it in the project folder. This version of the social housing repository is compatible with si_data_foundation version 1.0.0.
The Social Housing Analysis code consists of SAS programs and macros (in ./sasprog/ and ./sasauto/ folders), R programs (in ./rprogs/) and SQL scripts (in ./sql/). The code execution is divided into 12 discrete chunks as listed in the sh_main.sas script. Following are the steps required to execute the code end-to-end. This 12-step process has to be run in 4 parts, switching between SAS and R:
The steps to execute the code are:
The IDI refresh version must match across components: for example, if the analysis uses IDI_Clean_20161020, then the social_investment_analytical_layer should also use IDI_Clean_20161020.
Go to the sasautos folder under social_housing, and find the SAS script named sh_si_setup.sas. Open this script in SAS Enterprise Guide. This script sets up some universal parameters for the analysis.
Set si_proj_schema to the target schema name, where the output tables of the social housing analysis will be written.
Go to the sql folder under social_housing, and open source_data_query.sql and source_cost_table.sql. Replace the reference to [...]
Go to the sasprogs folder under social_housing, and find the script called sh_main.sas. This is the main script that runs the analysis end-to-end. Open this script in SAS Enterprise Guide. Notice that the main script has named sections, each of which performs a specific task.
Go to the section 1.SET UP VARIABLES AND MACROS. This is where you set up the required variables for your analysis.
Set sasdir and sasdirgen, which are the paths to the social housing analysis code and the si data foundation code respectively. There are examples given in the code comments for your reference.
Set the IDIrefresh variable. This variable tells the code which IDI refresh version to point to. By default, this value is IDI_Clean, which ensures that the analysis is done on the latest refresh version available. If you want the analysis to run on an older iteration of IDI data, supply the name of the IDI refresh version that you require. Whatever refresh version you provide here should be the same as the one you used for the social_investment_analytical_layer installation.
Execute 1.SET UP VARIABLES AND MACROS. This initialises the parameters of the analysis.
Go to 2.DEFINE THE COHORT (2005/2006 HNZ APPLICATIONS), and provide the cohort that you are interested in. The variables yearfrom and yearto are used for this purpose. For example, if you are interested in the individuals/households that applied for social housing between 2005 and 2006 (01 Jan 2005 to 31 Dec 2006), use yearfrom = 2005 and yearto = 2006.
Execute 2.DEFINE THE COHORT (2005/2006 HNZ APPLICATIONS) to create the population of interest.
Go to the rprogs folder under the project, and find the script named main_part1.R. This executes all the analysis required to create the weighted and matched treatment and control groups, which support a comparative analysis between those who received social housing and those who applied but did not receive it. Note that the weighting is only performed for the control group, as the intended analysis for this dataset is to estimate the Average Treatment Effect for the Treated (ATET).
Return to sh_main.sas, and look at step 10. This loads the output from the propensity matching model into SAS. Execute steps 10 and 11 to create the costs for both groups after the treatment application.
Go to rprogs and find the script named main_part2.R. Execute this R script to get the confidence intervals for the differences in costs between the two groups.
Final results, cost tables and plots are available in the folder ./output/. Detailed outputs for each step in the execution can be obtained from Social_Housing_Code_Notes.docx.
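As a rough illustration of the parameters described in the steps above, the sketch below shows how they might be set. It assumes they are SAS macro variables assigned with %let, which this README does not confirm. The schema name and both paths are hypothetical placeholders, and cohort_start and cohort_end are helper names introduced here only for the check at the end. Follow the examples in the code comments of sh_si_setup.sas and sh_main.sas for the exact form expected.

```sas
/* Illustrative sketch only. The use of %let and every value below are
   assumptions; follow the comments in the actual scripts. */

/* sh_si_setup.sas: target schema for the output tables (hypothetical name) */
%let si_proj_schema = your_project_schema;

/* sh_main.sas, 1.SET UP VARIABLES AND MACROS (hypothetical paths) */
%let sasdir = \\path\to\project\social_housing;
%let sasdirgen = \\path\to\project\si_data_foundation;

/* Use the same refresh as the social_investment_analytical_layer install.
   Leaving this as IDI_Clean points to the latest refresh available. */
%let IDIrefresh = IDI_Clean_20161020;

/* sh_main.sas, 2.DEFINE THE COHORT: applications from 2005 to 2006 */
%let yearfrom = 2005;
%let yearto = 2006;

/* Optional check (not part of the repository): write the cohort window
   implied by yearfrom and yearto to the SAS log. */
%let cohort_start = %sysfunc(mdy(1,1,&yearfrom.), date9.);
%let cohort_end = %sysfunc(mdy(12,31,&yearto.), date9.);
%put NOTE: Cohort window is &cohort_start. to &cohort_end.;
```

The final %put is a quick way to confirm that the year range covers 01 Jan of yearfrom to 31 Dec of yearto before running the cohort step.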
For more help/guidance in running the SIAL, email info@siu.govt.nz
Tracking number: