Banking Disparities Task 4 -- Tabular Summaries

Tasks:

[x] Create a new Jupyter notebook called tabular-summaries.ipynb
[x] Insert cells that import the pre-processed datasets.
[x] After loading the HMDA data, create a new field called Race_Ethnicity that uses applicant_race_1 and applicant_ethnicity_1 to map applicants to the following categories: (1) White non-Hispanic, (2) White Hispanic, (3) Black, (4) Asian, and (5) Other, where 3,4,5 are each determined by race only (ignoring ethnicity)
[x] Insert cells that use pandas / aggregation operations to prepare summary tables with the following fields.

Notes

In the following, whenever loans are mentioned, note that we are talking about loan originations only. We are focusing on where the loans were actually approved.
Other fields will be added to these tables later. This is just to get us started for now.
Bold fields are the ones we should group by
No need to rename any fields for now. You can pull them as-is from the tables.

Table: Branch-level details Data sources: FDIC locations Fields:

Table: Census tract-level summary Data sources: FFIEC Census, HMDA, SBA, joined by census tract ID Possible join logic:

[HMDA left joined on FFIEC Census], and [SBA left joined on FFIEC Census/HMDA join]
Create intermediate tables that compute sums of unique identifiers within each census tract, and then left join those tables to FFIEC Census data

Note: For HMDA race/ethnicity, for now, focus on applicant_race_1 and applicant_ethnicity_1 Fields:

County name (Dallas, Collin, or Tarrant)
Census tract ID
Total persons (from FFIEC Census data)
Total families (from FFIEC Census data)
Total households (from FFIEC Census data)
Census tract income level -- low, moderate, or high? For now, just use the numeric code. We can infer the low/moderate/high value later.
How many mortgage loans offered in this census tract? (count unique HMDAlei)
How many small business loans offered in this census tract? (count unique SBA Borrower name <-- the version with the index prefix)
Percent mortgage loan originations that went to White non-Hispanic applicants <-- use the new Race_Ethnicity field created above
Percent mortgage loan originations that went to White Hispanic applicants
Percent mortgage loan originations that went to Black applicants
Percent mortgage loan originations that went to Asian applicants
Percent mortgage loan originations that went to Other Race / Ethnicity applicants

Table: County-level summary [IGNORE FOR NOW] Data sources: FDIC Census, HMDA, SBA, joined by census tract ID Fields:

KorelasiDataInsights / southern_dallas_progress