Run the project by navigating to the metoo GitHub folder (using cd) and typing ./run_project.sh into the command line.
The following variables are cleaned separately for each state.
juris takes the following categories
basis takes the following categories
basis is defined using the finer-grained basis_raw variable.
sh == 1 if basis is Sexual Harassment or (less commonly) issue is Sexual Harassment.
sh == 0 otherwise.
sh == . if sex_cases == 0 & sh == 1, that is, a sexual harassment case filed on the basis of race, age, etc. rather than sex. These cases are anomalous, so we exclude them altogether.
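The sh rules above can be sketched in a few lines. This is only an illustration in Python, not the project's cleaning code; the function name code_sh is hypothetical, and None stands in for the missing value (.).

```python
from typing import Optional

SH_LABEL = "Sexual Harassment"  # label text taken from the rules above


def code_sh(basis: str, issue: str, sex_cases: int) -> Optional[int]:
    """Return 1, 0, or None (None stands in for the missing value `.`)."""
    sh = 1 if basis == SH_LABEL or issue == SH_LABEL else 0
    # A harassment case that was not filed on the basis of sex is set to missing.
    if sh == 1 and sex_cases == 0:
        return None
    return sh
```

Under these rules the missing case arises when issue is Sexual Harassment but basis names something other than sex, for example code_sh("Race", "Sexual Harassment", 0).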
sex_cases == 1 includes all cases where basis contains the word Sex.
sex_cases == 0 otherwise.
This is determined using regexm, which searches for string matches to "Sex". Because sex_cases == 1 can include discrimination against men, this variable can be used to understand trends in sex-based discrimination cases generally, but it should not be used to understand trends in discrimination against women.
relief > 0 is the total compensation the plaintiff received from their case, conditional on winning. If the plaintiff received money at both the hearing and court stages, relief is the sum of these.
relief == 0 never.
relief == . when the plaintiff lost or when we don't have information on relief.
We should not have relief == 0 if the plaintiff lost. If the plaintiff lost, relief == . and missing_relief == 1. If settle == 1, relief is most often missing, unless the case was resolved by conciliation, in which case relief may be provided.
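The relief rules lend themselves to a quick consistency check. The sketch below is an assumption-laden illustration, not the project's code: the names total_relief, relief_problems, hearing_amount, and court_amount are hypothetical, "the plaintiff lost" is assumed to correspond to win == 0, and None stands in for the missing value.

```python
from typing import List, Optional


def total_relief(hearing_amount: Optional[float],
                 court_amount: Optional[float]) -> Optional[float]:
    """Sum positive amounts across stages; return None rather than 0."""
    amounts = [a for a in (hearing_amount, court_amount)
               if a is not None and a > 0]
    return sum(amounts) if amounts else None


def relief_problems(relief: Optional[float], win: Optional[int],
                    missing_relief: int) -> List[str]:
    """List the ways one record violates the relief rules."""
    problems = []
    if relief is not None and relief <= 0:
        problems.append("relief must be positive or missing, never 0")
    # Assumption: "the plaintiff lost" is read as win == 0.
    if win == 0 and relief is not None:
        problems.append("plaintiff lost but relief is not missing")
    if win == 0 and missing_relief != 1:
        problems.append("plaintiff lost but missing_relief != 1")
    return problems
```

An empty list from relief_problems means the record is consistent with the rules as stated.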
win == 1 if outcome explicitly says discrimination was found at hearing (this may be called probable cause), or if the case went to court and the plaintiff won the case.
win == 0 if outcome explicitly says discrimination was not found (no probable cause).
win == . otherwise, e.g., if the case was settled or dismissed.
If the case went to court and the plaintiff won compensation, we do not always set win == 1, because receiving money does not necessarily mean discrimination was found.
settle == 1 if outcome says the case was settled, the case was withdrawn with benefits, or the case was resolved by conciliation (an administrative process by which two parties resolve their dispute without a hearing or court; we can often observe the relief for these cases).
settle == 0 otherwise.
settle == . never.
court == 1 if the data are court data, if outcome says "Notice of Right to Sue" was issued, or if outcome says the case went to court.
court == 0 otherwise.
court == . never.
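A minimal sketch of how win, settle, and court could be derived from a free-text outcome field follows. It is only an illustration: the actual outcome wording differs by state, the phrases matched here are the ones quoted in the rules above, and the function name code_outcome and the is_court_data flag are hypothetical. Court-stage wins are not handled, so win only reflects hearing-stage wording.

```python
from typing import Dict, Optional


def code_outcome(outcome: str,
                 is_court_data: bool = False) -> Dict[str, Optional[int]]:
    """Map an outcome string to win/settle/court; None means missing."""
    text = outcome.lower()
    # Check "no probable cause" first, since it contains "probable cause".
    no_cause = "no probable cause" in text
    cause = "probable cause" in text and not no_cause
    win = 1 if cause else 0 if no_cause else None
    settle_phrases = ("settled", "withdrawn with benefits", "conciliation")
    settle = int(any(p in text for p in settle_phrases))
    court_phrases = ("notice of right to sue", "court")
    court = int(is_court_data or any(p in text for p in court_phrases))
    return {"win": win, "settle": settle, "court": court}
```

Note that settle and court default to 0 and are never missing, while win is missing unless the outcome explicitly states a finding.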
The following variables are cleaned together, after the state and federal data are appended.
overlap == 1 if sh == 1 and the case was filed before MeToo and resolved after MeToo.
overlap == 0 if sh == 1 and the case was filed before MeToo and resolved before MeToo.
overlap == . if sh == 0.
overlap == . if the case was filed after MeToo.
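The overlap rules can be sketched as below. The cutoff date is an assumption (October 15, 2017 is used as a placeholder, since the date is not stated here), as is treating a date exactly on the cutoff as "after". The names METOO_DATE and code_overlap are hypothetical, and None stands in for the missing value.

```python
from datetime import date
from typing import Optional

METOO_DATE = date(2017, 10, 15)  # assumption: placeholder cutoff date


def code_overlap(sh: Optional[int], file_date: date, resolve_date: date,
                 cutoff: date = METOO_DATE) -> Optional[int]:
    """Return 1, 0, or None following the overlap rules."""
    # Missing unless the case is sexual harassment and filed before the cutoff.
    if sh != 1 or file_date >= cutoff:
        return None
    # Filed before the cutoff: 1 if resolved on or after it, else 0.
    return 1 if resolve_date >= cutoff else 0
```

For example, a sexual harassment case filed in January 2017 and resolved in January 2018 gets overlap 1 under this placeholder cutoff.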
post == 1 if the file date is after MeToo.
post == 0 if the file date is before MeToo.
post == . never.
treat = post*sh
treat = . if sex_cases == 1 & sh == 0, since we don't want the control group to include potentially treated sex cases that are not sexual harassment.
treat = 1 if overlap == 1, since overlap cases are sh and treated, but the definition of post doesn't capture them.
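The order of the three treat rules matters, so a short sketch may help. It is an illustration only: the cutoff date is a placeholder assumption, a file date exactly on the cutoff is assumed to count as "after", the names METOO_DATE, code_post, and code_treat are hypothetical, and None stands in for the missing value.

```python
from datetime import date
from typing import Optional

METOO_DATE = date(2017, 10, 15)  # assumption: placeholder cutoff date


def code_post(file_date: date, cutoff: date = METOO_DATE) -> int:
    """1 if filed on or after the cutoff, else 0; never missing."""
    return 1 if file_date >= cutoff else 0


def code_treat(post: int, sh: Optional[int], sex_cases: int,
               overlap: Optional[int]) -> Optional[int]:
    """Apply the three treat rules in the order they are listed."""
    # Rule 1: treat = post*sh (missing sh propagates to missing treat).
    treat = None if sh is None else post * sh
    # Rule 2: drop sex cases that are not sexual harassment from the controls.
    if sex_cases == 1 and sh == 0:
        treat = None
    # Rule 3: overlap cases count as treated even though post is 0.
    if overlap == 1:
        treat = 1
    return treat
```

A sexual harassment case filed before the cutoff with overlap == 1 therefore ends up with treat 1, even though post*sh is 0.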
victim_f == 1 if the complainant/victim is female.
victim_f == 0 if the complainant/victim is male.
victim_f == . if data on the complainant's gender are missing.
$raw_data/FL/fl_raw_cases.dta
These data contain all employment and housing cases filed in Florida from June 1, 2010 to June 30, 2023.
$raw_data/DE/de_raw_cases.xlsx
These data contain all housing and public accommodations cases filed in Delaware from June 1, 2010 to June 30, 2023.
$raw_data/KY/ky_raw_cases.dta
These data contain all employment and housing cases filed in Kentucky from June 1, 2010 to June 30, 2023.
$raw_data/WI/wi_raw_cases.dta
These data contain all education, employment, housing, and public accommodation/public service cases filed in Wisconsin from June 1, 2017 to June 30, 2023.
$raw_data/AK/ak_raw_cases.csv
These data contain all employment, housing, and public accommodation/public service cases filed in Alaska from June 1, 2010 to June 1, 2023.
$raw_data/PA/PA_raw_cases_severity.csv
This file merges PA data from two sources. The first source is a spreadsheet of 21 cases from the public hearing docket (i.e., publicly available cases) of the Pennsylvania Human Relations Commission. We do not have PDFs of these cases. The second source is a set of ~200 PDFs of final orders issued by the Commission based on charges of discrimination filed between April 9, 1963 and March 22, 2022. These PDFs were manually digitized by Jacob Hirschhorn.
$raw_data/RI/ri_raw_cases.csv
These data were extracted from PDF copies of all decisions and orders issued by the Commission based on charges of employment discrimination filed between June 1, 2010 and December 31, 2022.
$raw_data/MN/mn_raw_cases.xlsx
These data contain all employment, housing, public accommodation/public service, and education cases filed in Minnesota from June 1, 2010 to June 1, 2023 (the file appears to contain cases only through 2019).
$raw_data/WA/wa_raw_cases.dta
These data contain all employment, housing, public accommodation/public service, and education cases filed in Washington from June 1, 2010 to June 1, 2023.
$raw_data/ND/nd_raw_cases.dta
These data contain all employment, housing, and public accommodation/public service cases filed in North Dakota from June 1, 2010 to June 1, 2023.
$raw_data/IL/il_raw_cases.csv
These data contain all education, employment, housing, and public accommodation/public service cases filed in Illinois from June 1, 2010 to June 1, 2023.
$raw_data/TX/tx_raw_cases.dta
These data contain all housing cases filed in Texas from June 1, 2010 to June 1, 2023.
$raw_data/HI/hi_raw_cases.xls
These data contain all employment, housing, and public accommodation/public service cases filed in Hawaii from June 1, 2010 to June 1, 2023.
N: 3790
SH: 161
CaseType: identifies whether the case is employment, housing, or public accommodations
Island: identifies on which island the case was filed
Docket: unique identifier for each case
EEOC No.: corresponding EEOC case number
ComplaintFiled: date complaint was filed
Basis: basis of discrimination alleged by complainant
AdverseAct: adverse action alleged by complainant
Closed: date complaint was closed at investigation stage
Closure Code: code for closure at investigation stage
Enf Closure: date complaint was closed at enforcement stage
Enf Closure Code: code for closure at enforcement stage
Compensation: amounts paid directly to the claimant
Cases are labeled win == 1 if the closure code is "ORDER" in the enforcement closure column with compensation to the complainant (the equivalent of the case going to a hearing and the complainant winning).
Cases are labeled win == 1 and settle == 1 if the closure code is "CA" in the investigation closure column and the closure code is "SETTLED" in the enforcement closure column.
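The two Hawaii labeling rules can be sketched as follows. This is an illustration, not the project's code: the function name code_hawaii and its argument names are hypothetical, "with compensation" is assumed to mean a positive amount, and None is returned for any case these two rules do not cover.

```python
from typing import Dict, Optional


def code_hawaii(closure_code: str, enf_closure_code: str,
                compensation: Optional[float]) -> Dict[str, Optional[int]]:
    """Apply the two Hawaii rules; None means not set by these rules."""
    win: Optional[int] = None
    settle: Optional[int] = None
    # Enforcement-stage order with compensation: treated as a hearing win.
    paid = compensation is not None and compensation > 0
    if enf_closure_code == "ORDER" and paid:
        win = 1
    # "CA" at investigation plus "SETTLED" at enforcement: win and settle.
    if closure_code == "CA" and enf_closure_code == "SETTLED":
        win = 1
        settle = 1
    return {"win": win, "settle": settle}
```

Cases with other closure codes would still need win and settle set by the general rules for those variables.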
$raw_data/MI/mi_raw_cases.xlsx
These data contain all education, employment, housing, and public accommodation/public service cases filed in Michigan from June 1, 2010 to June 1, 2023.
$raw_data/MA/ma_raw_cases.xlsx
These data contain all housing and employment discrimination cases filed in Massachusetts between XX and XX.
$clean_data/clean_eeoc.dta
This contains all court cases the EEOC filed on behalf of plaintiffs from 2010 to 2022. We requested these data through a FOIA request for all cases filed. That request was denied, and only cases where the EEOC took the charge to court were provided. This dataset is constructed by digitizing $raw_data/EEOC/DATA - 2010-2022 Resolutions as of 08.25.23.pdf using Python. The resulting .csv is called $raw_data/EEOC/cases.csv.
$raw_data/EEOC/filed_11_17.txt
This contains all employment discrimination cases filed with the EEOC for fiscal years 2011 to 2017. N = 3,443,510. These data encompass charges filed with the EEOC and charges filed with state and local fair employment practices agencies alleging violations of federal anti-discrimination laws. Some cases that went to court include their court information.