UrbanInstitute / nccs

NCCS data platform powered by Jekyll
https://urbaninstitute.github.io/nccs/

What is the variable for Total Contributions in CORE2019 PC FULL 990? #25

Open samc022 opened 5 months ago

samc022 commented 5 months ago

Hi,

I am currently trying to access the total contribution amount for this data set. I am aware that the data dictionary hasn't yet come out, but would it be possible to let me know what variable to look at for Total Contributions? "totcntrbgfts" does not exist. Thank you!

Thiyaghessan commented 4 months ago

Hi @lecy,

Could you help me out with this? The crosswalks have TOTCNTRBGFTS mapped to F9_08_REV_CONTR_TOT and list this variable as being present in the 2019-501C3-CHARITIES-PC data set. However, when I look at the column names, TOTCNTRBGFTS is missing.

I checked for the other variables that F9_08_REV_CONTR_TOT is mapped to in the crosswalk (such as CONT and P1TCONT), and they are also absent from 2019-501C3-CHARITIES-PC. I then string-searched the column names for CONT, CNTRB, and other permutations of "contribution," and that turned up nothing.

I then checked if any variable names in this data set were unmapped in the crosswalk, and that turned up NULL, meaning we have mapped all variables in 2019-501C3-CHARITIES-PC.

I am a little stumped.
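
For anyone who wants to repeat these checks, here is a minimal sketch in R. The column names and the crosswalk layout below are made-up stand-ins (the real crosswalk may be structured differently), so treat this as an illustration of the approach rather than the exact code that was run.

```r
# Toy stand-in for names( core ); these column names are made up for illustration.
core.names <- c( "EIN", "NAME", "TOTREV", "EXPS", "ASS_EOY" )

# Case-insensitive search for any column that looks like a contributions field.
patterns <- c( "CONT", "CNTRB", "GFTS" )
hits <- grep( paste( patterns, collapse="|" ), core.names,
              ignore.case=TRUE, value=TRUE )
hits   # an empty result means no candidate columns were found

# Hypothetical crosswalk: one row per original variable name.
crosswalk <- data.frame(
  original = c( "EIN", "NAME", "TOTREV", "EXPS", "ASS_EOY", "TOTCNTRBGFTS" ),
  stringsAsFactors = FALSE )

# Columns in the data that the crosswalk does not cover.
unmapped <- setdiff( core.names, crosswalk$original )

# Variables the crosswalk expects that the data does not contain.
missing.vars <- setdiff( crosswalk$original, core.names )
```

An empty `unmapped` together with a non-empty `missing.vars` is the pattern described above: every column in the file is accounted for, but the crosswalk lists a variable that the file does not have.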

lecy commented 4 months ago

Yeah, that's strange. I don't see that it exists under another name. I checked the CORE-2019-NONPROFIT-PC file and it's not there either, which makes me suspect that it was inadvertently dropped during data processing.

It's available in the 2019 SOI extract, though, so you can always just grab it from there then merge the variable to the CORE file:

https://www.irs.gov/pub/irs-soi/19eoextract990.xlsx

https://www.irs.gov/statistics/soi-tax-stats-annual-extract-of-tax-exempt-organization-financial-data
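
A rough sketch of the merge step, using toy data frames in place of the real files. It assumes both files identify organizations with a column named EIN and that the extract stores contributions as totcntrbgfts; check the actual column names before relying on it. The values are invented.

```r
# Toy stand-ins; in practice these would be the CORE file and the SOI extract.
# The column names EIN and totcntrbgfts are assumptions about the file layouts.
core <- data.frame(
  EIN    = c( "010203040", "123456789", "987654321" ),
  TOTREV = c( 1000, 2000, 3000 ),
  stringsAsFactors = FALSE )

soi <- data.frame(
  EIN          = c( 10203040, 123456789 ),   # read as numeric, leading zero lost
  totcntrbgfts = c( 400, 1500 ) )

# Restore EINs to nine-character, zero-padded strings so the keys match.
soi$EIN <- sprintf( "%09.0f", soi$EIN )

# Left join: keep every CORE record; unmatched rows get NA for contributions.
merged <- merge( core, soi[ , c( "EIN", "totcntrbgfts" ) ],
                 by="EIN", all.x=TRUE )
merged
```

Using `all.x=TRUE` keeps organizations that appear in CORE but not in the extract, so it is easy to count how many records failed to match with `sum( is.na( merged$totcntrbgfts ) )`.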

Thiyaghessan commented 4 months ago

Thank you @lecy ! Those were my thoughts this morning, and I am glad I was thinking about it correctly.

@samc022 Currently, you'd have to use the 2019 Extracts from the links above. However, we are working on a harmonized CORE data set that contains standardized variable names and a single data dictionary. When the landing page for that data set is up I will let you know.

We are currently in the final stages of validation before release, so we are on a timeline of weeks, if that is a helpful estimate!

samc022 commented 4 months ago

Hi,

Thank you @lecy and @Thiyaghessan ! I will take a look at the IRS website and then merge it. I appreciate all of your help.

samc022 commented 4 months ago

Hi,

@lecy I apologize if this is a stupid question. When I download the extract from the IRS website, it downloads as a .dat file. Is that the proper format, or am I doing something wrong? I am unclear on how to convert it to .xlsx. Thank you!

lecy commented 4 months ago

Not a dumb question - it's an annoying format. Here is a script that shows you how to convert from .dat to .csv files:

https://github.com/lecy/fiscal-health/blob/main/01-data-raw/convert-dat-files.R

lecy commented 4 months ago

Or here is an R function if more convenient:

# example: 
# file on your computer:  "irs-990-pc-soi-extract-2016-DAT.dat"
# filename <- "irs-990-pc-soi-extract-2016-DAT" # no extension

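convert_dat_to_csv <- function( filename ) {
  fn <- paste0( filename, ".dat" )
  # read the data rows only; header=FALSE keeps the first record from
  # being consumed as column names once the real header line is skipped
  dat <- read.csv( fn, sep=" ", skip=1, header=FALSE )
  # the first line of the file holds the space-delimited variable names
  raw.names <- readLines( fn, n=1 )
  dat.names <- strsplit( raw.names, " ")[[1]]
  names( dat ) <- dat.names
  write.csv( dat, file=paste0( filename, ".csv" ), row.names=FALSE )
}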

convert_dat_to_csv( "irs-990-pc-soi-extract-2016-DAT" )
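
One thing to watch if the converted file will be merged on EIN: the function above reads EIN as a number, so any leading zeros are dropped. Below is a sketch of a variant that keeps EIN as text. The function name `convert_dat_keep_ein` is hypothetical, and it assumes the ID column in the .dat file is named "EIN"; the toy file contents are made up.

```r
# Hypothetical variant of the converter: reads the header in place and keeps
# the EIN column as text. Assumes the ID column is named "EIN".
convert_dat_keep_ein <- function( filename ) {
  fn  <- paste0( filename, ".dat" )
  dat <- read.table( fn, sep=" ", header=TRUE,
                     colClasses=c( EIN="character" ),
                     check.names=FALSE, stringsAsFactors=FALSE )
  write.csv( dat, file=paste0( filename, ".csv" ), row.names=FALSE )
  invisible( dat )   # return the data frame without printing it
}

# Toy round trip on a temporary file (values are made up).
tmp <- tempfile()
writeLines( c( "EIN tax_pd totcntrbgfts",
               "010203040 201912 400",
               "123456789 201912 1500" ),
            paste0( tmp, ".dat" ) )
d <- convert_dat_keep_ein( tmp )
d$EIN   # leading zero preserved
```

The same `colClasses` argument is likely needed when reading the .csv back in, since `read.csv` will otherwise guess that EIN is numeric again.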