Nonprofit-Open-Data-Collective / nonprofit-open-data-collective.github.io

The location of the GitHub Pages website for the Nonprofit Open Data Collective: www.npdata.info

Picking up Social equity non-profit org that fit with specific purpose. #10

Open mbamanie opened 1 year ago

mbamanie commented 1 year ago

Dear,

I am doing research with a group, and we were wondering whether we could use your data or approach to pick out specific organizations that fit our classification for social equity.

For example, we want a machine-learning model to pick out non-profits that fit categories such as "affordable housing", "housing for the elderly", "public transportation for low-income residents", or "programs for preschool education". Will the data and your approach help us identify non-profit organizations that fit this classification?

Thank you in advance for your help. Mohamad

lecy commented 1 year ago

Hi Mohamad -

There are a couple of different approaches. The most straightforward is to use some combination of narrow NTEE codes to identify some of the orgs and to create regular expressions that select others using their mission statements or program service accomplishments. There is an example tutorial here (the quanteda package changed recently, so these function names are outdated, but it gives you an idea of the process):

https://watts-college.github.io/cpp-527-fall-2021/labs/lab-04-instructions.html
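To make the first approach concrete, here is a minimal Python sketch (the linked tutorial itself uses R and quanteda). Everything in it is an assumption for illustration: the field names `ntee` and `mission`, the regular expressions, and the NTEE codes chosen for each category, which should be checked against the official NTEE code list before use.

```python
import re

# Hypothetical mapping from category to narrow NTEE codes (verify before use).
NTEE_CODES = {
    "affordable housing": {"L21"},
    "housing for the elderly": {"L22"},
    "preschool education": {"B21"},
}

# Hypothetical patterns applied to mission or program text.
PATTERNS = {
    "affordable housing": re.compile(
        r"\b(?:affordable|low[- ]income)\s+housing\b", re.IGNORECASE
    ),
    "housing for the elderly": re.compile(
        r"\b(?:senior|elderly)\s+housing\b|\bhousing\s+for\s+(?:the\s+)?(?:elderly|seniors)\b",
        re.IGNORECASE,
    ),
    "preschool education": re.compile(r"\bpre-?school\b", re.IGNORECASE),
}


def classify(org):
    """Return the sorted categories an org matches by NTEE code or by text."""
    labels = set()
    # Compare only the first three characters, so "L22Z" still counts as "L22".
    code = (org.get("ntee") or "").upper()[:3]
    for label, codes in NTEE_CODES.items():
        if code in codes:
            labels.add(label)
    text = org.get("mission") or ""
    for label, pattern in PATTERNS.items():
        if pattern.search(text):
            labels.add(label)
    return sorted(labels)
```

Combining the two signals means an org is picked up either by its code or by its text, so a missing or overly broad NTEE code does not exclude it. The patterns will need tuning against real mission statements, since short regular expressions tend to miss paraphrases.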

Alternatively, you can hand-code a sample using your bespoke classification and then use it to train a machine-learning classifier. The start-up costs are higher, but you gain a lot in scale if you need to code tens of thousands of cases:

https://github.com/fjsantam/bespoke-npo-taxonomies
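As a rough illustration of the second approach, the sketch below trains a tiny multinomial naive Bayes classifier on a hand-coded sample. It is not the method used in the linked repository. The model choice, the function names `train` and `predict`, and the four made-up training sentences are all assumptions for illustration, and a real project would use far more labeled cases and most likely an established library.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return {"labels": label_counts, "words": word_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    total = sum(model["labels"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n in model["labels"].items():
        counts = model["words"][label]
        denom = sum(counts.values()) + vocab_size  # add-one smoothing
        score = math.log(n / total)
        for tok in tokenize(text):
            if tok in model["vocab"]:  # ignore words never seen in training
                score += math.log((counts[tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up hand-coded sample; real training data would be mission statements.
SAMPLE = [
    ("We develop affordable housing for low-income families", "housing"),
    ("Building and managing affordable apartments and homes", "housing"),
    ("Preschool education and early learning for young children", "preschool"),
    ("Our nursery school prepares children for kindergarten", "preschool"),
]

model = train(SAMPLE)
```

Once trained, `predict(model, some_mission_text)` can be applied to every uncoded organization, which is where the higher start-up cost of hand-coding pays off.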

more background

Does that answer your question?

Jesse

mbamanie commented 1 year ago

Dear Lucy,

Thank you so much for answering my question and providing me with an example.

Best regards, Mohamad

mbamanie commented 1 year ago

Dear Jesse,

I have another question related to the data. I examined the data from the IRS Form 1023-EZ website: https://www.irs.gov/charities-non-profits/exempt-organizations-form-1023ez-approvals I noticed that the mission column was only introduced in 2018, so where does the program data come from? If you could direct me to it, I would appreciate your help.

The second issue is that when we compared the datasets, we saw a large drop in the number of non-profit organizations: the IRS 1023-EZ 2018 data has around 55,000 organizations, while the NCCS 2015 data has around 211,000. In our research, we want to cover as many U.S. non-profit organizations as we can. Do you suggest that we drop the mission and program data, since using it will shrink our dataset and not cover all non-profit organizations, and focus only on the NCCS data?

Any recommendations or insights will be appreciated. Thank you for your help. Mohamad