Nonprofit-Open-Data-Collective / titleclassifier

An R package to assign raw nonprofit executive titles from Form 990 Part VII to a well-structured title taxonomy.
https://nonprofit-open-data-collective.github.io/titleclassifier/code-tour/package-demo.html

Museum 2018 data #5

Open XiaofeiXie62 opened 1 year ago

XiaofeiXie62 commented 1 year ago

EIN: 800870522. Org name: OAK AND IVY AFRICAN AMERICAN MUSEU. Employee names: BELINDA INGRAM & OLLIE KING. In title.v6 their job title was converted to accountant, but in title.v7 and the final standardized version the title was N.A.

lecy commented 1 year ago

If a title drops out between steps 6 and 7, that means we did not have a matching title.variant in the doc:

https://docs.google.com/spreadsheets/d/1iYEY2HYDZTV0uvu35UuwdgAUQNKXSyab260pPPutP1M/edit#gid=1464446536

It looks like it is there now - should show up once the data is re-run.
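
To catch cases like this before they surface in the final data, something along these lines could list the v6 titles that have no matching title.variant. This is only a sketch: the `lookup` data frame, the `title.standard` column name, and the sample titles are made-up stand-ins, not the real sheet or the package's actual join code.

```r
# Hypothetical stand-in for the lookup sheet; only title.variant is a name
# taken from this thread, title.standard is assumed for illustration.
lookup <- data.frame(
  title.variant  = c("EXECUTIVE DIRECTOR", "TREASURER"),
  title.standard = c("CEO", "TREASURER"),
  stringsAsFactors = FALSE
)

# Made-up step-6 titles
titles.v6 <- c("EXECUTIVE DIRECTOR", "ACCOUNTANT", "TREASURER")

# match() returns NA where a v6 title has no row in the lookup,
# which is how a title would end up as NA in step 7.
idx <- match(titles.v6, lookup$title.variant)
titles.v7 <- lookup$title.standard[idx]

# Titles that would be dropped and may need a new title.variant row
unmatched <- unique(titles.v6[is.na(idx)])
print(unmatched)
```

Here "ACCOUNTANT" is the only title reported, because it is the only one without a row in the toy lookup.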

lecy commented 1 year ago

I am re-running the data now with updated code - can share a new arts/museums dataset tomorrow.

XiaofeiXie62 commented 1 year ago

EIN: 510138441. Employee name: MARK WEIKEL. Raw title: BOARD MEMBER - UNTIL 09/2018. During standardization the date was dropped in title.v2, the dash was converted to "and" in v3, and "until" was deleted in v6. The "and" was kept in the final version, so this title wasn't standardized as board member or coded as board. I saw the same problem with other titles like this. Can we revise the code to deal with it? For example, before v6, could we drop "and until" together when they appear side by side, instead of dropping only "until"?
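
A rough sketch of what that rule could look like, assuming the title is already upper-case text at that point. `drop_and_until()` is a hypothetical helper, not a function in titleclassifier, and the extra dangling-"AND" cleanup is my own addition rather than something the current code does.

```r
# Hypothetical helper, not part of titleclassifier
drop_and_until <- function(x) {
  x <- toupper(x)
  # Remove "AND UNTIL" as a unit so no stray "AND" is left behind
  x <- gsub("\\bAND\\s+UNTIL\\b", " ", x, perl = TRUE)
  # Remove any remaining standalone "UNTIL"
  x <- gsub("\\bUNTIL\\b", " ", x, perl = TRUE)
  # Remove a dangling "AND" at the very end of the title
  x <- gsub("\\s+AND\\s*$", "", x, perl = TRUE)
  # Collapse repeated whitespace and trim the ends
  x <- gsub("\\s+", " ", x, perl = TRUE)
  trimws(x)
}

drop_and_until("BOARD MEMBER AND UNTIL")    # "BOARD MEMBER"
drop_and_until("TREASURER AND SECRETARY")   # unchanged
```

Titles with a real conjunction, such as the second call, are left alone because "AND" is only removed when it is followed by "UNTIL" or is the last word.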

XiaofeiXie62 commented 1 year ago

EIN: 131624086. Employee name: Kenneth N Weine. Raw title: VP EX AFFAIRS, CHIEF COMM OFF. I'm assuming the correct reading is VP external affairs & chief communication officer. The standardized version of the two titles (in v5) came out as VP executive affairs & chief committee officer. I don't know how we can deal with problems like this, but I want to note it here so we are aware of it.
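
One possible direction, offered only as a sketch: apply a few context-specific rules before the generic abbreviation expansion, so that "EX" followed by "AFFAIRS" and "COMM" followed by "OFF" get the reading assumed above. The `context_rules` vector and `expand_in_context()` are hypothetical names, and the two rules are guesses based on this single title, not on how the package currently works.

```r
# Hypothetical context rules: pattern (with lookahead) = replacement.
# Each abbreviation is expanded only when the following word matches.
context_rules <- c(
  "\\bEX(?=\\s+AFFAIRS\\b)"      = "EXTERNAL",
  "\\bCOMM(?=\\s+OFF(ICER)?\\b)" = "COMMUNICATION"
)

# Hypothetical helper: apply each rule in order with Perl-style regex
expand_in_context <- function(x, rules = context_rules) {
  x <- toupper(x)
  for (i in seq_along(rules)) {
    x <- gsub(names(rules)[i], rules[[i]], x, perl = TRUE)
  }
  x
}

expand_in_context("VP EX AFFAIRS, CHIEF COMM OFF")
```

With these two rules the example becomes "VP EXTERNAL AFFAIRS, CHIEF COMMUNICATION OFF", while an "EX" or "COMM" in any other context is left for the existing expansion step to handle.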