Nonprofit-Open-Data-Collective / irs-990-data-issue-tracker

A place to aggregate questions about IRS 990 data access, documentation, meta-data, and inconsistencies or errors. This is NOT a forum for questions on analyzing the data. Contributors are volunteer experts, not IRS personnel.
https://nonprofit-open-data-collective.github.io/irs-990-data-issue-tracker/

990PF part_v was eliminated in 2021 and subsequent parts renamed accordingly #4

Open HFAwesomeCharts opened 1 year ago

HFAwesomeCharts commented 1 year ago

The 990PF schedule part_v was eliminated in 2021 due to a legislative change, and each subsequent schedule part was renumbered down by one. For example, part_vi became part_v, and part_viia became part_via.

We use IRSx to process our XML files for analysis, so we had to modify all of the IRSx metadata CSV files to accommodate the new schedule part names (until IRSx can be upgraded).
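The renumbering described above can be sketched in Python. This is only an illustration, not code from IRSx: the function names are hypothetical, and it assumes part names end in `part_`, a lowercase Roman numeral, and an optional single-letter suffix, as in the examples in this thread.

```python
import re

ROMAN_VALUES = {"i": 1, "v": 5, "x": 10}
ROMAN_PAIRS = [(10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i")]

# Assumed naming pattern: any prefix ending in "part_", a lowercase
# Roman numeral, then an optional one-letter suffix (e.g. "part_viia").
PART_RE = re.compile(r"^(?P<prefix>.*part_)(?P<numeral>[ivx]+)(?P<suffix>[a-z]?)$")


def roman_to_int(numeral):
    """Convert a lowercase Roman numeral (i, v, x only) to an integer."""
    total = 0
    for ch, nxt in zip(numeral, numeral[1:] + " "):
        value = ROMAN_VALUES[ch]
        # Subtract when a smaller symbol precedes a larger one (iv, ix).
        total += -value if ROMAN_VALUES.get(nxt, 0) > value else value
    return total


def int_to_roman(number):
    """Convert a small positive integer to a lowercase Roman numeral."""
    out = []
    for value, symbol in ROMAN_PAIRS:
        while number >= value:
            out.append(symbol)
            number -= value
    return "".join(out)


def renumber_part(old_name, removed=5):
    """Map a pre-2021 part name to its post-2021 name.

    Returns None for the eliminated part, and leaves parts that come
    before it unchanged.
    """
    match = PART_RE.match(old_name)
    if match is None:
        raise ValueError(f"unrecognized part name: {old_name!r}")
    number = roman_to_int(match["numeral"])
    if number < removed:
        return old_name
    if number == removed:
        return None
    return match["prefix"] + int_to_roman(number - 1) + match["suffix"]


print(renumber_part("part_vi"))    # part_v
print(renumber_part("part_viia"))  # part_via
```

A mapping like this could be applied to whichever column holds the part name in each metadata CSV, but which columns those are is not stated in this thread, so check the files before relying on it.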

lecy commented 1 year ago

Any tips for others on how to update the IRSx metadata files?

Not sure whether @jsfenfen has plans to update the metadata files?

HFAwesomeCharts commented 1 year ago

I am attaching my new versions of the five metadata files that I altered.

HOWEVER, I should add a big caveat, which is that I know my changes are not 100% correct.

Specifically, the problem appears when I use these metadata files to pull the new part_xi (the pre-2021 part_xii) into my final JSON data set. For returns filed before 2021, the data from that schedule part shows up under the new part_xii, jumbled together with the data that is supposed to be in the new part_xii. For returns filed in 2021 or later, the data from the new part_xi shows up under both the new part_xi and the new part_xii.

This kludge works well enough for me now, because at least I have a way of getting the data I need for all the years I analyze. But it's really not a perfect solution, and I'd love to have somebody show me how to do it correctly!

-- Helen

Attachments: descriptions.csv, groups.csv, line_numbers.csv, schedule_parts.csv, variables.csv
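One possible way to keep the pre-2021 and 2021-or-later data from being jumbled together, as described in the caveat above, is to key each extracted value by both the part name and the era of the return. The sketch below is a post-processing idea only, not a fix to the IRSx metadata. The record fields (`year`, `part`, `value`) and function names are hypothetical, and it assumes the caller supplies the year the return was filed.

```python
from collections import defaultdict

CUTOFF_YEAR = 2021  # year the renumbering took effect, per this thread


def era_key(part_name, year):
    """Build a key that distinguishes old and new part numbering."""
    era = "pre-2021" if year < CUTOFF_YEAR else "2021+"
    return (era, part_name)


def group_by_era(records):
    """Group record values by (era, part name).

    Each record is a dict with hypothetical fields:
    'year' (filing year), 'part' (part name as extracted), 'value'.
    """
    grouped = defaultdict(list)
    for rec in records:
        grouped[era_key(rec["part"], rec["year"])].append(rec["value"])
    return dict(grouped)


# Made-up example records, for illustration only.
records = [
    {"year": 2019, "part": "part_xii", "value": "a"},
    {"year": 2022, "part": "part_xi", "value": "b"},
    {"year": 2022, "part": "part_xii", "value": "c"},
]

for key, values in group_by_era(records).items():
    print(key, values)
```

With keys like these, the old part_xii and the new part_xii never share a bucket, so the two eras can be reconciled afterwards with an explicit old-to-new mapping.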