unitedstates / congress-legislators

Members of the United States Congress, 1789-Present, in YAML/JSON/CSV, as well as committees, presidents, and vice presidents.
Creative Commons Zero v1.0 Universal

Update Committees for 118th Congress #875

Closed by matsales28 1 year ago

matsales28 commented 1 year ago

Hey folks, I hope y'all are doing well. I noticed that the list of committees and subcommittees (committees-current.yaml) is not up to date for the 118th Congress. As far as I understand the script (I'm not super experienced with Python), the data is fetched from http://clerk.house.gov/xml/lists/MemberData.xml, which currently does not list any committees. I think that makes it impossible to update the committees file for now, but correct me if I'm wrong.
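
For anyone who wants to check whether the Clerk's file carries committee data yet, here is a minimal sketch. It assumes each `member` element holds a `committee-assignments` block whose children are the individual assignments; that layout, the `count_assignments` helper, and the sample IDs are assumptions for illustration, not taken from the repository's scraper. It runs on an inline sample so it works offline; the real file would be downloaded and passed in as text.

```python
import xml.etree.ElementTree as ET

# Made-up sample that mimics the ASSUMED layout of MemberData.xml.
SAMPLE = """<MemberData>
  <members>
    <member>
      <member-info><bioguideID>X000001</bioguideID></member-info>
      <committee-assignments>
        <committee comcode="XX00" rank="3"/>
        <subcommittee subcomcode="XX15" rank="2"/>
      </committee-assignments>
    </member>
    <member>
      <member-info><bioguideID>X000002</bioguideID></member-info>
      <committee-assignments/>
    </member>
  </members>
</MemberData>"""


def count_assignments(xml_text):
    """Return (members, members_with_assignments, total_assignments)."""
    root = ET.fromstring(xml_text)
    members = root.findall("./members/member")
    with_any = 0
    total = 0
    for member in members:
        block = member.find("committee-assignments")
        # A missing or empty block both count as zero assignments.
        n = 0 if block is None else len(list(block))
        total += n
        if n:
            with_any += 1
    return len(members), with_any, total


print(count_assignments(SAMPLE))  # prints (2, 1, 2)
```

If the second number comes back as 0 for the live file, the committee data has probably not been published yet.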

Do y'all know how long it usually takes for that data to be updated after a new Congress begins? Or is there another data source I could work from while this one is still waiting on an update?

Appreciate any help!

dwillis commented 1 year ago

@matsales28 you've got it pretty much correct. Given the work involved to do it manually, waiting for the clerk's data to be updated is the easier path but not the fastest one. If you need to put something together before then, let me know and I can probably help.

matsales28 commented 1 year ago

It looks like the House Clerk data has been updated and now reflects the 118th Congress. I tried to update the data and open a PR, without much success. I shouldn't have skipped my Python lessons, haha. Anyway, I would love to get this data updated when possible 😄

JoshData commented 1 year ago

So it seems that in 8435c031125a407acd34f8e81fd61bb6f012d273 we re-did the committee membership scraper, and it started using only the cached House XML if one is present. My cached file is dated Nov. 5, 2021 on my desktop and July 31, 2021 on my laptop, so we may have been missing House committee membership info since one of those dates, or at some point I may even have reverted the data to the info as of one of those dates. Ugh.

I'm updating it all now in #876.
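
One way to guard against the stale-cache problem described above is to check the cached file's age before trusting it. This is only a sketch: the helper names, the seven-day default, and the idea of keying on the file's modification time are all assumptions, not how the repository's scraper actually decides.

```python
import os
import time

SECONDS_PER_DAY = 86400


def is_older_than(mtime, now, max_age_days):
    """True if a timestamp is more than max_age_days before `now`."""
    return (now - mtime) > max_age_days * SECONDS_PER_DAY


def cache_is_stale(path, max_age_days=7.0, now=None):
    """True if the cached file is missing or older than max_age_days.

    Hypothetical helper: a scraper could call this and re-download
    the House XML instead of silently reusing an old copy.
    """
    if not os.path.exists(path):
        return True
    now = time.time() if now is None else now
    return is_older_than(os.path.getmtime(path), now, max_age_days)
```

A scraper would call `cache_is_stale(cache_path)` before reading the cache and fetch a fresh copy whenever it returns `True`.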

matsales28 commented 1 year ago

Hey @JoshData, sorry for the disturbance, but four committees from the 117th Congress no longer exist: https://www.house.gov/committees/committees-no-longer-standing

Does that need a manual update to the committees-current.yaml file? I've tried running both historical_committees.py and committees_membership.py to update it, but without success.

JoshData commented 1 year ago

I can't remember offhand, but a manual update would be a good way to start.
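
A manual update along those lines could be sketched as below. It assumes committees-current.yaml loads into a list of dictionaries keyed by a `thomas_id` field; the `split_retired` helper and the IDs shown are made up for illustration. Loading and saving the YAML is left out, and whether the retired entries should move to a historical file is a separate question this sketch does not answer.

```python
def split_retired(committees, retired_ids):
    """Split committee records into (current, retired) by ID.

    `committees` is a list of dicts as they might come out of a YAML
    loader; `retired_ids` holds the IDs of committees that no longer
    exist. The `thomas_id` key is an assumption about the file layout.
    """
    retired_ids = set(retired_ids)
    current = [c for c in committees if c.get("thomas_id") not in retired_ids]
    retired = [c for c in committees if c.get("thomas_id") in retired_ids]
    return current, retired


# Made-up records; real entries would come from committees-current.yaml.
committees = [
    {"thomas_id": "HXX1", "name": "Example Standing Committee"},
    {"thomas_id": "HXX2", "name": "Example Select Committee"},
]

current, retired = split_retired(committees, ["HXX2"])
print([c["thomas_id"] for c in current])  # prints ['HXX1']
print([c["thomas_id"] for c in retired])  # prints ['HXX2']
```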