Closed aekey1 closed 4 years ago
@mlissner @flooie
I suppose an introduction is in order. @aekey1, meet @jonashleyo, our resident expert on the judge DB. Jon is a contractor-librarian who's been helping us enhance and improve the judge DB for the past six months or so. Jon, Andrea is our summer intern working on financial disclosure reports that I mentioned.
Jon, do you think you could take a look at these 23 judges that appear to be missing and prioritize putting them in just as subs if needed? The file names above are what we got from the AO.
Thank you both!
Andrea, if you have it easily, could you add the years and maybe the links for these judges? (If not handy, I'll let Jon decide if it's worth doing anyway so he can complete his work.)
(edited to fix misordered links) Here are the years and links, @jonashleyo :
Just did a brief lookup on all of these and it's a mix of judges for courts like Federal Claims and Tax Court, as well as retired & current(?) magistrate judges. I'll take a deeper look at these but Ochsner appears to be "Chief Deputy Clerk" for the Southern District of Texas and should be excluded.
I think it's OK to include people that aren't technically judges. We already have appointers in the DB too, and I think I'm going to request "judicial officers" in addition to judges next year.
Got it. I'll keep Ochsner in the set. Next question...
One of the required fields for people_db is "CL_ID". For FJC data it tends to be something like "fjc-1". For bankruptcy/magistrate judges it's "fjc-mag-1" or "fjc-bk-1". For the above, should the same scheme be used (e.g. "fjc-1" for Tax Court/Federal Claims judges, "fjc-mag-1" for magistrate judges)? Or should something different be used to indicate this data did not come from the FJC but was found otherwise (e.g. "mag-1", "tax-1", etc.)
These IDs are painfully ad hoc. I usually think of them as some sort of source identifier. In this case it's this GitHub issue. Maybe gh-1?
I have reservations about lumping tax court, magistrate, bankruptcy, and federal claims judges into one pot - there is some information in the way CL_IDs are constructed - but I don't know of a better and easier solution. I'll start pulling together a quick data set to ingest the judges above and use the scheme above (e.g. gh-1).
I've input nearly all of these:
CL_ID : NAME
gh-1 : Allegra, Francis Marion
gh-2 : Miller, Phillip R. (not asked for but needed to complete gh-1 clerkship info)
gh-3 : Bartley, Margaret
gh-4 : Bernthal, David G.
gh-5 : Braithwaite, Robert T.
gh-6 : Brill, Gerrilyn G.
gh-7 : Colvin, John O.
gh-8 : Davis, Lamar W, Jr.
gh-9 : DiGirolamo, Thomas M.
gh-10 : Firestone, Nancy B.
gh-11 : Foley, George W.
gh-12 : Garber, Barry L.
gh-13 : Goeke, Joseph Robert
gh-14 : Haines, Randolph J.
gh-15 : Knowles, Daniel E., III
gh-16 : Lauber, Albert G.
gh-17 : Mathy, Pamela A.
gh-18 : Mirando, Carol
gh-19 : Nuechterlein, Christopher
gh-20 : Spaulding, Karla R.
gh-21 : Treece, Randolph F.
gh-22 : Wiese, John Paul
gh-23 : Yock, Robert J.
Almost all of these need to be filled out with complete histories/bios but we have records for everyone. The judges here were either retired (so they weren't in the FJC data) or serving on Tax Court or Federal Claims Court (also not in FJC data). One person who wasn't added was Ochsner; I couldn't find his start date and records can't be added without one. Here are the rest of the details so he can be added manually by someone with greater powers:
Nathan K. Ochsner, Chief Deputy Clerk, Southern District of Texas.
I added Ochsner: gh-24. It's a weak profile because I can't add his position without #1079 being fixed first, but it'll get better with a disclosure report at least.
I think we're done here.
Thanks @jonashleyo.
I ran more queries on the rest of the 20k files to look for people matching last names. After requerying, I found 280 last names with no match in people db.
https://docs.google.com/spreadsheets/d/1jXJ04gdjo6sc5vS29-h9zqsIMjnslYaIVRXxY9q7W8k/edit?usp=sharing
Well, shoot, this is a lot of names, but doable in an hour or two. @jonashleyo how do you feel about adding these? Could you find help? Could y'all split this? What do you guys think the easiest way is to get these into the system?
I guess if we had to, we could just add these judges as names only, without including position information, and make a ticket for doing the position information later. Still, we'd at least need their full names (which are in the disclosures).
Using the last name given in the file name, it appears that some of these are misspelled and we have records for them. For example:
Barrry-MT_R_15_Page_068 : Maryanne Trump Barry : https://www.courtlistener.com/person/189/maryanne-trump-barry/
Bennet-RD. J3. 04. MD : Richard D. Bennett : https://www.courtlistener.com/person/253/richard-d-bennett/
Boulare-RF. J3. 09. NV_R_15 : Richard Franklin Boulware : https://www.courtlistener.com/person/338/richard-franklin-boulware-ii/
There are still plenty of records that need to be added, however.
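For what it's worth, misspellings like the three above could probably be caught automatically before anyone reviews the list by hand. Below is a minimal sketch using Python's standard `difflib`. The `known_last_names` list is a made-up stand-in for last names pulled from the people DB, and the 0.8 cutoff is a guess that would need tuning against real data.

```python
import difflib

# Hypothetical sample of last names already in the people DB.
known_last_names = ["Barry", "Bennett", "Boulware", "Brill", "Colvin"]


def closest_last_names(parsed, known, cutoff=0.8):
    """Return known last names that closely resemble a parsed one."""
    # Compare case-insensitively, but report names in their original casing.
    lookup = {name.lower(): name for name in known}
    hits = difflib.get_close_matches(
        parsed.lower(), list(lookup), n=3, cutoff=cutoff
    )
    return [lookup[hit] for hit in hits]


# Last names as they appear in the misspelled file names above.
for parsed in ["Barrry", "Bennet", "Boulare"]:
    print(parsed, "->", closest_last_names(parsed, known_last_names))
```

Any suggestion produced this way would still need a manual check, since unrelated judges can have very similar last names.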
We will hopefully be extracting better names soon. But it’s sad the file names are poor.
I re-ran person queries by matching on the last name and first initial I parsed from file names. There were 592 names with no match in people db. I also parsed names from the Judicial Watch files and looked for matches on first name and last name, and there were 120 judges in that subset that didn't have a match in people db. https://docs.google.com/spreadsheets/d/1dbp9EVBLm4_FcVJE_MGTkBf4HX4bEeB5iYETzK2trq8/edit?usp=sharing
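For anyone following along, here is a rough sketch of how a last name and first initial might be pulled out of file names shaped like the ones in this issue. It is not the actual script used for the counts above. The regular expression and the `parse_name` helper are illustrative assumptions based only on the examples in this thread, and the pattern would split a hyphenated last name in the wrong place.

```python
import re

# Assumed layout: "<Last>[, <Suffix>]-<INITIALS><rest of file name>"
FILENAME_RE = re.compile(
    r"^(?P<last>[^,\-]+)"           # last name, up to a comma or hyphen
    r"(?:,\s*(?P<suffix>[^-]+))?"   # optional suffix such as "Jr." or "III"
    r"-(?P<initials>[A-Z]+)"        # run of capital initials after the hyphen
)


def parse_name(file_name):
    """Return (last_name, first_initial), or None if the name doesn't fit."""
    match = FILENAME_RE.match(file_name)
    if match is None:
        return None
    return match.group("last").strip(), match.group("initials")[0]


for name in [
    "Bernthal-DG. M. 07. ILC_Fin_R_16.tiff",
    "Davis, Jr.-LW. B. 11. GAS_R_12.tiff",
    "Mirando-C M. 11. FLM_R_14.tiff",
]:
    print(name, "->", parse_name(name))
```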
I also revised our document counts: we originally counted 29,000 individual files, and after grouping the split documents we have 25,500 documents. https://docs.google.com/spreadsheets/d/1zPm0wDVNTo-PFugLLhFGvah3w8MWMYTqYsROBATUztI/edit?usp=sharing
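To illustrate what grouping split documents can look like, here is a small sketch that treats files differing only by a trailing `_Page_NN` marker as pages of one document. This is an assumption about the naming pattern drawn from file names in this issue (e.g. the two Goeke files), not a description of how the counts above were actually produced. `group_pages` is a hypothetical helper.

```python
import re
from collections import defaultdict

# Matches a page marker such as "_Page_02" right before the ".tiff" extension.
PAGE_RE = re.compile(r"_Page_\d+(?=\.tiff$)", re.IGNORECASE)


def group_pages(file_names):
    """Map each document name (page marker removed) to its page files."""
    groups = defaultdict(list)
    for name in file_names:
        groups[PAGE_RE.sub("", name)].append(name)
    # Sorting works for page order as long as page numbers are zero-padded.
    return {key: sorted(names) for key, names in groups.items()}


files = [
    "Goeke-JR. J1. --. TAX_R_16_Page_11.tiff",
    "Goeke-JR. J1. --. TAX_R_16_Page_02.tiff",
    "Lauber-AG. J1. --. TAX_R_13.tiff",
]
documents = group_pages(files)
print(len(files), "files ->", len(documents), "documents")
```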
Closing. @flooie sent me a big fixture file that had about 500 judges in it, and I used loaddata to get it into the heart of the beast.
Thank you all for the hard work on this one. Seems like it proved to be quite a challenge.
Earlier today, we queried the people api for last names of judges from our sample set. We found that there were some judges that did not have any matches by last name. I reviewed these manually and was able to match some of them, but there were 23 judges that still had no match.
Allegra-FM J1. DC. CC_R14.tiff
Bartley-M. J1. VET 12.tiff
Bernthal-DG. M. 07. ILC_Fin_R_16.tiff
Braithwaite-RT. MP. 10. UT.tiff
Brill-GG. M. 11. GAN_R_13.tiff
Colvin-JO. J1. --. TAX_R_13.tiff
Davis, Jr.-LW. B. 11. GAS_R_12.tiff
DiGirolamo-TM M. 04. MD_SPE_R_13.tiff
Firestone-NB. J1. CC.tiff
Foley, Jr.-GW. M. 09. NV_R_16.tiff
Garber-BL. M. 11. FLS _12.tiff
Garber-BL. M. 11. FLS_R_13.tiff
Goeke-JR. J1. --. TAX_R_16_Page_02.tiff
Goeke-JR. J1. --. TAX_R_16_Page_11.tiff
Haines-RJ. B. 09. AZ SPE_R_12.tiff
Knowles, III-DE. M. 05. LAE_R_17.tiff
Lauber-AG. J1. --. TAX_R_13.tiff
Mathy-PA. M. 05. TXW.tiff
Mirando-C M. 11. FLM_R_14.tiff
Nuechterlein-CA. M. 07. INN.tiff
Ochsner-NK. J3. 05. TXS _R_16.tiff
Spaulding-KR M. 11. FLM_R_15.tiff
Treece-RF M. 02. NYN_R_14.tiff
Wiese-JP. J1. --. CC_R_17.tiff
Yock-RJ. J1. USCFC_R_12_Page_07.tiff