freelawproject / courtlistener

A fully-searchable and accessible archive of court data including growing repositories of opinions, oral arguments, judges, judicial financial records, and federal filings.
https://www.courtlistener.com

Judges not found in people db #1305

Closed aekey1 closed 4 years ago

aekey1 commented 4 years ago

Earlier today, we queried the people api for last names of judges from our sample set. We found that there were some judges that did not have any matches by last name. I reviewed these manually and was able to match some of them, but there were 23 judges that still had no match.

Allegra-FM J1. DC. CC_R14.tiff
Bartley-M. J1. VET 12.tiff
Bernthal-DG. M. 07. ILC_Fin_R_16.tiff
Braithwaite-RT. MP. 10. UT.tiff
Brill-GG. M. 11. GAN_R_13.tiff
Colvin-JO. J1. --. TAX_R_13.tiff
Davis, Jr.-LW. B. 11. GAS_R_12.tiff
DiGirolamo-TM M. 04. MD_SPE_R_13.tiff
Firestone-NB. J1. CC.tiff
Foley, Jr.-GW. M. 09. NV_R_16.tiff
Garber-BL. M. 11. FLS   _12.tiff
Garber-BL. M. 11. FLS_R_13.tiff
Goeke-JR. J1. --. TAX_R_16_Page_02.tiff
Goeke-JR. J1. --. TAX_R_16_Page_11.tiff
Haines-RJ. B. 09. AZ SPE_R_12.tiff
Knowles, III-DE. M. 05. LAE_R_17.tiff
Lauber-AG. J1. --. TAX_R_13.tiff
Mathy-PA. M. 05. TXW.tiff
Mirando-C M. 11. FLM_R_14.tiff
Nuechterlein-CA. M. 07. INN.tiff
Ochsner-NK. J3. 05. TXS _R_16.tiff
Spaulding-KR M. 11. FLM_R_15.tiff
Treece-RF M. 02. NYN_R_14.tiff
Wiese-JP. J1. --. CC_R_17.tiff
Yock-RJ. J1. USCFC_R_12_Page_07.tiff
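
For illustration, the last-name check described above could look roughly like the sketch below. It is not the script that was actually run: the regular expression, the `parse_filename` and `unmatched_last_names` helpers, and the `known_last_names` input (standing in for whatever the people API returned) are all assumptions.

```python
import re

# Assumed file-name layout: "<last name>-<initials>" followed by court codes.
# The last name may carry a suffix such as ", Jr." or ", III".
NAME_RE = re.compile(r"^(?P<last>[^-]+)-(?P<initials>[A-Z]+)")


def parse_filename(filename):
    """Return (last_name, initials) parsed from a file name, or None."""
    match = NAME_RE.match(filename)
    if match is None:
        return None
    return match.group("last").strip(), match.group("initials")


def unmatched_last_names(filenames, known_last_names):
    """List parsed last names (deduplicated) that are not in known_last_names."""
    known = {name.lower() for name in known_last_names}
    missing = []
    for filename in filenames:
        parsed = parse_filename(filename)
        if parsed is None:
            continue  # file name does not follow the assumed layout
        last_name = parsed[0]
        if last_name.lower() not in known and last_name not in missing:
            missing.append(last_name)
    return missing
```

Note that a suffix stays attached to the parsed last name (for example "Davis, Jr."), so it would probably need to be stripped before comparing against a database that stores suffixes separately.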

aekey1 commented 4 years ago

@mlissner @flooie

mlissner commented 4 years ago

I suppose an introduction is in order. @aekey1, meet @jonashleyo, our resident expert on the judge DB. Jon is a contractor-librarian who's been helping us enhance and improve the judge DB for the past six months or so. Jon, Andrea is our summer intern working on financial disclosure reports that I mentioned.

Jon, do you think you could take a look at these 23 judges that appear to be missing and prioritize putting them in just as subs if needed? The file names above are what we got from the AO.

Thank you both!

mlissner commented 4 years ago

Andrea, if you have it easily, could you add the years and maybe the links for these judges? (If not handy, I'll let Jon decide if it's worth doing anyway so he can complete his work.)

aekey1 commented 4 years ago

(edited to fix misordered links) Here are the years and links, @jonashleyo :

| filename | year | last name | first initial | link (encoded) |
| --- | --- | --- | --- | --- |
| Allegra-FM J1. DC. CC_R_14 | 2014 | Allegra | FM | https://storage.courtlistener.com/financial-disclosures/2014/A-F/Allegra-FM%20J1.%20DC.%20CC_R_14.tiff |
| Bartley-M. J1. VET _ 12 | 2012 | Bartley | M | https://storage.courtlistener.com/financial-disclosures/2012/A-H/Bartley-M.%20J1.%20VET%20_%2012.tiff |
| Bernthal-DG. M. 07. ILC_Fin_R_16 | 2016 | Bernthal | DG | https://storage.courtlistener.com/financial-disclosures/2016/Bernthal-DG.%20M.%2007.%20ILC_Fin_R_16.tiff |
| Braithwaite-RT. MP. 10. UT | 2011 | Braithwaite | RT | https://storage.courtlistener.com/financial-disclosures/2011/A-E/Braithwaite-RT.%20MP.%2010.%20UT.tiff |
| Brill-GG. M. 11. GAN_R_13 | 2013 | Brill | GG | https://storage.courtlistener.com/financial-disclosures/2013/A%20-%20G/Brill-GG.%20M.%2011.%20GAN_R_13.tiff |
| Colvin-JO. J1. --. TAX_R_13 | 2013 | Colvin | JO | https://storage.courtlistener.com/financial-disclosures/2013/A%20-%20G/Colvin-JO.%20J1.%20--.%20TAX_R_13.tiff |
| Davis, Jr.-LW. B. 11. GAS_R_12 | 2012 | Davis, Jr. | LW | https://storage.courtlistener.com/financial-disclosures/2012/A-H/Davis,%20Jr.-LW.%20B.%2011.%20GAS_R_12.tiff |
| DiGirolamo-TM M. 04. MD_SPE_R_13 | 2013 | DiGirolamo | TM | https://storage.courtlistener.com/financial-disclosures/2013/A%20-%20G/DiGirolamo-TM%20M.%2004.%20MD_SPE_R_13.tiff |
| Firestone-NB. J1. CC | 2011 | Firestone | NB | https://storage.courtlistener.com/financial-disclosures/2011/F-M/Firestone-NB.%20J1.%20CC.tiff |
| Foley, Jr.-GW. M. 09. NV_R_16 | 2016 | Foley, Jr. | GW | https://storage.courtlistener.com/financial-disclosures/2016/Foley,%20Jr.-GW.%20M.%2009.%20NV_R_16.tiff |
| Garber-BL. M. 11. FLS   _12 | 2012 | Garber | BL | https://storage.courtlistener.com/financial-disclosures/2012/A-H/Garber-BL.%20M.%2011.%20FLS%20%20%20_12.tiff |
| Garber-BL. M. 11. FLS_R_13 | 2013 | Garber | BL | https://storage.courtlistener.com/financial-disclosures/2013/A%20-%20G/Garber-BL.%20M.%2011.%20FLS_R_13.tiff |
| Goeke-JR. J1. --. TAX_R_16_Page_02 | 2016 | Goeke | JR | https://storage.courtlistener.com/financial-disclosures/2016/Goeke-JR.%20J1.%20--.%20TAX_R_16/Goeke-JR.%20J1.%20--.%20TAX_R_16_Page_02.tiff |
| Goeke-JR. J1. --. TAX_R_16_Page_11 | 2016 | Goeke | JR | https://storage.courtlistener.com/financial-disclosures/2016/Goeke-JR.%20J1.%20--.%20TAX_R_16/Goeke-JR.%20J1.%20--.%20TAX_R_16_Page_11.tiff |
| Haines-RJ. B. 09. AZ SPE_R_12 | 2012 | Haines | RJ | https://storage.courtlistener.com/financial-disclosures/2012/A-H/Haines-RJ.%20B.%2009.%20AZ%20SPE_R_12.tiff |
| Knowles, III-DE. M. 05. LAE_R_17 | 2017 | Knowles, III | DE | https://storage.courtlistener.com/financial-disclosures/2017/Knowles,%20III-DE.%20M.%2005.%20LAE_R_17.tiff |
| Lauber-AG. J1. --. TAX_R_13 | 2013 | Lauber | AG | https://storage.courtlistener.com/financial-disclosures/2013/H%20-%20M/Lauber-AG.%20J1.%20--.%20TAX_R_13.tiff |
| Mathy-PA. M. 05. TXW | 2011 | Mathy | PA | https://storage.courtlistener.com/financial-disclosures/2011/F-M/Mathy-PA.%20M.%2005.%20TXW.tiff |
| Mirando-C M. 11. FLM_R_14 | 2014 | Mirando | C | https://storage.courtlistener.com/financial-disclosures/2014/M-Q/Mirando-C%20M.%2011.%20FLM_R_14.tiff |
| Nuechterlein-CA. M. 07. INN | 2011 | Nuechterlein | CA | https://storage.courtlistener.com/financial-disclosures/2011/N%20-%20Q/Nuechterlein-CA.%20M.%2007.%20INN.tiff |
| Ochsner-NK. J3. 05. TXS _R_16 | 2016 | Ochsner | NK | https://storage.courtlistener.com/financial-disclosures/2016/Ochsner-NK.%20J3.%2005.%20TXS%20_R_16.tiff |
| Spaulding-KR M. 11. FLM_R_15 | 2015 | Spaulding | KR | https://storage.courtlistener.com/financial-disclosures/2015/S%20-%20Z/Spaulding-KR%20M.%2011.%20FLM_R_15.tiff |
| Treece-RF M. 02. NYN_R_14 | 2014 | Treece | RF | https://storage.courtlistener.com/financial-disclosures/2014/R-Z/Treece-RF%20M.%2002.%20NYN_R_14.tiff |
| Wiese-JP. J1. --. CC_R_17 | 2017 | Wiese | JP | https://storage.courtlistener.com/financial-disclosures/2017/Wiese-JP.%20J1.%20--.%20CC_R_17.tiff |
| Yock-RJ. J1. USCFC_R_12_Page_07 | 2012 | Yock | RJ | https://storage.courtlistener.com/financial-disclosures/2012/S-Z/Yock-RJ.%20J1.%20USCFC_R_12/Yock-RJ.%20J1.%20USCFC_R_12_Page_07.tiff |
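
The encoded links above appear to be plain percent-encoding of the year, an optional sub-folder, and the file name. A minimal sketch of how such a link could be rebuilt with the standard library follows; the `disclosure_url` helper and its `folder` parameter are hypothetical, and the sub-folder names vary by year, so they have to be supplied by the caller.

```python
from urllib.parse import quote

BASE_URL = "https://storage.courtlistener.com/financial-disclosures"


def disclosure_url(year, filename, folder=None):
    """Build a percent-encoded link for a disclosure file.

    folder is the optional sub-folder for that year (e.g. "A - G").
    """
    parts = [str(year)]
    if folder:
        parts.append(folder)
    parts.append(filename)
    # Keep commas literal, as in the links above; spaces become %20.
    encoded = "/".join(quote(part, safe=",") for part in parts)
    return f"{BASE_URL}/{encoded}"


print(disclosure_url(2013, "Brill-GG. M. 11. GAN_R_13.tiff", folder="A - G"))
```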

ghost commented 4 years ago

Just did a brief lookup on all of these and it's a mix of judges for courts like Federal Claims and Tax Court, as well as retired & current(?) magistrate judges. I'll take a deeper look at these but Ochsner appears to be "Chief Deputy Clerk" for the Southern District of Texas and should be excluded.

mlissner commented 4 years ago

I think it's OK to include people that aren't technically judges. We already have appointers in the DB too, and I think I'm going to request "judicial officers" in addition to judges next year.

ghost commented 4 years ago

Got it. I'll keep Ochsner in the set. Next question...

One of the required fields for people_db is "CL_ID". For FJC data it tends to be something like "fjc-1". For bankruptcy/magistrate judges it's "fjc-mag-1" or "fjc-bk-1". For the above, should the same scheme be used (e.g. "fjc-1" for Tax Court/Federal Claims judges, "fjc-mag-1" for magistrate judges)? Or should something different be used to indicate this data did not come from the FJC but was found otherwise (e.g. "mag-1", "tax-1", etc.)?

mlissner commented 4 years ago

These IDs are painfully ad hoc. I usually think of them as some sort of source identifier. In this case it's this github issue. Maybe gh-1?

ghost commented 4 years ago

I have reservations about lumping tax court, magistrate, bankruptcy, and federal claims judges into one pot - there is some information in the way CL_IDs are constructed - but I don't know of a better and easier solution. I'll start pulling together a quick data set to ingest the judges above and use the scheme above (e.g. gh-1).
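
As a rough sketch of the scheme discussed above, a source prefix plus a running number is enough to produce IDs like "gh-1". The `assign_cl_ids` helper is hypothetical and the names are placeholders; nothing here reflects how people_db actually generates or validates CL_IDs.

```python
def assign_cl_ids(names, prefix="gh", start=1):
    """Pair each name with a sequential CL_ID such as "gh-1"."""
    return {
        f"{prefix}-{number}": name
        for number, name in enumerate(names, start=start)
    }


# Placeholder names, for illustration only.
print(assign_cl_ids(["Judge A", "Judge B"]))
print(assign_cl_ids(["Judge C"], prefix="fjc-mag"))
```

A single prefix keeps things simple but drops the court-type hint that prefixes like "fjc-mag" or "fjc-bk" carry, which is the trade-off raised in the comment above.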

ghost commented 4 years ago

I've input nearly all of these:

CL_ID : NAME
gh-1 : Allegra, Francis Marion
gh-2 : Miller, Phillip R. (not asked for but needed to complete gh-1 clerkship info)
gh-3 : Bartley, Margaret
gh-4 : Bernthal, David G.
gh-5 : Braithwaite, Robert T.
gh-6 : Brill, Gerrilyn G.
gh-7 : Colvin, John O.
gh-8 : Davis, Lamar W, Jr.
gh-9 : DiGirolamo, Thomas M.
gh-10 : Firestone, Nancy B.
gh-11 : Foley, George W.
gh-12 : Garber, Barry L.
gh-13 : Goeke, Joseph Robert
gh-14 : Haines, Randolph J.
gh-15 : Knowles, Daniel E., III
gh-16 : Lauber, Albert G.
gh-17 : Mathy, Pamela A.
gh-18 : Mirando, Carol
gh-19 : Neuchterlein, Christopher
gh-20 : Spaulding, Karla R.
gh-21 : Treece, Randolph F.
gh-22 : Wiese, John Paul
gh-23 : Yock, Robert J.

Almost all of these need to be filled out with complete histories/bios but we have records for everyone. The judges here were either retired (so they weren't in the FJC data) or serving on Tax Court or Federal Claims Court (also not in FJC data). One person who wasn't added was Ochsner; I couldn't find his start date and records can't be added without one. Here are the rest of the details so he can be added manually by someone with greater powers:

Nathan K. Ochsner, Chief Deputy Clerk, Southern District of Texas.

mlissner commented 4 years ago

I added Ochsner: gh-24. It's a weak profile because I can't add his position without #1079 being fixed first, but it'll get better with a disclosure report at least.

I think we're done here.

Thanks @jonashleyo.

aekey1 commented 4 years ago

I ran more queries on the rest of the 20k files to look for people matching last names. After requerying, I found 280 last names with no match in people db.

https://docs.google.com/spreadsheets/d/1jXJ04gdjo6sc5vS29-h9zqsIMjnslYaIVRXxY9q7W8k/edit?usp=sharing

mlissner commented 4 years ago

Well, shoot, this is a lot of names, but doable in an hour or two. @jonashleyo how do you feel about adding these? Could you find help? Could y'all split this? What do you guys think the easiest way is to get these into the system?

I guess if we had to, we could just add these judges as names only, without including position information, and make a ticket for doing the position information later. Still, we'd at least need their full names (which are in the disclosures).

ghost commented 4 years ago

Using the last name given in the file name, it appears that some of these are misspelled and we have records for them. For example:

Barrry-MT_R_15_Page_068 : Maryanne Trump Barry : https://www.courtlistener.com/person/189/maryanne-trump-barry/

Bennet-RD. J3. 04. MD : Richard D. Bennett : https://www.courtlistener.com/person/253/richard-d-bennett/

Boulare-RF. J3. 09. NV_R_15 : Richard Franklin Boulware : https://www.courtlistener.com/person/338/richard-franklin-boulware-ii/

There are still plenty of records that need to be added, however.
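
Misspellings like these could be flagged automatically with approximate string matching. The sketch below uses Python's standard `difflib`; this is a swapped-in technique for illustration, not how the matches above were found. The `suggest_last_name` helper and the 0.85 cutoff are assumptions that would need tuning on real data.

```python
import difflib


def suggest_last_name(parsed_name, known_last_names, cutoff=0.85):
    """Return the closest known last name, or None if nothing is close enough."""
    matches = difflib.get_close_matches(
        parsed_name, known_last_names, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


# Known names would come from people_db; this short list is illustrative.
known = ["Barry", "Bennett", "Boulware"]
for misspelled in ["Barrry", "Bennet", "Boulare"]:
    print(misspelled, "->", suggest_last_name(misspelled, known))
```

Suggestions like these would still need a manual check, since two different judges can have similar last names.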

flooie commented 4 years ago

We will hopefully be extracting better names soon. But it’s sad the file names are poor.

aekey1 commented 4 years ago

I re-ran person queries by matching on the last name and first initial I parsed from file names. There were 592 names with no match in people db. I also parsed names from the Judicial Watch files and looked for matches on first name and last name, and there were 120 judges in that subset that didn't have a match in people db. https://docs.google.com/spreadsheets/d/1dbp9EVBLm4_FcVJE_MGTkBf4HX4bEeB5iYETzK2trq8/edit?usp=sharing
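
A stricter match on last name plus first initial could be sketched as below. The `person_key` and `find_unmatched` helpers are hypothetical, and the `name_first` / `name_last` keys are assumed field names for person records, not confirmed against the people_db schema.

```python
def person_key(last_name, first_name_or_initials):
    """Normalize a name to a (last name, first initial) lookup key."""
    return (
        last_name.strip().lower(),
        first_name_or_initials.strip()[:1].upper(),
    )


def find_unmatched(parsed_names, people):
    """Return parsed (last, initials) pairs that no person record matches."""
    known_keys = {
        person_key(person["name_last"], person["name_first"])
        for person in people
    }
    return [pair for pair in parsed_names if person_key(*pair) not in known_keys]
```

With this key, a file for "Bennett, AB" no longer matches a record for "Richard Bennett", whereas a last-name-only comparison would have treated the two as the same person.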

I also revised our document counts: we originally counted 29,000 individual files, and after grouping the split documents we have 25,500 documents. https://docs.google.com/spreadsheets/d/1zPm0wDVNTo-PFugLLhFGvah3w8MWMYTqYsROBATUztI/edit?usp=sharing
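
Grouping split documents could work along these lines, assuming a split document is recognizable by a `_Page_<number>` suffix in the file name, as in the Goeke and Yock files earlier in this thread. The `group_documents` helper is hypothetical; the real grouping may instead have relied on folder names.

```python
import re
from collections import defaultdict

# Assumed convention: split documents end in "_Page_<number>" before ".tiff".
PAGE_SUFFIX_RE = re.compile(r"_Page_\d+(?=\.tiff$)")


def group_documents(filenames):
    """Map each document name to the list of files that belong to it."""
    groups = defaultdict(list)
    for filename in filenames:
        document = PAGE_SUFFIX_RE.sub("", filename)
        groups[document].append(filename)
    return dict(groups)
```

The number of keys in the result is the document count, while the total length of the lists is the individual file count.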

mlissner commented 4 years ago

Closing. @flooie sent me a big fixture file that had about 500 judges in it, and I used loaddata to get it into the heart of the beast.
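
For readers unfamiliar with fixtures: Django's `loaddata` command reads serialized records, commonly a JSON list of objects with `model`, `pk`, and `fields` keys. The sketch below builds such a file's contents. It is not the fixture that was actually loaded; the `people_db.person` model label, the field names, and the record itself are assumptions for illustration.

```python
import json


def build_fixture(judges, model="people_db.person"):
    """Serialize judge dicts into Django's JSON fixture layout."""
    records = [
        {"model": model, "pk": judge["pk"], "fields": judge["fields"]}
        for judge in judges
    ]
    return json.dumps(records, indent=2)


# Hypothetical record; the pk and field names are placeholders.
fixture = build_fixture(
    [
        {
            "pk": 100001,
            "fields": {
                "cl_id": "example-1",
                "name_first": "Jane",
                "name_last": "Doe",
            },
        }
    ]
)
print(fixture)
# Saved as judges.json, this could then be loaded with:
#   python manage.py loaddata judges.json
```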

Thank you all for the hard work on this one. Seems like it proved to be quite a challenge.