freelawproject / eyecite

Find legal citations in any block of text
https://freelawproject.github.io/eyecite/
BSD 2-Clause "Simplified" License

Court name errors #128

Open bbernicker opened 2 years ago

bbernicker commented 2 years ago

I was using Eyecite this evening and came across some very odd behavior. I tried to parse the citation Commonwealth v. Muniz, 164 A.3d 1189 (Pa. 2017). Eyecite determined that the court was "paarbpnlhc", which is the courts_db id for the Pa. Arbitration Panels for Health Care, even though courts_db correctly lists the citation_string "Pa." under the Supreme Court. I then poked around and found that it does something similar with other courts of last resort. For example, Eyecite treats citations to the Virginia Supreme Court ("Va.") as if they were to the Virginia Court of Appeals ("Va. Ct. App."). It also treats cites to the Texas Supreme Court ("Tex.") as cites to the Texas Special Court of Review ("Tex. Rev.").

I think the problem is that the get_court_by_paren function in helpers.py looks for a citation_string from courts_db that merely starts with the court abbreviation from the citation parenthetical (line 52), so a longer citation_string such as "Va. Ct. App." can match "Va." before the exact entry does. Can we require that the whole citation_string matches the whole of court_str? If not, is there a reasonable way to get the shortest citation_string from courts_db that starts with court_str without it taking forever? Perhaps we can put all of the courts_db citation_strings into a set and test whether court_str is present in the set. If it is not, we extract all set items that start with court_str, compare their lengths, and return the shortest.
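To make the proposal concrete, here is a rough sketch of the exact-match-then-shortest-prefix lookup. It is not the actual get_court_by_paren code. The function name find_court_id, the SAMPLE_COURTS list, and every id except "paarbpnlhc" are placeholders invented for illustration, as is the full citation_string for the arbitration panels. It also assumes the year has already been stripped from court_str.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for courts_db entries. Only "paarbpnlhc" and the
# abbreviations "Pa.", "Va.", "Va. Ct. App.", "Tex.", "Tex. Rev." come from
# this issue; all other ids and strings are made-up placeholders.
SAMPLE_COURTS: List[Dict[str, str]] = [
    {"id": "paarbpnlhc", "citation_string": "Pa. Arb. Panels (placeholder)"},
    {"id": "example-pa-supreme", "citation_string": "Pa."},
    {"id": "example-va-ct-app", "citation_string": "Va. Ct. App."},
    {"id": "example-va-supreme", "citation_string": "Va."},
    {"id": "example-tex-rev", "citation_string": "Tex. Rev."},
    {"id": "example-tex-supreme", "citation_string": "Tex."},
]


def find_court_id(
    court_str: str, courts: List[Dict[str, str]] = SAMPLE_COURTS
) -> Optional[str]:
    """Return the id for court_str: exact match first, else shortest prefix match."""
    if not court_str:
        return None

    # Map each citation_string to its id, keeping the first id seen.
    by_string: Dict[str, str] = {}
    for court in courts:
        by_string.setdefault(court["citation_string"], court["id"])

    # Step 1: a whole-string match always wins.
    if court_str in by_string:
        return by_string[court_str]

    # Step 2: otherwise fall back to the shortest citation_string
    # that starts with court_str.
    candidates = [s for s in by_string if s.startswith(court_str)]
    if not candidates:
        return None
    return by_string[min(candidates, key=len)]


print(find_court_id("Pa."))
print(find_court_id("Va. Ct."))
```

With this ordering, "Pa." resolves to the entry whose citation_string is exactly "Pa." regardless of where it sits in the list, and a partial string like "Va. Ct." still falls back to the closest (shortest) prefix match. Building the dictionary once up front, rather than on every call, would keep the lookup cheap.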

I should have time to take a crack at this later in the week unless somebody else either wants to do it or can get to it first.

flooie commented 2 years ago

I'm currently working on a big Courts-db push. Let's move this over there.

mlissner commented 2 years ago

This looks like a nasty one. @bbernicker please let us know if you don't have time to get to this so we can get @flooie cracking on it.

bbernicker commented 2 years ago

@mlissner It turned out not to be too hard to fix, and I think I have it sorted out. I also added an extra test to make sure that Eyecite detects the correct court.