freelawproject / reporters-db

A database of court reporters, tests and other experiments
BSD 2-Clause "Simplified" License

Add support for treaties #121

Open bbernicker opened 2 years ago

bbernicker commented 2 years ago

In response to issue #48 I offered to add support for treaty sources to reporters-db. This PR does so for the treaty sources in Indigo Book T6.

Treaty sources are stored in either the laws.json or the reporters.json file, depending on which style they more closely resemble. Regardless of where they are stored, they all have cite_type "treaty". The test.py and README.rst files have been updated to recognize "treaty" as a valid cite_type. The examples for all treaties are from CourtListener, except the example for Pan-Am. Treaty Series, which is from a law review article.
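
For illustration, here is a minimal sketch of the kind of cite_type check described above. It is not the actual test.py code: the set of valid types is a partial, hypothetical list, and the entry layout (a key mapping to a list of dicts with "name" and "cite_type") is a simplified assumption about the JSON files.

```python
# Hypothetical, partial set of allowed values. The authoritative list
# lives in the repo's tests and README, not here.
VALID_CITE_TYPES = {"federal", "state", "specialty", "treaty"}

# Simplified, assumed layout: abbreviation -> list of source entries.
sample_sources = {
    "U.S.T.": [
        {
            "name": "United States Treaties and Other International Agreements",
            "cite_type": "treaty",
        }
    ],
}


def invalid_cite_types(sources, valid=VALID_CITE_TYPES):
    """Return (key, cite_type) pairs whose cite_type is not in `valid`."""
    bad = []
    for key, entries in sources.items():
        for entry in entries:
            cite_type = entry.get("cite_type")
            if cite_type not in valid:
                bad.append((key, cite_type))
    return bad


# An empty list means every entry carries a recognized cite_type.
problems = invalid_cite_types(sample_sources)
```

The same helper could be pointed at the parsed contents of either JSON file, assuming they share this shape.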

I omitted two T6 sources from this PR (Statutes at Large and LEXIS) because they are not exclusive to treaties. Both are already present in reporters-db, and I added a note to the Statutes at Large one explaining that citations there could be to treaties. LEXIS citations are already their own specialty type.

Also, the regex and example for Treaty Series (T.S.) begin with a whitespace character. This is to prevent false positives where T.S. is part of the name of another reporter, but it is an inelegant solution. I tried various lookbehinds and negative assertions, but because they are at the beginning of the string they cause problems.
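
To make the trade-off concrete, here is a small sketch in plain Python `re`. The patterns are hypothetical stand-ins, not the regexes in this PR, and the sketch does not test how a lookbehind behaves once reporters-db or eyecite assemble it into a larger pattern, which is where the problems mentioned above showed up.

```python
import re

# Hypothetical stand-ins for a Treaty Series pattern.
NAIVE = re.compile(r"T\.S\. (?:No\. )?(\d+)")
LEADING_SPACE = re.compile(r" T\.S\. (?:No\. )?(\d+)")
# Negative lookbehind: "T.S." must not follow a letter or a period.
LOOKBEHIND = re.compile(r"(?<![A-Za-z.])T\.S\. (?:No\. )?(\d+)")

other_reporter = "1155 U.N.T.S. 331"   # United Nations Treaty Series
mid_sentence = "See T.S. No. 993"
start_of_text = "T.S. No. 993"

# The naive pattern fires inside "U.N.T.S.", which is the false positive.
naive_hit = NAIVE.search(other_reporter) is not None

# The leading space avoids it, but misses a citation at the very start.
space_false_positive = LEADING_SPACE.search(other_reporter) is not None
space_mid = LEADING_SPACE.search(mid_sentence) is not None
space_start = LEADING_SPACE.search(start_of_text) is not None

# In isolation, the lookbehind handles all three inputs.
look_false_positive = LOOKBEHIND.search(other_reporter) is not None
look_mid = LOOKBEHIND.search(mid_sentence) is not None
look_start = LOOKBEHIND.search(start_of_text) is not None
```

The leading-space version also changes the matched text (it includes the space), which is why the example string has to begin with a space as well.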

mlissner commented 2 years ago

Thanks, @bbernicker. I just put this on @flooie's backlog: https://github.com/orgs/freelawproject/projects/27/views/1. We're a bit behind right now, but we'll be climbing out soon and replying to this ASAP.

flooie commented 2 years ago

@bbernicker, I don't want to leave this PR sitting. Could you resolve the conflicts, at the very least so that our tests can run?

bbernicker commented 2 years ago

Yes, sorry @flooie. There were no conflicts when I submitted it, but when you approved the PR for the public/private law citations it created a conflict, which I have now resolved. This should be ready for your review.

bbernicker commented 2 years ago

> @bbernicker this looks great and pretty close to done. I have a few small things to discuss or talk about but not much.

Great, thanks @flooie. I made all of the changes you suggested except that I did not change my treatment of mlz_jurisdiction pending further comments on #130.

bbernicker commented 2 years ago

This should be ready now for a second review.

flooie commented 2 years ago

I have a handful of other things.

  1. Tests fail when run locally. Treaty needs to be added to the tests.
  2. More concerning, the eyecite tests are catching a discrepancy. For example:

In a random opinion (see below), the citation wraps across two newlines, as far as I can tell.

image

Run against the current main, this citation returns [FullLawCitation('10 U.S.C. §\n\n859', groups={'title': '10', 'reporter': 'U.S.C.', 'section': '859'}, .....

But with your code, and I'm not sure why, it no longer correctly finds this citation and instead replaces it with §§.

I also rebased your code onto the current main, and it exhibits the same issue.

flooie commented 2 years ago

@bbernicker Thanks. Sorry for the delay on this, but I wasn't sure what was going on. There is certainly something here that isn't working. I suspect we may need to add another test to catch whatever is going on here.
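
As a rough idea of what such a test could pin down, here is a sketch using plain `re`. The pattern is a hypothetical stand-in, not the regex eyecite or reporters-db actually uses, and `find_usc` is an illustrative helper rather than an eyecite function. It only checks that a section number separated from § by two newlines is still captured.

```python
import re

# Hypothetical stand-in for a U.S.C. citation pattern. The \s* after the
# section symbol is what tolerates line breaks before the section number.
USC = re.compile(
    r"(?P<title>\d+) (?P<reporter>U\.S\.C\.) §§?\s*(?P<section>\d+)"
)


def find_usc(text):
    """Return the named groups of every U.S.C. citation found in `text`."""
    return [m.groupdict() for m in USC.finditer(text)]


# The citation wraps across two newlines, as in the opinion above.
wrapped = "see 10 U.S.C. §\n\n859 (2018)"
single_line = "see 10 U.S.C. § 859 (2018)"

expected = [{"title": "10", "reporter": "U.S.C.", "section": "859"}]
wrapped_result = find_usc(wrapped)
single_line_result = find_usc(single_line)
```

A real regression test would presumably run the wrapped text through eyecite itself and compare the groups against the ones shown in the output from main.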