freelawproject / juriscraper

An API to scrape American court websites for metadata.
https://free.law/juriscraper/
BSD 2-Clause "Simplified" License

Abbreviations should follow blue book format #4

Closed by freelawbot 10 years ago

freelawbot commented 10 years ago

It seems that the state subdirectory currently has placeholder files that use the postal code abbreviations of the states. Besides the fact that no one but the postmaster general can keep these straight, they are not the abbreviations used in legal citation, per the commandments from on high from our masters at the Bluebook.

With the federal courts we have thus far had the, I think, commendable practice of naming the file after the court ID that will eventually be used in CourtListener, and hence in some search queries made by humans who have to remember these abbreviations. Our legal audience will expect the Bluebook abbreviations.

I'd ask that we change all our state placeholder filenames to correspond to the names found in this table: http://www.law.cornell.edu/citation/4-500.htm. I can remember Mass., Minn., Mich., Miss., and Mo., but there's no hope for me (or other humans) with the two-letter codes.
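To make the proposal concrete, here is a minimal sketch of how a rename plan could be computed. It covers only the five states named above. The normalization rule (lowercase, periods and spaces dropped) and the helper names `bluebook_module_name` and `rename_plan` are assumptions for illustration, not anything already in the repository.

```python
# Postal code -> Bluebook abbreviation, limited to the five states named above.
POSTAL_TO_BLUEBOOK = {
    "MA": "Mass.",
    "MI": "Mich.",
    "MN": "Minn.",
    "MS": "Miss.",
    "MO": "Mo.",
}


def bluebook_module_name(postal_code):
    """Return a hypothetical module file name based on the Bluebook abbreviation."""
    abbreviation = POSTAL_TO_BLUEBOOK[postal_code.upper()]
    # Assumed normalization: lowercase, with periods and spaces removed.
    stem = abbreviation.lower().replace(".", "").replace(" ", "")
    return stem + ".py"


def rename_plan(filenames):
    """Map each postal-code placeholder file to its proposed new name."""
    plan = {}
    for name in filenames:
        stem, _, ext = name.rpartition(".")
        if ext != "py" or stem.upper() not in POSTAL_TO_BLUEBOOK:
            continue  # not a known state placeholder
        new_name = bluebook_module_name(stem)
        if new_name != name:  # skip no-op renames such as mo.py
            plan[name] = new_name
    return plan
```

Calling `rename_plan(["ma.py", "mi.py", "mo.py"])` would propose renaming `ma.py` and `mi.py` while leaving `mo.py` alone, since Missouri's Bluebook abbreviation reduces to the same two letters under this rule.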

Oh, and I'm willing to do this and create a pull request if there is consensus that it's OK to do.


freelawbot commented 10 years ago

Btw, this line in the CourtListener caller requires that the first part of the file name be the court code, so this change was much more than just aesthetic:

https://bitbucket.org/mlissner/search-and-awareness-platform-courtlistener/src/5260b5888f3d/alert/scrapers/scrape_and_extract.py#cl-143
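
The linked line is not reproduced here, so the following is only a sketch of the kind of lookup being described. It assumes that "the first part of the file name" means everything before the first underscore or the extension; `court_id_from_path` is a hypothetical helper, not the caller's actual code.

```python
import os


def court_id_from_path(path):
    """Hypothetical helper: treat the leading part of the file name as the court ID."""
    filename = os.path.basename(path)
    stem = os.path.splitext(filename)[0]
    # Assumption: anything after the first underscore is a suffix, not part of the ID.
    return stem.split("_", 1)[0]
```

Under that assumption, a file named with a postal code would produce a postal code as its court ID, which is why the naming choice matters beyond looks.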


Original Comment By: Mike Lissner

freelawbot commented 10 years ago

Thanks, Brian. See my comment above in case it didn't email you, since I wasn't logged in. Closing under the assumption that that's OK with you.


Original Comment By: Mike Lissner

freelawbot commented 10 years ago

It'd make iterating across all scrapers pretty tricky, and would break some imports. Not a bad idea, but probably a separate issue to resolve, I'd say.


Original Comment By: Anonymous

freelawbot commented 10 years ago

Ok, mostly done with this. I've got a question:

Does it screw other things up in the code if we alter the directory structure under united_states/federal?

We've got a lot of hierarchy already, but I actually think that's good and it will be even cleaner if we go with:

united_states/federal/ (maybe only scotus lives here directly, but we could also lump it in appellate and leave this empty)

united_states/federal/special/ (armfor, cavc, cit, cofc, jpml, tax)

united_states/federal/appellate/ (ca1-ca11, cadc, cafc) (and maybe scotus)

united_states/federal/appellate/bankruptcy (bap1, bap9, bap10)

united_states/federal/district/ (OMG there are so many of these)

united_states/federal/district/bankruptcy (there are nearly as many of these as there are district courts).

I won't fiddle with this part until you tell me whether the code references the existing structure somehow.
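
For reference, a nested layout like the one above can still be walked generically. This is a minimal sketch, assuming every scraper is a `.py` file named after its court ID and every directory is an importable package; `iter_scraper_modules` is a hypothetical name, and nothing here reflects how the existing code discovers scrapers.

```python
import os


def iter_scraper_modules(root, package):
    """Yield (court_id, dotted_module_path) for every .py file below root."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        rel = os.path.relpath(dirpath, root)
        parts = [] if rel == "." else rel.split(os.sep)
        for filename in sorted(filenames):
            stem, ext = os.path.splitext(filename)
            if ext != ".py" or stem == "__init__":
                continue  # skip package markers and non-Python files
            yield stem, ".".join([package] + parts + [stem])
```

Each dotted path could then be handed to `importlib.import_module`, provided every directory along the way contains an `__init__.py`. Any code that hard-codes the current, flatter import paths would still need updating.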


Original Comment By: Brian Carver

freelawbot commented 10 years ago

Yeah, the postal ones aren't ideal, but were the easiest thing at the time. We just need to make sure to make this change with hg mv, rather than mv, or else next time people pull it'll be havoc for them.


Original Comment By: Mike Lissner