Open mlissner opened 4 years ago
I forgot to mention why these are useful. In https://github.com/freelawproject/courtlistener/issues/299, we've identified that we want to start finding citations that lack page numbers, like 442 U.S. ___. If we want to do that, we won't be able to rely on the citation to look them up and instead we'll have the volume number, reporter abbreviation, and if we're lucky, some of the party info.
That means that the party info is the only unique thing we've got, so if we're going to use that, being able to refine by volume date would really help reduce false positives.
That style of citation is generally only used in slip opinions prior to the volume being published. However, if you know the year of the opinion in which you found such a citation, then I think we'd find that the years covered by the cited volume are that very year, maybe +/- 1 year. So, if I find such a citation in an opinion from 2016, then the volume of that citation likely covers opinions from 2016 as well, maybe 2015-17.
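As a rough sketch of that heuristic: take the year of the citing opinion and keep only candidate cases decided within a year of it. Everything named here (`Candidate`, `narrow_by_citing_year`, the sample cases) is hypothetical and only illustrates the idea; it is not how CourtListener models opinions.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for a case that might match a page-less citation."""
    case_name: str
    year: int


def narrow_by_citing_year(candidates, citing_year, slack=1):
    """Keep candidates decided within `slack` years of the citing opinion."""
    low, high = citing_year - slack, citing_year + slack
    return [c for c in candidates if low <= c.year <= high]


# Made-up candidates that all matched on party names alone.
candidates = [
    Candidate("Alpha v. Beta", 2009),
    Candidate("Alpha v. Beta", 2015),
    Candidate("Alpha v. Beta Corp.", 2016),
    Candidate("Alpha v. Beta", 2019),
]

# A page-less citation found in a 2016 opinion: only 2015-2017 survive.
narrowed = narrow_by_citing_year(candidates, citing_year=2016)
```

The window comes entirely from the citing opinion, so this kind of filtering needs no per-volume date data.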
That's a really good point, Brian. There's no point in doing what this issue proposes, at least not for the purpose we were contemplating. Thanks.
Note that in #21, @jcushman points out that non-numeric volume numbers are a thing, so the above format would have some limitations.
Couple more thoughts:
- If an "end" date is provided, and a "volumes" dict is provided, then all valid volumes should be provided so users can assume that not-found volume keys are invalid citations.
- "volumes" should probably reflect the order of volumes, one way or another. Does it work for it to be an unordered dictionary instead of a list? I think maybe it does -- a user who wanted to see volumes in order could natsort the keys, which works on all the non-numeric volume numbers in CAP, or sort based on end date, which seems like it ought to work. If neither of those is satisfying, though, then maybe "volumes" should be a list.
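To make the two points concrete, here is a minimal sketch of how a user might consume an unordered "volumes" dict. The entry layout, the dates, and the helper names are assumptions for illustration, not the actual reporters-db schema, and `natural_key` is a small standard-library stand-in for what the natsort package does.

```python
import re

# Hypothetical reporter entry; keys and dates are illustrative only.
reporter = {
    "start": "1900-01-01",
    "end": "1910-12-31",
    "volumes": {
        "10": {"end": "1905-06-30"},
        "2": {"end": "1901-12-31"},
        "1": {"end": "1900-12-31"},
        "2a": {"end": "1902-03-31"},
    },
}


def natural_key(volume):
    """Sort key that compares digit runs as numbers, so "2" < "2a" < "10"."""
    parts = [p for p in re.split(r"(\d+)", volume) if p]
    # Tag each part so numbers and text are never compared to each other.
    return [(0, int(p)) if p.isdecimal() else (1, p) for p in parts]


def volumes_in_order(entry):
    """Volume keys in natural order."""
    return sorted(entry["volumes"], key=natural_key)


def volumes_by_end_date(entry):
    """Volume keys ordered by end date (ISO dates sort as strings)."""
    vols = entry["volumes"]
    return sorted(vols, key=lambda v: vols[v]["end"])


def is_valid_volume(entry, volume):
    """True if listed, False if the list is complete, None if unknown."""
    vols = entry.get("volumes")
    if vols is not None and volume in vols:
        return True
    if vols is not None and entry.get("end") is not None:
        # Closed reporter with a full volume list: a missing key is invalid.
        return False
    return None
```

With this sample data both orderings agree, and a lookup for volume "3" can be rejected outright because the reporter has an end date and a complete volume list.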
One example could be something like:
But that'd create a monster of a JSON file.