brianwc closed this 8 years ago
That's sad, but there is some good news. We already have code to handle this kind of heinous camelcasing of case names:
https://github.com/freelawproject/juriscraper/blob/master/lib/string_utils.py#L171
So, I think we just need to throw this into the NY App Div scraper somewhere and we should be off and running again.
Depending on how many affected cases we already have, we may want to write a fix-script, or alternatively just fix them manually.
There are also tests for fix_camel_case:
https://github.com/freelawproject/juriscraper/blob/master/tests/tests.py#L388
So if it doesn't quite work for NY App Div at first, we can tweak it without too much worry. Just add more test cases, run them, tweak the code till they pass, etc.
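For anyone picking this up, here is a rough sketch of the kind of splitting involved and the add-a-case, run, tweak loop described above. It is not the `fix_camel_case` code linked above; `split_camel_case_name` is a hypothetical stand-in with deliberately simple regex rules, and it will mangle names that contain a lowercase "v" directly before a capital (e.g. "RevCorp").

```python
import re


def split_camel_case_name(name):
    """Hypothetical helper: re-insert spaces into a squashed case name.

    Illustration only; this is NOT juriscraper's fix_camel_case.
    """
    # 1. Put a space after any comma that is directly followed by text.
    name = re.sub(r",(?=\S)", ", ", name)
    # 2. A lowercase "v" glued between a letter and a capital is the
    #    versus marker: "LLCvSTS" -> "LLC v. STS".
    name = re.sub(r"(?<=[A-Za-z])v(?=[A-Z])", " v. ", name)
    # 3. Split where a lowercase letter meets a capital: "MaxonAlco".
    name = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", name)
    # 4. Split an acronym from the capitalized word after it: "STSSteel".
    name = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", " ", name)
    return name


# Test cases in the spirit of "add more, run them, tweak till they pass".
cases = {
    "MaxonAlcoHoldings,LLCvSTSSteel,Inc.":
        "Maxon Alco Holdings, LLC v. STS Steel, Inc.",
    # Already-spaced names should pass through untouched.
    "Matter of Neroni v Granis": "Matter of Neroni v Granis",
}
for raw, expected in cases.items():
    assert split_camel_case_name(raw) == expected, raw
```

New NY App Div oddities would go into `cases` first, then the rules get adjusted until every assertion passes.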
Anybody can take this on. Removing my assignment on this one.
Is this still an issue? I found the page with @brianwc's example above, pointed our current nyappdiv_3rd scraper at this specific page, scraped it, dumped the JSON, and I see what appears to be a properly formatted case name:
{
"case_names": "Maxon Alco Holdings, LLC v. STS Steel, Inc.",
"case_dates": "2016-03-03",
"blocked_statuses": false,
"download_urls": "http://decisions.courts.state.ny.us/ad3/Decisions/2014/517378.pdf",
"precedential_statuses": "Published",
"case_name_shorts": "",
"docket_numbers": "517378"
},
That being said, there is an issue with the parsing of dual docket numbers. Some entries legitimately show dual docket numbers with a slash delimiter, like xxxxxx/yyyyyy, but it appears that some entries have a slash delimiter and no second docket number (maybe human error?). On the same page linked above, we see "515342/Matter of Neroni v Granis", which parses to:
{
"case_names": "of Neroni v. Granis",
"case_dates": "2016-03-03",
"blocked_statuses": false,
"download_urls": "http://decisions.courts.state.ny.us/ad3/Decisions/2014/515342-515341.pdf",
"precedential_statuses": "Published",
"case_name_shorts": "Granis",
"docket_numbers": "515342/Matter"
},
I can fix this later in the week.
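One possible shape for the fix, as a sketch only: treat the docket part as one or more digit groups joined by slashes, and swallow a dangling slash if no second number follows. This assumes the entry text always starts with the docket number(s) followed by the case name; `split_docket_and_name` is a hypothetical helper, not code from the existing scraper.

```python
import re

# One or more all-digit docket numbers joined by "/", an optional dangling
# "/", optional whitespace, then the rest of the line as the case name.
DOCKET_RE = re.compile(r"^\s*(\d+(?:/\d+)*)/?\s*(.*)$")


def split_docket_and_name(text):
    """Hypothetical helper: return (docket_numbers, case_name) for an entry."""
    match = DOCKET_RE.match(text)
    if match is None:
        # No leading docket number found; keep the whole text as the name.
        return "", text.strip()
    return match.group(1), match.group(2).strip()


# The malformed entry from the page: trailing slash, no second number.
assert split_docket_and_name("515342/Matter of Neroni v Granis") == (
    "515342",
    "Matter of Neroni v Granis",
)
# A legitimate dual docket number keeps both halves (made-up name).
assert split_docket_and_name("515341/515342 Matter of Foo v Bar") == (
    "515341/515342",
    "Matter of Foo v Bar",
)
```

Requiring digits on both sides of the slash is what keeps "Matter" out of the docket number and "Matter" in the case name.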
Yeah, let's close this one since it's fixed, and I'll leave the docket issue in your hands? Or we can open another issue for that, if you wish. Be great to get that fixed.
@mlissner yup, I'll submit a PR to fix the docket number issue shortly
See cases from Oct. 23 such as MaxonAlcoHoldings,LLCvSTSSteel,Inc. (N.Y. App. Div. 2014) https://www.courtlistener.com/?q=&stat_Precedential=on&order_by=dateFiled+desc&court=nyappdiv