freelawproject / juriscraper

An API to scrape American court websites for metadata.
https://free.law/juriscraper/
BSD 2-Clause "Simplified" License

Need to Implement - x-api-token provided by the reporter of decisions #1042

Closed by sentry-io[bot] 1 month ago

sentry-io[bot] commented 3 months ago

While debugging a user's report concerning daily search alerts for records originating from NY courts, I found the following Sentry events:

Filed by: @ERosendo

ERosendo commented 3 months ago

The user provided the IDs of the alerts they believe are not working. Reviewing the queries associated with these alerts, I noticed a common thread: most of them reference the following courts: ny, nyappdiv and nyappterm.

grossir commented 3 months ago

I will work on this. All these Sentry issues are for nyappdiv (nyappdiv_1st, nyappdiv_2nd...). If I find any more, I will link them here.

grossir commented 3 months ago

At first glance, it seems our scraper server's IP is blocked. I can access the forbidden URLs myself without any problem:

https://www.nycourts.gov/reporter/slipidx/aidxtable_1.shtml
https://www.nycourts.gov/reporter/slipidx/aidxtable_2.shtml
https://www.nycourts.gov/reporter/slipidx/aidxtable_3.shtml
https://www.nycourts.gov/reporter/slipidx/aidxtable_4.shtml

Indeed, we have had no data for nyappdiv since May 17, 2024.

This happens for most New York courts...

@flooie perhaps you can talk with the courts?
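
For anyone who wants to repeat the check above from a different machine, here is a minimal sketch that reports which URLs come back as 403 Forbidden. It uses only the standard library; the function names `fetch_status` and `find_forbidden` and the User-Agent value are made up for illustration and are not part of juriscraper.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a GET request to `url`."""
    # The User-Agent value is an arbitrary placeholder (assumption).
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises on 4xx/5xx; the status code is kept on the exception.
        return error.code


def find_forbidden(
    urls: Iterable[str], fetch: Callable[[str], int] = fetch_status
) -> List[str]:
    """Return the URLs that answer with 403 Forbidden."""
    return [url for url in urls if fetch(url) == 403]
```

Running `find_forbidden` on the four URLs above, once from the scraper server and once from another network, should show whether the block depends on the source IP. The `fetch` parameter exists so the logic can be exercised without network access.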

grossir commented 3 months ago

Sent a message to this contact form: https://iapps.courts.state.ny.us/webteam/webteam.jsp

flooie commented 3 months ago

I called the New York State Law Reporting Bureau and was transferred to a voicemail, but I did not catch the name. I will update the CRM when I hear back. In the meantime, we wait.

flooie commented 3 months ago

The apps.courts.state.ny.us website is now working.

grossir commented 2 months ago

As an update: we have fresh data for:

Given this, maybe they are simply not blocking us anymore? @flooie

grossir commented 1 month ago

Testing is still pending, since our contact at the court is on vacation until September 3rd. However, he did clarify one thing: the site we target in the ny scraper (example) will not work even if everything goes well with the API key. So we will probably have to rewrite that scraper.

If you're referring to the https://iapps.courts.state.ny.us/lawReporting/Search and https://www.nycourts.gov/ctapps/Decisions pages, you're right: the api-key would not work, according to the information provided to me by our security team. The only sites the api-key should work on are lrb.nycourts.gov/* and www.nycourts.gov/reporter/*. Are you finding this to be the case?

What do you think, @flooie? There is more detail in the email thread.
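
To make the scoping in the quoted message concrete, here is a rough sketch of how a scraper could attach the token only to URLs the court says it covers. The header name `x-api-token` is taken from this issue's title; the names `TOKEN_SCOPES`, `token_applies` and `build_headers` are hypothetical, and how juriscraper actually wires this in may differ.

```python
from typing import Dict
from urllib.parse import urlsplit

# Header name as given in the issue title.
API_TOKEN_HEADER = "x-api-token"

# (host, path prefix) pairs where the court contact says the key should work:
# lrb.nycourts.gov/* and www.nycourts.gov/reporter/*
TOKEN_SCOPES = (
    ("lrb.nycourts.gov", "/"),
    ("www.nycourts.gov", "/reporter/"),
)


def token_applies(url: str) -> bool:
    """Return True if `url` falls inside one of the token's scopes."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path or "/"  # treat a bare host as the root path
    return any(host == h and path.startswith(p) for h, p in TOKEN_SCOPES)


def build_headers(url: str, token: str) -> Dict[str, str]:
    """Return request headers, adding the token only for in-scope URLs."""
    return {API_TOKEN_HEADER: token} if token_applies(url) else {}
```

With this check, a request to https://www.nycourts.gov/reporter/slipidx/aidxtable_1.shtml would carry the header, while requests to the iapps.courts.state.ny.us/lawReporting/Search and www.nycourts.gov/ctapps/Decisions pages would not, which also avoids sending the token to hosts that do not expect it.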

grossir commented 1 week ago

The changes are working. We got our first ny opinion since May: https://www.courtlistener.com/opinion/10118643/stefanik-v-hochul/?q=court_id%3Any&type=o&order_by=dateFiled+desc&stat_Published=on