biglocalnews / court-scraper

Scrapers for U.S. county court sites.
ISC License

Add CA Santa Clara search #45

Open zstumgoren opened 3 years ago

zstumgoren commented 3 years ago

Effective March 2021, the county switched to a new case information portal, per the announcement on this page:

Important Notice: We will be decommissioning our Tyler Case Information Portal effective March 1, 2021. All existing Tyler portal users should now use our new Case Information Portal located at the following address https://portal.scscourt.org . If you are a Justice Partner who does not have access to our new Portal, please send an email to ssweb@scscourt.org with your request.

The old portal ran on the Tyler Tech/Odyssey platform.

DiPierro commented 3 years ago

I've made good progress on this one. I think a very minor tweak to the odyssey_site module will make it possible to scrape Santa Clara County's portal without issue.

As in #46, we need to bypass the login page in order to scrape this website. Santa Clara County uses login credentials to manage access for users like the District Attorney, Department of Social Services and other authorized public agencies. Members of the general public who sign in with an account that lacks this authorization lose the ability to search records.

In the short term, it's relatively easy to circumvent this issue by skipping site.login and instead initializing chromedriver and navigating to the portal page, https://cmportal.scscourt.org/Portal/, when the user calls site.search.
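
A minimal sketch of that workaround, not the actual odyssey_site code. The class name `SantaClaraSite`, the `driver_factory` argument and the `_lookup` hook are hypothetical names chosen for illustration; the only thing taken from this thread is the portal URL and the idea of opening the browser inside `search` rather than `login`.

```python
PORTAL_URL = "https://cmportal.scscourt.org/Portal/"


class SantaClaraSite:
    """Hypothetical site class that skips login and opens the portal on search."""

    def __init__(self, driver_factory):
        # driver_factory is any zero-argument callable that returns a
        # WebDriver-like object with a .get(url) method (assumption).
        self._driver_factory = driver_factory
        self.driver = None

    def login(self):
        # Deliberately a no-op: signing in would remove public search access.
        return None

    def search(self, case_numbers):
        # Start the browser lazily, on the first search call.
        if self.driver is None:
            self.driver = self._driver_factory()
        # Go straight to the public portal page instead of the login page.
        self.driver.get(PORTAL_URL)
        return [self._lookup(number) for number in case_numbers]

    def _lookup(self, case_number):
        # Placeholder for the form-filling and parsing step, which is
        # specific to the real scraper and not shown here.
        raise NotImplementedError
```

With Selenium installed, `driver_factory` could simply be `selenium.webdriver.Chrome`, and a subclass would supply `_lookup`.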

We will need to refactor CaseDetailPage to account for the fact that Santa Clara County does not consistently post financial information for residential unlawful detainers. I've been able to patch this up by prompting the scraper to wait for another element to appear on the detail page.
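
Roughly, the patch amounts to waiting for whichever of several elements shows up first, instead of always waiting for the financial section. The sketch below is an illustration under assumptions, not the CaseDetailPage code: `wait_for_any` is a hypothetical helper, and the selectors in the usage note are placeholders rather than the portal's real markup. It only relies on the driver exposing `find_elements(by, value)`, as Selenium drivers do.

```python
import time


def wait_for_any(driver, selectors, timeout=10.0, poll=0.25):
    """Return the first CSS selector in `selectors` that matches an element.

    Polls the page until one selector matches or `timeout` seconds pass,
    in which case TimeoutError is raised.
    """
    deadline = time.monotonic() + timeout
    while True:
        for selector in selectors:
            # "css selector" is the string value of Selenium's By.CSS_SELECTOR.
            if driver.find_elements("css selector", selector):
                return selector
        if time.monotonic() >= deadline:
            raise TimeoutError(f"none of {selectors!r} appeared in {timeout}s")
        time.sleep(poll)
```

A detail page could then call something like `wait_for_any(driver, ["#financial-info", "#party-info"])` (both selectors are made up) and only parse financial data when the first selector is the one returned.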

On another note: Santa Clara actually has a second court portal, where the court posts scanned PDFs of court pleadings. Depending on the use case, it may be useful to build a scraper for that site, too.

zstumgoren commented 3 years ago

I've updated the ticket description to reflect the phasing out of the Tyler Tech portal and the cutover to the new portal in March 2021.

DiPierro commented 3 years ago

You can access HTML files for CV (civil) and SC (small claims) cases from 2016 through February 2021 here.