Open zstumgoren opened 3 years ago
I've made good progress on this one. I think a very minor tweak to the odyssey_site module will make it possible to scrape Santa Clara County's portal without issue.
As in #46, we need to bypass the login page in order to scrape this website. Santa Clara County uses login credentials to manage access for users like the District Attorney, Department of Social Services and other authorized public agencies. By signing into the website with an unauthorized account, users from the general public lose their ability to search records.
In the short term, it's relatively easy to circumvent this issue by skipping site.login and instead initializing chromedriver and navigating to the portal page, https://cmportal.scscourt.org/Portal/, when the user calls site.search.
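A rough sketch of that short-term workaround, in case it helps the discussion. PortalSite, _ensure_driver and driver_factory are hypothetical names for illustration, not the actual odyssey_site API; the only parts taken from this ticket are the portal URL and the idea of starting chromedriver from search rather than login.

```python
PORTAL_URL = "https://cmportal.scscourt.org/Portal/"


def _default_driver():
    # Imported lazily so the class can be exercised without Selenium installed.
    from selenium import webdriver
    return webdriver.Chrome()


class PortalSite:
    """Hypothetical stand-in for the project's site class (not the real API)."""

    def __init__(self, url=PORTAL_URL, driver_factory=_default_driver):
        self.url = url
        self._driver_factory = driver_factory
        self.driver = None

    def _ensure_driver(self):
        # Start the browser on first use and go straight to the portal page,
        # instead of doing this inside a login step.
        if self.driver is None:
            self.driver = self._driver_factory()
            self.driver.get(self.url)
        return self.driver

    def search(self, case_number):
        driver = self._ensure_driver()
        # The actual search-form interaction is omitted here because the
        # portal's form fields are not described in this ticket.
        return driver
```

Creating the driver lazily means repeated calls to search reuse one browser session, and login is simply never called for this county.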
We will need to refactor CaseDetailPage to account for the fact that Santa Clara County does not consistently post financial information for residential unlawful detainers. I've been able to patch this up by prompting the scraper to wait for another element to appear on the detail page.
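One possible shape for that wait, sketched as a plain polling loop rather than the patch I actually applied. wait_for_any and both selectors below are hypothetical placeholders; the real element ids on the detail page would need to be checked. The locator strategy "css selector" is the string value behind Selenium's By.CSS_SELECTOR, and find_elements(by, value) is the standard Selenium 4 call.

```python
import time


def wait_for_any(driver, locators, timeout=10.0, poll=0.5):
    """Return the first (by, value) locator that matches, or None on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        for by, value in locators:
            # find_elements returns an empty list when nothing matches,
            # so a missing financial section does not raise.
            if driver.find_elements(by, value):
                return (by, value)
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)


# Hypothetical selectors: the financial block first, then a fallback element
# assumed to be present on every detail page.
DETAIL_PAGE_LOCATORS = [
    ("css selector", "#financial-info"),
    ("css selector", "#case-information"),
]
```

Calling wait_for_any(driver, DETAIL_PAGE_LOCATORS) tells the caller which element showed up, so CaseDetailPage could skip parsing financial data when only the fallback is found. Selenium's own WebDriverWait would be the more conventional way to do the same thing.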
On another note: Santa Clara actually has a second court portal. They post scanned PDFs of court pleadings on this portal, so depending on the use case, it may be useful to build a scraper for this site, too.
I've updated the ticket description to reflect the phasing out of the Tyler Tech portal and the cutover to a new portal in March 2021.
Effective March 2021, the county switched to a new case information portal, per the announcement on this page:
The old portal ran on the Tyler Tech/Odyssey platform.