Open zstumgoren opened 3 years ago
@oumnia24 I've updated the `san_mateo_indexer.py` script to step through pages based on the presence of the "Next 20" link. This required reworking a fair bit of logic, but on the whole I think the approach is more straightforward than trying to use the numbered page links.
You'll need to reapply a function (`printing`?) to actually extract the case data. I've marked the [appropriate location](https://github.com/biglocalnews/court-scraper-etl/blob/main/san_mateo_indexer.py#L89) with a `TODO`. Also, be on the lookout for downstream bugs related to writing data to CSV. Ping back if you have any questions.
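For anyone following along, here is a minimal sketch of the "follow the Next 20 link until it disappears" idea. It is not the code in `san_mateo_indexer.py`: the names `NextLinkFinder`, `iter_pages`, and the `fetch` callable are hypothetical, and it assumes the link is a plain `<a>` tag whose text is exactly "Next 20". The real script may drive a browser instead.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class NextLinkFinder(HTMLParser):
    """Records the href of the first <a> whose text equals `label`."""

    def __init__(self, label="Next 20"):
        super().__init__()
        self.label = label
        self.next_href = None
        self._href = None  # href of the <a> currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = "".join(self._text).strip()
            if self.next_href is None and text == self.label:
                self.next_href = self._href
            self._href = None


def iter_pages(fetch, start_url, label="Next 20", max_pages=1000):
    """Yield (url, html) per results page until no `label` link remains.

    `fetch` is a hypothetical callable that takes a URL and returns HTML.
    The `seen` set and `max_pages` guard against pagination loops.
    """
    url = start_url
    seen = set()
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        html = fetch(url)
        yield url, html
        finder = NextLinkFinder(label)
        finder.feed(html)
        url = urljoin(url, finder.next_href) if finder.next_href else None
```

The case-extraction step (the `TODO`) would run on each `html` value the generator yields. Stopping on the absence of the link means the loop never needs to know the total page count up front.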
Hi Serdar,
Thank you for taking the time to do that! I’ll let you know if I have any questions.
Best, Oumnia.
San Mateo County provides a basic Case index that allows search by date, which should allow us to easily compile a historical list of cases and case types.
We'll need the CAPTCHA-protected Odyssey site (#48) to get access to case details.
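As a rough illustration of compiling the historical list by date, the sketch below walks a date range, runs one search per day, and writes the results to CSV. Everything named here is an assumption for illustration: `search_by_date` stands in for whatever actually queries the index, and the column names `filing_date`, `case_number`, and `case_type` are placeholders rather than the site's real fields.

```python
import csv
from datetime import date, timedelta


def iter_dates(start, end):
    """Yield each calendar date from start through end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def compile_index(search_by_date, start, end):
    """Yield one dict per case found between start and end.

    `search_by_date` is a hypothetical callable: given a date, it returns
    an iterable of dicts describing the cases listed for that day.
    """
    for day in iter_dates(start, end):
        for case in search_by_date(day):
            yield {"filing_date": day.isoformat(), **case}


def write_cases(rows, out_file,
                fieldnames=("filing_date", "case_number", "case_type")):
    """Write case dicts to CSV and return the number of rows written.

    Unknown keys are ignored and missing ones are left blank, so one
    oddly shaped row does not abort the whole export.
    """
    writer = csv.DictWriter(out_file, fieldnames=fieldnames,
                            extrasaction="ignore", restval="")
    writer.writeheader()
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count
```

When writing to a real file, open it with `newline=""` as the `csv` module documentation recommends, which avoids blank lines between rows on Windows.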