biglocalnews / court-scraper

Scrapers for U.S. county court sites.
ISC License

MacOS install issues #180

Open mikelotz opened 1 year ago

mikelotz commented 1 year ago

I installed court-scraper following your docs:

$ pipenv install court-scraper
Installing court-scraper...
Installing dependencies from Pipfile.lock (08a108)...
To activate this project's virtualenv, run pipenv shell.
Alternatively, run a command inside the virtualenv with pipenv run.

Then I verified that the court-scraper command works:

$ court-scraper info
Available scrapers:
 * CA - Kern (ca_kern)
 * CA - Napa (ca_napa)
 * CA - San Mateo (ca_san_mateo)
 * CA - Sonoma (ca_sonoma)
 * GA - Chatham (ga_chatham)
 * GA - Dekalb (ga_dekalb)
 . . . 
 * WI - Winnebago (wi_winnebago)
 * WI - Wood (wi_wood)

NOTE: Scraper IDs (in parentheses) should be used with the search command's --place-id argument.

When I try to run a search:

$ court-scraper search -p wa_adams -c 08-2-00866-9

I get the following errors:

Traceback (most recent call last):
  File "/.pyenv/versions/3.9.1/bin/court-scraper", line 8, in <module>
    sys.exit(cli())
  File "/.pyenv/versions/3.9.1/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/.pyenv/versions/3.9.1/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/.pyenv/versions/3.9.1/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/.pyenv/versions/3.9.1/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/.pyenv/versions/3.9.1/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/.pyenv/versions/3.9.1/lib/python3.9/site-packages/court_scraper/cli.py", line 74, in search
    results = runner.search(**kwargs)
  File "/.pyenv/versions/3.9.1/lib/python3.9/site-packages/court_scraper/platforms/odyssey/runner.py", line 39, in search
    site = SiteKls(
  File "/.pyenv/versions/3.9.1/lib/python3.9/site-packages/court_scraper/platforms/odyssey/site.py", line 16, in __init__
    self.driver = self._init_chrome_driver(headless=headless)
  File "/.pyenv/versions/3.9.1/lib/python3.9/site-packages/court_scraper/base/selenium_site.py", line 14, in _init_chrome_driver
    driver = webdriver.Chrome(
  File "/.pyenv/versions/3.9.1/lib/python3.9/site-packages/selenium/webdriver/chrome/webdriver.py", line 81, in __init__
    super().__init__(
  File "/.pyenv/versions/3.9.1/lib/python3.9/site-packages/selenium/webdriver/chromium/webdriver.py", line 103, in __init__
    self.service.start()
  File "/.pyenv/versions/3.9.1/lib/python3.9/site-packages/selenium/webdriver/common/service.py", line 90, in start
    self._start_process(self.path)
  File "/.pyenv/versions/3.9.1/lib/python3.9/site-packages/selenium/webdriver/common/service.py", line 199, in _start_process
    self.process = subprocess.Popen(
  File "/.pyenv/versions/3.9.1/lib/python3.9/subprocess.py", line 947, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/.pyenv/versions/3.9.1/lib/python3.9/subprocess.py", line 1694, in _execute_child
    and os.path.dirname(executable)
  File "/.pyenv/versions/3.9.1/lib/python3.9/posixpath.py", line 152, in dirname
    p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not NoneType

I'm not very familiar with Python, so I don't know what to look for to fix this issue.

zstumgoren commented 1 year ago

@mikelotz That's an admittedly unfriendly error that crops up when court-scraper can't find the chromedriver binary on your PATH.
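If it helps, here is a minimal sketch for checking whether a chromedriver binary is visible on your PATH, and for seeing where the final TypeError in the traceback comes from. It uses only the Python standard library. The helper names `find_chromedriver` and `path_error_message` are made up for illustration and are not part of court-scraper or Selenium, and the sketch assumes the binary is named `chromedriver`.

```python
import os
import shutil


def find_chromedriver(name="chromedriver"):
    """Return the full path to the driver binary if it is on PATH, else None.

    `name` is an assumption: adjust it if your driver binary is named differently.
    """
    return shutil.which(name)


def path_error_message(executable):
    """Return the error text os.fspath raises for a non-path value, or None.

    The last frame of the traceback calls os.fspath on the executable path;
    a value of None makes that call raise TypeError.
    """
    try:
        os.fspath(executable)
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    location = find_chromedriver()
    if location is None:
        print("chromedriver was not found on PATH")
        print("A missing path (None) fails with:", path_error_message(None))
    else:
        print("chromedriver found at", location)
```

If the script reports that chromedriver was not found, installing it and making sure its directory is on your PATH should clear this particular TypeError, though not the Odyssey bugs mentioned below.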

Unfortunately, fixing that particular issue won't get you over the finish line. We have two outstanding bugs (#181 and #153) that need to be fixed before Odyssey functionality is restored.

Once those bugs are ironed out, you'll need to sign up for the paid Anti-captcha service and configure your local machine with an API key from that service. It's not expensive, but alas, it is an added hurdle. Details on the service and configuration for court-scraper can be found here: https://court-scraper.readthedocs.io/en/latest/install.html#captcha-protected-sites

Sorry for all the headaches and please ping back if you have any questions!