freelawproject / juriscraper

An API to scrape American court websites for metadata.
https://free.law/juriscraper/
BSD 2-Clause "Simplified" License

`nmariana` scraper not working #1120

Closed grossir closed 2 days ago

grossir commented 1 month ago

Sentry Issue: COURTLISTENER-7Y2

gaierror: [Errno -2] Name or service not known
  File "urllib3/connection.py", line 196, in _new_conn
    sock = connection.create_connection(
  File "urllib3/util/connection.py", line 60, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "socket.py", line 976, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):

NameResolutionError: <urllib3.connection.HTTPSConnection object at 0x7f7ae32939b0>: Failed to resolve 'www.cnmilaw.org' ([Errno -2] Name or service not known)
(1 additional frame(s) were not displayed)
...
  File "urllib3/connectionpool.py", line 490, in _make_request
    raise new_e
  File "urllib3/connectionpool.py", line 466, in _make_request
    self._validate_conn(conn)
  File "urllib3/connectionpool.py", line 1095, in _validate_conn
    conn.connect()
  File "urllib3/connection.py", line 615, in connect
    self.sock = sock = self._new_conn()
  File "urllib3/connection.py", line 203, in _new_conn
    raise NameResolutionError(self.host, self, e) from e

MaxRetryError: HTTPSConnectionPool(host='www.cnmilaw.org', port=443): Max retries exceeded with url: /spm24.php (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7f7ae32939b0>: Failed to resolve 'www.cnmilaw.org' ([Errno -2] Name or service not known)"))
  File "requests/adapters.py", line 589, in send
    resp = conn.urlopen(
  File "urllib3/connectionpool.py", line 843, in urlopen
    retries = retries.increment(
  File "urllib3/util/retry.py", line 519, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]

ConnectionError: HTTPSConnectionPool(host='www.cnmilaw.org', port=443): Max retries exceeded with url: /spm24.php (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7f7ae32939b0>: Failed to resolve 'www.cnmilaw.org' ([Errno -2] Name or service not known)"))
(4 additional frame(s) were not displayed)
...
  File "cl/scrapers/management/commands/cl_scrape_opinions.py", line 413, in handle
    self.parse_and_scrape_site(mod, options)
  File "cl/scrapers/management/commands/cl_scrape_opinions.py", line 376, in parse_and_scrape_site
    site = mod.Site().parse()

mlissner commented 4 weeks ago

I'd set the priority of this one pretty low if you have others that are failing. Not a lot of important case law from NMI.

flooie commented 2 days ago

They work for me, and we have all of the 2024 opinions. Closing this.
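
For anyone who hits the same Sentry error again: the traceback above bottoms out in `socket.getaddrinfo`, so the failure is a DNS lookup problem rather than a parsing problem in the scraper. A minimal sketch for checking that separately is below. The helper name `hostname_resolves` and its `resolver` parameter are made up for illustration and are not part of juriscraper or CourtListener.

```python
import socket


def hostname_resolves(host, port=443, resolver=socket.getaddrinfo):
    """Return True if `host` resolves, False if the lookup raises gaierror.

    `resolver` is a hypothetical injection point (it defaults to the real
    socket.getaddrinfo) so the check can be exercised without network access.
    """
    try:
        # Same kind of lookup urllib3 performs before opening the connection.
        resolver(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # e.g. "[Errno -2] Name or service not known", as in the traceback.
        return False
    return True
```

Calling `hostname_resolves("www.cnmilaw.org")` from the machine that runs the scrapers should show whether the name resolves there at that moment. If it returns `True` while Sentry still reports the error, the failure was probably a transient DNS issue on the scraping host, which would be consistent with the scraper working when run later.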