carsonyl / pypac

Find and use proxy auto-config (PAC) files with Python and Requests.
https://pypac.readthedocs.io
Apache License 2.0

disable tldextract caching #64

Closed: mpkuth closed this 1 year ago

mpkuth commented 1 year ago

Fetching updated TLD lists is already disabled, so the default TLD list bundled with the library is always used and there is nothing to cache. Explicitly disabling the cache prevents possible occurrences of john-kurkowski/tldextract#254, where acquiring the cache's file lock times out.


I'm running into https://github.com/john-kurkowski/tldextract/issues/254 when using pypac 0.16.0 on Windows Server 2016.

INFO  2022-11-08T22:06:29.184 Traceback (most recent call last):
  ...my code removed...
  File "C:\Program Files\...\pypac\api.py", line 86, in get_pac
    pac_candidate_urls = collect_pac_urls(from_os_settings=True, from_dns=from_dns)
  File "C:\Program Files\...\pypac\api.py", line 126, in collect_pac_urls
    pac_urls.extend(proxy_urls_from_dns())
  File "C:\Program Files\...\pypac\wpad.py", line 40, in proxy_urls_from_dns
    parsed = no_fetch_extract(local_hostname)
  File "C:\Program Files\...\tldextract\tldextract.py", line 213, in __call__
    return self.extract_str(url, include_psl_private_domains)
  File "C:\Program Files\...\tldextract\tldextract.py", line 228, in extract_str
    return self._extract_netloc(lenient_netloc(url), include_psl_private_domains)
  File "C:\Program Files\...\tldextract\tldextract.py", line 257, in _extract_netloc
    suffix_index = self._get_tld_extractor().suffix_index(
  File "C:\Program Files\...\tldextract\tldextract.py", line 302, in _get_tld_extractor
    fallback_to_snapshot=self.fallback_to_snapshot,
  File "C:\Program Files\...\tldextract\suffix_list.py", line 76, in get_suffix_lists
    hashed_argnames=["urls", "fallback_to_snapshot"],
  File "C:\Program Files\...\tldextract\cache.py", line 206, in run_and_cache
    with FileLock(lock_path, timeout=self.lock_timeout):
  File "C:\Program Files\...\filelock\_api.py", line 220, in __enter__
    self.acquire()
  File "C:\Program Files\...\filelock\_api.py", line 183, in acquire
    raise Timeout(self._lock_file)
filelock._error.Timeout: The file lock 'C:\Program Files\...\tldextract\.suffix_cache/publicsuffix.org-tlds\906337bdfc421126a1477ade77793840.tldextract.json.lock' could not be acquired.

This is a fix suggested in that issue. It makes sense to include it in this library: pypac doesn't use the dynamic TLD list feature of tldextract, so there is nothing to cache.

https://github.com/john-kurkowski/tldextract#note-about-caching
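
For reference, a minimal sketch of the idea, not the exact diff from this PR. It assumes tldextract 3.0 or later, where the `TLDExtract` constructor accepts `cache_dir=None` to turn off the on-disk cache and an empty `suffix_list_urls` to skip fetching. The helper name `make_offline_extractor` and the sample hostname are hypothetical and used only for illustration.

```python
def make_offline_extractor():
    """Build a tldextract extractor that neither fetches nor caches suffix lists."""
    # Imported inside the function so tldextract is only required when this is called.
    from tldextract import TLDExtract

    return TLDExtract(
        suffix_list_urls=(),  # no remote suffix list URLs, so nothing is downloaded
        cache_dir=None,       # on-disk cache disabled, so no cache lock file is taken
    )


if __name__ == "__main__":
    extract = make_offline_extractor()
    # Hypothetical hostname; the bundled snapshot of the suffix list is used.
    parts = extract("host.corp.example.com")
    print(parts.subdomain, parts.domain, parts.suffix)
```

With the cache disabled, the `run_and_cache` / `FileLock` path shown in the traceback above should no longer be exercised, which matters when the install directory (here under `C:\Program Files\...`) is not writable by the running process.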

Once we made this change locally, pypac worked as expected.

carsonyl commented 1 year ago

Thanks! I'll take your word for it, as I don't have a scenario to repro this issue.