john-kurkowski / tldextract

Accurately separates a URL’s subdomain, domain, and public suffix, using the Public Suffix List (PSL).
BSD 3-Clause "New" or "Revised" License

cn.com is in the PSL list however is not being treated as a suffix #294

Closed by ITJamie 1 year ago

ITJamie commented 1 year ago

cn.com is in the PSL, but it is not being treated as a suffix:

>>> import tldextract
>>> tldextract.extract("test.cn.com")
ExtractResult(subdomain='test', domain='cn', suffix='com')

ITJamie commented 1 year ago

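Figured it out: cn.com is in the private section of the PSL, which tldextract ignores by default. Setting the include_psl_private_domains flag fixes it: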

import tldextract

# Opt in to the PSL's private section so that cn.com is treated as a suffix.
tldextractor = tldextract.TLDExtract(include_psl_private_domains=True)
tld_data = tldextractor("test.cn.com")
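
For anyone wondering why the flag changes the result, here is a minimal sketch of longest-suffix matching over two suffix sets. It is a toy illustration, not tldextract's actual implementation: `split_host`, `ICANN_SUFFIXES`, and `PRIVATE_SUFFIXES` are hypothetical names, and the tiny hand-picked sets only stand in for the two sections of the real PSL.

```python
# Toy illustration only; NOT tldextract's implementation.
# Two tiny hand-picked sets stand in for the PSL's two sections.
ICANN_SUFFIXES = {"com", "cn", "co.uk"}
PRIVATE_SUFFIXES = {"cn.com"}  # cn.com sits in the PSL's private section


def split_host(host, include_private=False):
    """Return (subdomain, domain, suffix) using the longest matching suffix."""
    suffixes = ICANN_SUFFIXES | PRIVATE_SUFFIXES if include_private else ICANN_SUFFIXES
    labels = host.lower().split(".")
    # Try candidate suffixes from longest to shortest.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            domain = labels[i - 1] if i > 0 else ""
            subdomain = ".".join(labels[: i - 1]) if i > 1 else ""
            return subdomain, domain, candidate
    # No known suffix: treat the last label as the domain.
    return ".".join(labels[:-1]), labels[-1], ""


print(split_host("test.cn.com"))                        # ('test', 'cn', 'com')
print(split_host("test.cn.com", include_private=True))  # ('', 'test', 'cn.com')
```

With the private set excluded, the longest match is `com`, which reproduces the result from the original report. Once the private set is included, `cn.com` wins as the longer match and `test` becomes the registered domain.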