john-kurkowski / tldextract

Accurately separates a URL’s subdomain, domain, and public suffix, using the Public Suffix List (PSL).
BSD 3-Clause "New" or "Revised" License

missed .eu.com #223

Closed JerryPan closed 3 years ago

JerryPan commented 3 years ago

tldextract.extract('www.elitejobs.eu.com')
ExtractResult(subdomain='www.elitejobs', domain='eu', suffix='com')

This is not correct: 'eu.com' is a valid suffix per https://publicsuffix.org/list/public_suffix_list.dat

JerryPan commented 3 years ago

I don't know why, but it also missed other suffixes, such as the 'uk.com' in 'xxx.uk.com'. Did I miss something?

john-kurkowski commented 3 years ago

I notice your suffix is toward the bottom of the PSL, in the private domains section. Private domains are not parsed by default, so you must opt into parsing them. See this section of the README.
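
To make the ICANN versus private distinction concrete, here is a minimal sketch using a tiny hand-picked suffix excerpt. It is a toy, not tldextract's implementation: `split_host`, `Parts`, the `include_private` flag, and the two suffix sets are hypothetical names invented for illustration, and the real PSL also has wildcard and exception rules that this ignores.

```python
from typing import NamedTuple

# Toy excerpt only (assumption): the real PSL is far larger. It lists
# ICANN suffixes first and privately registered suffixes further down.
ICANN_SUFFIXES = {"com", "uk", "co.uk"}
PRIVATE_SUFFIXES = {"eu.com", "uk.com"}


class Parts(NamedTuple):
    subdomain: str
    domain: str
    suffix: str


def split_host(host: str, include_private: bool = False) -> Parts:
    """Split a hostname on the longest suffix found in the toy lists."""
    suffixes = (
        (ICANN_SUFFIXES | PRIVATE_SUFFIXES) if include_private else ICANN_SUFFIXES
    )
    labels = host.lower().split(".")
    # Walk from the longest candidate suffix to the shortest.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            domain = labels[i - 1] if i > 0 else ""
            subdomain = ".".join(labels[: i - 1]) if i > 1 else ""
            return Parts(subdomain, domain, candidate)
    # No known suffix: treat the last label as the domain.
    return Parts(".".join(labels[:-1]), labels[-1], "")


print(split_host("www.elitejobs.eu.com"))
print(split_host("www.elitejobs.eu.com", include_private=True))
```

With the private set left out, 'eu.com' is never a candidate, so the match falls through to 'com' and reproduces the result reported above. In tldextract itself, the opt-in documented in the README is, to my knowledge, the `include_psl_private_domains=True` keyword on `tldextract.TLDExtract`; check the README for the form your installed version supports.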