john-kurkowski / tldextract

Accurately separates a URL’s subdomain, domain, and public suffix, using the Public Suffix List (PSL).
BSD 3-Clause "New" or "Revised" License

Suffix issue #268

Closed. avramd02 closed this issue 2 years ago

avramd02 commented 2 years ago

When trying to split certain URLs, it's returning the wrong suffix.

tldextract.extract("http://blahblah.uk.com/blah/blah")
ExtractResult(subdomain='blahblah', domain='uk', suffix='com')

In reality, it should return domain = blahblah and suffix = uk.com, since uk.com is in the Public Suffix List.

john-kurkowski commented 2 years ago

That suffix is pretty far down the list, on line 10970, so it's in the private domains section, which tldextract ignores by default. See the FAQ.
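
To illustrate why the section matters, here is a minimal sketch of longest-suffix matching with and without the private entries. It is not tldextract's implementation: the two suffix sets are tiny hand-picked stand-ins for the PSL sections, and `split_host`, `Parts`, and `include_private` are hypothetical names used only for this example. In tldextract itself, the corresponding switch is the `include_psl_private_domains` option.

```python
from typing import NamedTuple
from urllib.parse import urlsplit

# Assumption: tiny stand-ins for the two PSL sections. The real list is far
# larger and also contains wildcard and exception rules, ignored here.
ICANN_SUFFIXES = {"com", "uk", "co.uk"}
PRIVATE_SUFFIXES = {"uk.com"}


class Parts(NamedTuple):
    subdomain: str
    domain: str
    suffix: str


def split_host(url: str, include_private: bool = False) -> Parts:
    """Split a URL's host on the longest suffix found in the enabled sets."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    suffixes = ICANN_SUFFIXES | (PRIVATE_SUFFIXES if include_private else set())
    # Walk from the longest candidate suffix to the shortest.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            domain = labels[i - 1] if i > 0 else ""
            subdomain = ".".join(labels[: i - 1]) if i > 1 else ""
            return Parts(subdomain, domain, candidate)
    # No known suffix: treat the last label as the domain.
    return Parts(".".join(labels[:-1]), labels[-1], "")


url = "http://blahblah.uk.com/blah/blah"
print(split_host(url))                        # private section ignored
print(split_host(url, include_private=True))  # private section honored
```

With the private set ignored, the longest match is `com`, which reproduces the result reported above; with it enabled, `uk.com` wins and `blahblah` becomes the domain.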

avramd02 commented 2 years ago

I see, thanks.