gruns / furl

🌐 URL parsing and manipulation made easy.

Any way to isolate the subdomain? #106

Open ejohb opened 5 years ago

ejohb commented 5 years ago

Hi,

Is there a furl attribute for the subdomains in URLs - e.g. the ['www'] and ['test','subdomain'] above?

gruns commented 5 years ago

Unfortunately there's currently no reliable way to isolate the subdomain because it's non-trivial to determine the TLD (e.g. .com, .co.uk, etc). See

In the future, I'll add full TLD support to furl.

In the interim, splitting the host on periods (.) and rejoining the leading parts is a straightforward way to isolate the subdomain, assuming a simple, one-token TLD (e.g. .com, .net, etc). Note that this breaks for multi-token TLDs such as .co.uk.

>>> from furl import furl
>>> f = furl('http://test.subdomain.example.com/')
>>> tld = f.host.split('.')[-1]
>>> tld
'com'
>>> subdomain = '.'.join(f.host.split('.')[:-2])
>>> subdomain
'test.subdomain'
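
For multi-token TLDs, the same split-and-rejoin idea can be extended by checking the tail of the host against a set of known suffixes. The following is only a minimal sketch, not part of furl: `split_host` and `KNOWN_SUFFIXES` are hypothetical names, and the tiny hard-coded suffix set is an assumption standing in for a real list such as the Public Suffix List.

```python
# Hypothetical, deliberately tiny suffix set for illustration only.
# A real implementation would load a full list (e.g. the Public Suffix List).
KNOWN_SUFFIXES = {"com", "net", "org", "co.uk"}


def split_host(host, suffixes=KNOWN_SUFFIXES):
    """Return (subdomain, domain, suffix) for a host name (hypothetical helper)."""
    labels = host.lower().rstrip(".").split(".")
    # Try the longest candidate suffix first so 'co.uk' is preferred over 'uk'.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            domain = labels[i - 1] if i > 0 else ""
            subdomain = ".".join(labels[: max(i - 1, 0)])
            return subdomain, domain, candidate
    # Unknown suffix: fall back to assuming a simple, one-token TLD.
    suffix = labels[-1]
    domain = labels[-2] if len(labels) > 1 else ""
    return ".".join(labels[:-2]), domain, suffix
```

With furl, the host would be passed in as `split_host(f.host)`. The result is only as good as the suffix set, which is why a maintained list is preferable to a hard-coded one.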

Does this suffice for your needs for now?

bukowa commented 5 years ago

https://github.com/john-kurkowski/tldextract

gruns commented 5 years ago

Resolution of this issue is tied to the resolution of #110.