gruns / furl

🌐 URL parsing and manipulation made easy.

Extra slash in string representation. #103

Open ejohb opened 6 years ago

ejohb commented 6 years ago

Hi,

When I take a scheme-less URL and set a scheme, the resulting string has an extra slash.

>>> from furl import furl
>>> furl('www.example.com').set(scheme='http').tostr()
'http:///www.example.com'

Is that by design? Can I get rid of it?

gruns commented 6 years ago

Absolutely not by design. Wonderful catch.

I'll rectify this shortly.

gruns commented 6 years ago

I, too, got confused. This is intentional behavior, albeit a bit of an initial head scratcher.

See https://github.com/gruns/furl/issues/85 and my detailed answer here https://github.com/gruns/furl/issues/85#issuecomment-293483792.

Long story short: www.example.com looks like a domain to you and me, but without URL delimiters (e.g. ://, /, ?, etc.) furl can't consistently determine whether www.example.com is a domain or a path. It can be either. It's ambiguous.

Therefore, for consistent behavior, furl parses all such strings as paths.
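The same ambiguity can be seen in Python's standard-library urllib.parse, which makes for a quick illustration. This is only an analogy and not furl's implementation: without a leading // the string ends up in the path, and with it the string becomes the network location.

```python
from urllib.parse import urlsplit

# No delimiters: nothing marks this string as a host, so it is parsed as a path.
bare = urlsplit('www.example.com')
# bare.netloc is '' and bare.path is 'www.example.com'

# A leading '//' marks what follows as the authority (host) component.
delimited = urlsplit('//www.example.com')
# delimited.netloc is 'www.example.com' and delimited.path is ''
```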

So, to create your URL as intended, explicitly set www.example.com as the host:

>>> from furl import furl
>>> furl().set(scheme='http', host='www.example.com').tostr()
'http://www.example.com'

Does that answer your question?

gruns commented 5 years ago

Resolution of this issue is tied to the resolution of #110.

pramttl commented 3 years ago

I ran into the same confusion with triple slashes. Just a usability thought:

Could furl support an is_domain boolean parameter that tells furl to treat the URL as a domain and not a path?

furl(url, is_domain=True, scheme='https')

That way, setting a scheme on a URL that does not have one would become easier, especially for domains.
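Until something like that exists, the same effect can be approximated outside furl. Below is a minimal sketch of a hypothetical helper, ensure_scheme, that is not part of furl: it uses plain string handling and assumes any input without a :// delimiter should be read as a domain (optionally followed by a path).

```python
def ensure_scheme(url, scheme='https'):
    """Hypothetical helper (not part of furl): prepend a scheme to a
    scheme-less URL, treating the leading component as a domain."""
    if '://' in url:
        # A scheme delimiter is already present; leave the URL untouched.
        return url
    if url.startswith('//'):
        # Protocol-relative form such as '//www.example.com'.
        return scheme + ':' + url
    # Bare string such as 'www.example.com': assume it starts with a domain.
    return scheme + '://' + url
```

The returned string contains the :// delimiter, so passing it to furl(...) lets furl parse www.example.com as the host rather than as a path.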