weppos / publicsuffix-ruby

Domain name parser for Ruby based on the Public Suffix List.
https://simonecarletti.com/code/publicsuffix
MIT License

Allow default rule to be overridden #120

Open unixcharles opened 7 years ago

unixcharles commented 7 years ago

It can be a bit redundant to always pass default_rule: nil when validating domains if your use case never needs it.

weppos commented 7 years ago

Hi @unixcharles, sorry for the silence. I just wanted to let you know I read the PR, and I haven't merged it because I'm a little bit torn about promoting this customization.

This library is designed to implement the PSL algorithm, and the official algorithm explains that if no rule matches, * should be used. I decided to leave it customizable on a case-by-case basis because sometimes it makes sense.

Making it super easy may seem like an open invitation to make it the default behavior, and I'm not sure I want to encourage that. In my mind, if you find yourself using this setting quite often, a wrapper seems to be an appropriate solution.
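As an illustration of the wrapper idea, here is a minimal sketch rather than anything shipped with the gem. The class name `StrictSuffix` and its `strict` helper are hypothetical. It assumes the wrapped object accepts a `default_rule:` keyword on `valid?` and `parse`, as the `default_rule: nil` usage mentioned at the top of this thread suggests. The backend is passed in so the sketch itself has no dependency on a particular gem version.

```ruby
# Hypothetical wrapper (not part of public_suffix): it forwards calls to a
# backend and passes default_rule: nil unless the caller overrides it.
class StrictSuffix
  # backend: any object whose valid?/parse accept a default_rule: keyword.
  def initialize(backend)
    @backend = backend
  end

  def valid?(name, **options)
    @backend.valid?(name, **strict(options))
  end

  def parse(name, **options)
    @backend.parse(name, **strict(options))
  end

  private

  # Caller-supplied options win over the strict default.
  def strict(options)
    { default_rule: nil }.merge(options)
  end
end
```

With the gem loaded, `StrictSuffix.new(PublicSuffix).valid?("about.html")` would then be expected to reject names whose TLD is not on the list, while `valid?("about.html", default_rule: PublicSuffix::List.default.default_rule)` opts back into the fallback for a single call.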

Would you mind sharing some extra information on where you had this need, and why you found it useful?

unixcharles commented 7 years ago

We're using the gem to validate TLDs from user input and to extract the different parts of the domain (tld, sld, trd, etc.) for publicly available services. In our context it makes no sense to allow non-public DNS names.

I had a look at the algorithm page, but it's not clear what the expected behaviour of * is. I was really confused by this behaviour change in 2.0.

weppos commented 7 years ago

> I had a look at the algorithm page, but it's not clear what the expected behaviour of * is. I was really confused by this behaviour change in 2.0.

Fair enough, the documentation may not be the most extensive. The relevant section is point 2 under "Algorithm":

If no rules match, the prevailing rule is "*".

* represents a 1-part suffix. For example:

Rule: *

    foo -> invalid, .foo is the suffix
    something.foo -> valid: something is the SLD, foo is the suffix

Rule: *.foo

    foo -> invalid, .foo is the suffix
    something.foo -> invalid, something.foo is the suffix
    other.something.foo -> valid: other is the SLD, something.foo is the suffix
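The examples above can be reproduced with a small toy matcher. This is only a sketch of the fallback behaviour, not the gem's implementation: `split_domain` and `rule_matches?` are hypothetical names, exception rules (`!`) are ignored, and rules are plain strings.

```ruby
# Toy sketch of "if no rules match, the prevailing rule is *".
# Returns [sld, suffix], or nil when the name is itself a suffix
# (or when no rule applies and default_rule is nil).
def split_domain(name, rules, default_rule: "*")
  labels = name.downcase.split(".")

  # Among matching rules, the one with the most labels prevails.
  rule = rules.select { |r| rule_matches?(r, labels) }
              .max_by { |r| r.split(".").length } || default_rule
  return nil if rule.nil?

  suffix_size = rule.split(".").length
  # The name needs at least one label to the left of the suffix.
  return nil if labels.length <= suffix_size

  [labels[-(suffix_size + 1)], labels.last(suffix_size).join(".")]
end

# A rule matches when each of its labels, compared right to left,
# equals the corresponding label of the name; "*" matches any one label.
def rule_matches?(rule, labels)
  rule_labels = rule.split(".")
  return false if rule_labels.length > labels.length

  rule_labels.reverse.zip(labels.reverse).all? { |r, l| r == "*" || r == l }
end
```

Calling `split_domain("something.foo", [])` falls back to `*` and treats `foo` as the suffix, whereas passing `default_rule: nil` makes the same call return nil, which is the stricter behaviour requested in this issue.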

pjg commented 7 years ago

Yeah, but in practice this default behaviour is quite problematic when dealing with user input. For example, we have a "URL" input field, and a user will enter about.html.

And guess what, it's a valid tld!

PublicSuffix.valid?('about.html')
=> true

Certainly not something I would expect... On the next line of my code I'm building the URL http://about.html instead of treating the input as a path and producing /about.html :/

weppos commented 7 years ago

@pjg the main problem is that you expect this library to be a validation tool, whereas it's not (the PSL is not designed to validate). I think it's simply a problem of expectations.

pjg commented 7 years ago

Sure. But there is no better tool to use for validation, so you have to make do with what there is (and PSL is an awesome project!).