zendframework / zend-validator

Validator component from Zend Framework
BSD 3-Clause "New" or "Revised" License
181 stars 136 forks

Domainkey: Underscore in hostname #102

Open 038291 opened 8 years ago

038291 commented 8 years ago

default._domainkey.example.com

$regexChars = [0 => '/^[a-z0-9\x2d]{1,63}$/i'];

[hostnameInvalidHostnameSchema]: The input appears to be a DNS hostname but cannot match against hostname schema for TLD 'COM'
[hostnameInvalidLocalName]: The input does not appear to be a valid local network name
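To make the failure concrete, here is a small check of the per-label pattern quoted above against each label of the example name. The loop and the variable names are only illustrative; the real validator does considerably more than this.

```php
<?php
// The per-label pattern quoted in the report: letters, digits and "-" (\x2d) only.
$pattern = '/^[a-z0-9\x2d]{1,63}$/i';

$results = [];
foreach (explode('.', 'default._domainkey.example.com') as $label) {
    // preg_match() returns 1 on a match and 0 when the label does not match.
    $results[$label] = preg_match($pattern, $label) === 1;
}

foreach ($results as $label => $ok) {
    echo $label, ': ', $ok ? 'ok' : 'rejected', PHP_EOL;
}
```

Only the `_domainkey` label fails the pattern, because the underscore is not in the character class.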

dapphp commented 8 years ago

I second this. Maybe some sort of option to permit this pattern?

CptChaos commented 7 years ago

Using underscores in hostnames can cause more problems at the network level. See also 'Restrictions on valid hostnames' at https://en.wikipedia.org/wiki/Hostname.

Also, the underscore is actually the "uppercase" of "-", since it is what you get when you press Shift plus "-". ;) All hostnames are lowercase.

038291 commented 7 years ago

Why is it impossible to implement an optional check?

038291 commented 7 years ago

@bacinsky All well-known DNS servers have supported underscores in hostnames for a long time, but developers keep referring to the RFC.

CptChaos commented 7 years ago

@bacinsky You should have known in the first place that underscores are illegal characters in (sub)domain names, so you can't expect a whole framework to bend for it just because you want it to and because you lack proper implementations. You really should read the Wikipedia article I linked before, as it clearly explains why an underscore can't be used correctly in hostnames.

CptChaos commented 7 years ago

@bacinsky If you've read the Wikipedia article, you've seen that underscores can be used for something else within DNS. More specifically:

While a hostname may not contain other characters, such as the underscore character (_), other DNS names may contain the underscore. Systems such as DomainKeys and service records use the underscore as a means to assure that their special character is not confused with hostnames. For example, _http._sctp.www.example.com specifies a service pointer for an SCTP capable webserver host (www) in the domain example.com. Note that some applications (e.g. Microsoft Internet Explorer) won't work correctly if any part of the hostname contains an underscore character.
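The distinction drawn in that quote can be sketched as two different label patterns. This is a simplified illustration under stated assumptions: `labelsMatch`, `HOSTNAME_LABEL` and `DNS_NAME_LABEL` are hypothetical names that are not part of zend-validator, the looser pattern only adds the underscore (the quote does not list which other characters DNS names permit), and overall length limits and TLD checks are left out.

```php
<?php
// Hypothetical helper: true when every dot-separated label matches $pattern.
function labelsMatch(string $name, string $pattern): bool
{
    foreach (explode('.', $name) as $label) {
        if (preg_match($pattern, $label) !== 1) {
            return false;
        }
    }
    return true;
}

// Hostname labels: letters, digits and hyphens, with no hyphen at either end.
const HOSTNAME_LABEL = '/^(?!-)[a-z0-9-]{1,63}(?<!-)\z/i';

// Looser labels for other DNS names (assumption: the same set plus underscore).
const DNS_NAME_LABEL = '/^[a-z0-9_-]{1,63}\z/i';

var_dump(labelsMatch('www.example.com', HOSTNAME_LABEL));
var_dump(labelsMatch('default._domainkey.example.com', HOSTNAME_LABEL));
var_dump(labelsMatch('default._domainkey.example.com', DNS_NAME_LABEL));
var_dump(labelsMatch('_http._sctp.www.example.com', DNS_NAME_LABEL));
```

Under these assumptions, the DomainKeys and service-record names pass only the looser check, while an ordinary hostname passes both.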

So my actual advice is to implement things properly, the way they were intended, to avoid problems later on. As hostnames officially can't have underscores, I'd return an error stating that underscores are not allowed, rather than look for options to work around it and hope that fixes it (which is what you did).

038291 commented 7 years ago

@CptChaos "underscore can't be correctly used in hostnames" - please prove it. DKIM, DMARC, SRV and TLSA all work properly.

CptChaos commented 7 years ago

@038291 You've actually just proven my point exactly. DKIM, SRV, DMARC, TLSA and the like aren't hostnames. For more on the subject, read the linked Wikipedia article and read up on DNS: how it works and what its (im)possibilities are.

038291 commented 7 years ago

@CptChaos So it turns out that a separate validator for DNS RRsets needs to be written. The Hostname validator isn't suitable for these purposes. Thanks.

ps: http://domainkeys.sourceforge.net/underscore.html

Xerkus commented 7 years ago

This looks like a case for a new Domainkey validator

crzdeveloper commented 7 years ago

I solved this problem in a tricky way. I created an FqdnValidator. It has two options: allowUnderscore and allowWildcard. The FqdnValidator internally uses the Zend Hostname validator.

First, the FQDN string (e.g. abc.example.com) is split into the TLD (com), domain (example) and subdomain (abc) parts. This goes slightly against the official terminology, but the important point is that the underscore and the wildcard (*) characters are only allowed for some DNS records, and only in the subdomain part.

So, when the allowUnderscore option is enabled, the underscore is replaced with an 'x' character in all subdomains. The domain part (i.e. 'example') is not affected. The string is then passed to the Hostname validator.

When the allowWildcard option is enabled, each subdomain part is checked to see whether it consists of exactly one character and that character is the asterisk (*). If so, the asterisk is replaced with 'x' and the string is passed to the Hostname validator.

The hardest thing to do here is to split the FQDN into its parts. For instance, some TLDs contain a dot, so that for both example.de.com and example.com the domain name part is 'example'. You'll need to maintain a list of TLDs to do this splitting.

Sorry, but I cannot share my solution, because the FqdnValidator is a Symfony validator and the list of TLDs is loaded dynamically from the DB.
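For readers who want to try the idea, the substitution step described above might look roughly like the following. This is an illustrative sketch, not the FqdnValidator itself: `isValidFqdn`, `$hostnameCheck` (a stand-in for the wrapped hostname validator) and `$suffixes` (a stand-in for the maintained TLD list) are all assumptions, and suffix matching here is case-sensitive for brevity.

```php
<?php
// Illustrative sketch only. $hostnameCheck and $suffixes are assumptions.
function isValidFqdn(
    string $fqdn,
    callable $hostnameCheck,
    array $suffixes,
    bool $allowUnderscore = false,
    bool $allowWildcard = false
): bool {
    // Try the longest suffix first, so "de.com" wins over "com".
    usort($suffixes, function ($a, $b) { return strlen($b) - strlen($a); });
    $suffix = null;
    foreach ($suffixes as $candidate) {
        if (substr($fqdn, -strlen($candidate) - 1) === '.' . $candidate) {
            $suffix = $candidate;
            break;
        }
    }
    if ($suffix === null) {
        return false;
    }

    // Left of the suffix: the last label is the domain, the rest are subdomains.
    $labels = explode('.', substr($fqdn, 0, -strlen($suffix) - 1));
    $domain = array_pop($labels);

    foreach ($labels as $i => $label) {
        if ($allowWildcard && $label === '*') {
            $labels[$i] = 'x';                          // lone asterisk label
        } elseif ($allowUnderscore) {
            $labels[$i] = str_replace('_', 'x', $label); // subdomains only
        }
    }

    // The domain label is passed through unchanged.
    $labels[] = $domain;
    return (bool) $hostnameCheck(implode('.', $labels) . '.' . $suffix);
}

// Stand-in check built from the per-label pattern quoted in the report.
$check = function (string $host): bool {
    foreach (explode('.', $host) as $label) {
        if (preg_match('/^[a-z0-9\x2d]{1,63}$/i', $label) !== 1) {
            return false;
        }
    }
    return true;
};

var_dump(isValidFqdn('default._domainkey.example.com', $check, ['com', 'de.com'], true));
var_dump(isValidFqdn('*.example.com', $check, ['com'], false, true));
var_dump(isValidFqdn('my_domain.com', $check, ['com'], true));
```

The substituted string is only handed to the inner check and never returned, so labels that already contain an 'x' are unaffected. In a real setup the callable would presumably wrap the Hostname validator's `isValid()` method instead of the stand-in closure.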

CptChaos commented 7 years ago

@038291 Sorry for the late response. On the page you linked, you can also read:

'Names that are not host names can consist of any printable ASCII character.'

So yes, hostnames (like for web addresses) actually can't have a _ in them. ;)

@crzdeveloper How would you solve this when a subdomain has an x in it, like when you have example.example2.tld? ;) I would also not use a DB to store TLDs, but rather config files or something similar. It's not likely that TLDs are going to change much. ;) Keeping them out of a DB also makes it a bit faster, as you don't have to open another connection, query the database and wait for the results.

weierophinney commented 4 years ago

This repository has been closed and moved to laminas/laminas-validator; a new issue has been opened at https://github.com/laminas/laminas-validator/issues/27.