selfboot / AnnotatedShadowSocks

Annotated shadowsocks(python version)

Validate a hostname string #41

Open selfboot opened 7 years ago

selfboot commented 7 years ago

From Wiki:

Hostnames are composed of series of labels concatenated with dots, as are all domain names. For example, "en.wikipedia.org" is a hostname. Each label must be between 1 and 63 characters long, and the entire hostname (including the delimiting dots but not a trailing dot) has a maximum of 253 ASCII characters.

The Internet standards (Requests for Comments) for protocols mandate that component hostname labels may contain only the ASCII letters 'a' through 'z' (in a case-insensitive manner), the digits '0' through '9', and the hyphen ('-'). The original specification of hostnames in RFC 952, mandated that labels could not start with a digit or with a hyphen, and must not end with a hyphen. However, a subsequent specification (RFC 1123) permitted hostname labels to start with digits. No other symbols, punctuation characters, or white space are permitted.

In summary:

  1. Each label in the hostname is between 1 and 63 octets long.
  2. The entire hostname is at most 255 octets in DNS wire format, which corresponds to 253 characters in dotted text form (read What is the real maximum length of a DNS name? for more details).
  3. A label contains only the ASCII letters 'a' through 'z' (case-insensitive), the digits '0' through '9', and the hyphen ('-').
  4. A label does not begin or end with a hyphen.
  5. A hostname may end with at most one trailing dot.
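The 253 characters in the quote and the 255 octets in the summary describe the same limit. In DNS wire format every label is preceded by one length octet and the name ends with a zero-length root label, so the text form gains two octets on the wire. A small sketch of that arithmetic, assuming a plain ASCII hostname; `wire_length` is a hypothetical helper, not part of shadowsocks:

```python
def wire_length(hostname):
    """Return the number of octets the name occupies in DNS wire format."""
    # A single trailing dot only marks the root; it adds no label of its own.
    if hostname.endswith("."):
        hostname = hostname[:-1]
    if not hostname:
        return 1  # the root name is just the terminating zero octet
    labels = hostname.split(".")
    # One length octet per label, the label bytes, and a final zero octet.
    return sum(1 + len(label) for label in labels) + 1


# Longest legal text form: three 63-character labels and one 61-character label.
longest = ".".join(["a" * 63, "b" * 63, "c" * 63, "d" * 61])
print(len(longest))          # 253 characters of text
print(wire_length(longest))  # 255 octets on the wire
```

For "en.wikipedia.org" the same function gives 18 octets: 16 characters of text plus 2.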

We can use regex to validate a hostname string as follows:

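import re

# One label: 1 to 63 letters, digits or hyphens, with no hyphen at either end.
# A raw string keeps the backslash in \d from being treated as a string escape.
allowed = re.compile(r"(?!-)[A-Z\d-]{1,63}(?<!-)$", re.IGNORECASE)

def is_valid_hostname(hostname):
    # The limit is counted in octets, so pass an encoded (ASCII) string.
    # An empty string is never a valid hostname.
    if not hostname or len(hostname) > 255:
        return False
    # strip exactly one dot from the right, if present
    if hostname.endswith("."):
        hostname = hostname[:-1]
    # every dot-separated label must match the pattern
    return all(allowed.match(x) for x in hostname.split("."))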
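On Python 3, `str` patterns make `\d` and `re.IGNORECASE` Unicode-aware, so a few non-ASCII characters can slip through, and `$` also matches just before a trailing newline. A hedged sketch of a stricter variant, assuming Python 3 and a text (`str`) input; the names `LABEL_RE` and `is_valid_hostname_strict` are illustrative and not from the original snippet:

```python
import re

# re.ASCII keeps \d and case-insensitive matching restricted to ASCII.
LABEL_RE = re.compile(r"(?!-)[A-Z\d-]{1,63}(?<!-)", re.IGNORECASE | re.ASCII)


def is_valid_hostname_strict(hostname):
    # Measure the length in octets, so encode first; non-ASCII input is rejected.
    try:
        encoded = hostname.encode("ascii")
    except UnicodeEncodeError:
        return False
    if not encoded or len(encoded) > 255:
        return False
    # Strip exactly one trailing dot, if present.
    if hostname.endswith("."):
        hostname = hostname[:-1]
    # fullmatch must consume the whole label, so a trailing newline is not accepted.
    return all(LABEL_RE.fullmatch(label) for label in hostname.split("."))


print(is_valid_hostname_strict("en.wikipedia.org"))   # True
print(is_valid_hostname_strict("en.wikipedia.org."))  # True
print(is_valid_hostname_strict("-bad.example.com"))   # False
print(is_valid_hostname_strict("example.com.."))      # False
print(is_valid_hostname_strict("example.com\n"))      # False
```

Rejecting non-ASCII input outright is a design choice for this sketch; internationalized names would need to be converted to their ASCII form before validation.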

Explanation of the regex

Regex101 gives a very good explanation:

[image: Regex101 explanation of the pattern]
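The same pattern can also be written with `re.VERBOSE`, so that each piece carries its own comment. This is a sketch meant to be equivalent to the `allowed` pattern above; `allowed_verbose` is an illustrative name:

```python
import re

allowed_verbose = re.compile(
    r"""
    (?!-)           # negative lookahead: the label must not start with a hyphen
    [A-Z\d-]{1,63}  # 1 to 63 letters, digits, or hyphens
    (?<!-)          # negative lookbehind: the label must not end with a hyphen
    $               # end of the label
    """,
    re.IGNORECASE | re.VERBOSE,
)

# Map a few sample labels to whether the pattern accepts them.
samples = ["wikipedia", "3com", "-lead", "trail-", "a" * 64]
results = {label: bool(allowed_verbose.match(label)) for label in samples}
print(results)
```

Of these samples, only "wikipedia" and "3com" are accepted: the next two break the hyphen rule and the last one is 64 characters long.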

Snippet can be found here.

Ref

36

Wiki: hostname
Valid characters of a hostname?
Validate a hostname string
What is the real maximum length of a DNS name?
re – Regular Expressions