elceef / dnstwist

Domain name permutation engine for detecting homograph phishing attacks, typo squatting, and brand impersonation
https://dnstwist.it
Apache License 2.0

Cyrillic domains support? #122

Closed by sam-pirson 2 years ago

sam-pirson commented 3 years ago

Hi! Thanks for your project. Could you add support for Cyrillic domains?

VALID_FQDN_REGEX = re.compile(r'(?=^.{4,253}$)(^((?!-)[a-zA-Z0-9-]{1,63}(?<!-)\.)+[a-zA-Z]{2,63}$)', re.IGNORECASE) doesn't match Cyrillic domains, for example дом.рф or its Punycode form xn--d1aqf.xn--p1ai. So I get output like this:

user@ubuntu:~$ dnstwist дом.рф
usage: /home/user/.local/bin/dnstwist [OPTION]... DOMAIN
dnstwist: error: invalid domain name: дом.рф
user@ubuntu:~$ dnstwist xn--d1aqf.xn--p1ai
usage: /home/user/.local/bin/dnstwist [OPTION]... DOMAIN
dnstwist: error: invalid domain name: xn--d1aqf.xn--p1ai

The рф TLD is listed in https://github.com/elceef/dnstwist/blob/master/dictionaries/common_tlds.dict, but it never appears in the results.
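
The rejection reported above can be reproduced in isolation. The following is a minimal sketch that copies the regex quoted in this comment; the helper name `is_valid` is hypothetical and is not part of dnstwist. It suggests two separate causes: the label class `[a-zA-Z0-9-]` has no Cyrillic letters, and the TLD class `[a-zA-Z]` rejects the digits and hyphens that a Punycode TLD such as `xn--p1ai` contains.

```python
import re

# Regex copied verbatim from the report above.
VALID_FQDN_REGEX = re.compile(
    r'(?=^.{4,253}$)(^((?!-)[a-zA-Z0-9-]{1,63}(?<!-)\.)+[a-zA-Z]{2,63}$)',
    re.IGNORECASE,
)


def is_valid(domain: str) -> bool:
    """Hypothetical helper: True if the domain matches the quoted regex."""
    return VALID_FQDN_REGEX.match(domain) is not None


for name in ("example.com", "дом.рф", "xn--d1aqf.xn--p1ai"):
    # Only the plain ASCII domain with a letters-only TLD is accepted.
    print(name, is_valid(name))
```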

elceef commented 3 years ago

Pull the most recent version, which accepts domain names with an internationalized TLD, either Unicode or Punycode encoded. Please note that the number of permutations will be significantly lower because there is currently no Cyrillic homograph mapping. Hopefully this will change in the future.
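
One way such validation could work is sketched below; this is an assumption for illustration, not the change actually made in dnstwist. It normalizes the input to ASCII with Python's built-in `idna` codec and then checks it against a relaxed pattern whose TLD part also allows digits and hyphens. The names `RELAXED_FQDN_REGEX`, `to_ascii`, and `is_valid_idn` are hypothetical.

```python
import re

# Hypothetical relaxed pattern: unlike the original, the TLD may contain
# digits and hyphens, which Punycode labels such as "xn--p1ai" require.
RELAXED_FQDN_REGEX = re.compile(
    r'(?=^.{4,253}$)'
    r'(^((?!-)[a-z0-9-]{1,63}(?<!-)\.)+(?!-)[a-z0-9-]{2,63}(?<!-)$)',
    re.IGNORECASE,
)


def to_ascii(domain: str) -> str:
    """Convert a Unicode domain to its ASCII (Punycode) form."""
    return domain.encode("idna").decode("ascii")


def is_valid_idn(domain: str) -> bool:
    """Accept a domain given either as Unicode or as Punycode."""
    try:
        ascii_domain = to_ascii(domain)
    except UnicodeError:
        # Raised by the codec for empty or over-long labels, among others.
        return False
    return RELAXED_FQDN_REGEX.match(ascii_domain) is not None


print(to_ascii("дом.рф"))
print(is_valid_idn("дом.рф"), is_valid_idn("xn--d1aqf.xn--p1ai"))
```

Validating only the ASCII form keeps a single code path for both spellings of the same domain. Python's built-in codec follows the older IDNA 2003 rules, so a project that needs IDNA 2008 behavior would likely use a third-party library instead.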