atenreiro / opensquat

openSquat is an open-source tool for detecting look-alike domains by searching newly registered domains that might be impersonating legitimate domains and brands.
https://opensquat.com
GNU General Public License v3.0
685 stars, 130 forks

RegEx Trial #89

Open atenreiro opened 1 year ago

atenreiro commented 1 year ago

I am trialing a RegEx feature for openSquat.

git clone https://github.com/atenreiro/regex_opensquat

1- Install the dependencies listed in requirements.txt.
2- Modify keywords.txt.
3- In regex_multi.py, choose which domain file you want to use.
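For readers who want a feel for what regex-based keyword matching looks like, here is a minimal sketch. It is not the code from regex_multi.py: the helper names `load_patterns` and `find_lookalikes`, the one-pattern-per-line keyword format, and the sample domains are all assumptions made for illustration.

```python
import re


def load_patterns(lines):
    """Compile one regex per non-empty, non-comment line (assumed format)."""
    patterns = []
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(re.compile(line, re.IGNORECASE))
    return patterns


def find_lookalikes(patterns, domains):
    """Return (domain, pattern) pairs for the first pattern found in each domain."""
    hits = []
    for domain in domains:
        for pattern in patterns:
            if pattern.search(domain):
                hits.append((domain, pattern.pattern))
                break  # stop at the first matching pattern for this domain
    return hits


# Hypothetical stand-ins for keywords.txt and a domain file.
keyword_lines = ["# brand patterns", r"pay-?pa[l1]", r"g[o0]{2}gle"]
domain_lines = ["secure-paypa1.com", "example.org", "G00GLE-login.net"]

hits = find_lookalikes(load_patterns(keyword_lines), domain_lines)
for domain, pattern in hits:
    print(domain, "matched", pattern)
```

In a real run the two lists would be read from keywords.txt and the chosen domain file instead of being hard-coded.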

@maaaaz give it a try!

Andre

maaaaz commented 11 months ago

Hello @atenreiro,

Thanks, it seems to work well and with good performance! Just be careful with the name "match": since Python 3.10 it is a soft keyword used by the match statement.
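A small sketch of the naming point, assuming Python 3.10 or later: `match` is a soft keyword there, so using it as a variable name still works, but a more descriptive name avoids confusion with the match statement. The pattern, the sample domain, and the name `domain_hit` are illustrative only.

```python
import keyword
import re

PATTERN = r"g[o0]{2}gle"
DOMAIN = "login-g00gle.com"

# Still valid: "match" is only special at the start of a match statement.
match = re.search(PATTERN, DOMAIN)

# Clearer: a descriptive name that cannot be mistaken for the statement.
domain_hit = re.search(PATTERN, DOMAIN)

print(domain_hit.group(0))
print(keyword.iskeyword("match"))  # hard reserved words only
```

`keyword.issoftkeyword("match")` can be used on Python 3.10+ to check the soft keyword list.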

I don't think this lib would be useful here, but in case it is: https://github.com/asciimoo/exrex

Cheers!

atenreiro commented 11 months ago

I plan to ship this feature in the next release. It will allow greater flexibility in keyword matching.

I also plan to retire Jaro-Winkler and leave Levenshtein only.
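For context on the metric being kept, here is a minimal pure-Python sketch of Levenshtein edit distance (insertions, deletions, and substitutions each cost 1). It is a textbook dynamic-programming version written for illustration, not the implementation openSquat uses, and the function name `levenshtein` is an assumption.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b using two rolling rows."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


print(levenshtein("paypal", "paypa1"))  # 1
print(levenshtein("google", "gogle"))   # 1
```

A small distance between a new domain label and a brand keyword is the kind of signal a look-alike detector can flag.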

Thanks for pointing out the new Python keyword. I was not aware of it.

maaaaz commented 11 months ago

Thanks!

I have not tried them yet, but the similarity algorithms in this lib look interesting: https://github.com/typosquatter/ail-typo-squatting