Ultimate-Hosts-Blacklist / dev-center

The place to talk about our infrastructure or everything related to the Ultimate Hosts Blacklist project.
MIT License

Maybe you can skip the generation of `www.` on some TLD #29

Closed by funilrys 4 years ago

funilrys commented 5 years ago

@funilrys let me hear your thoughts on something. It's about adding the www. prefix to every domain, in particular domains that end with .info, .online, .biz, .top, etc. They're all either malicious or adware, but they are always activated in the background, and from my observations so far (going through the source of many pages) they never use www. I consider them more like subdomains than anything else. Maybe you can skip the generation of www. on those if it will save some resources. Here is one example: [image: Capture]

Originally posted by @dnmTX in https://github.com/Ultimate-Hosts-Blacklist/dev-center/issues/28#issuecomment-478798528

dnmTX commented 5 years ago

Well, if it's cheaper to generate www. on all of them, then there is no point in skipping it just for the sake of speed or filtering performance. Another upside that I see is fewer entries for the resolvers to deal with. Still, as you said, deeper research is needed on this one to avoid false positives. I guess we can leave it for future consideration for now.

funilrys commented 5 years ago

Well, as things stand it is cheap, and it may become even cheaper in the future.

Indeed, I had to stop the development of the next management tool in favor of PyFunceble 2.x.x (cf.), which in turn will let us use its (powerful !?) API instead of the CLI.

That way we can take advantage of the multiprocessing module, which will help us increase our speed drastically.

But yeah, 2.x.x is still a draft and, as I previously stated, it is written in my free time :smile: I hope to have it in dev soon so I can finish the next management tool.

As stated, deeper research is needed on this :smile_cat:

funilrys commented 4 years ago

Closing as it is really cheap.

spirillen commented 4 years ago

Why not do a real lookup to see whether it actually exists...

dig "www.$url" | grep -vF 'NXDOMAIN' | grep -F 'IN' | awk '/^#|^$|;|gtld-server/{ next };{ printf ("%s\n",$1) }' <= if true add www.
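For comparison, the same idea can be sketched in Python with only the standard library. This is a minimal sketch and not the project's code: the names `resolves` and `www_variant_if_live` are hypothetical, and it asks the system resolver through `socket.getaddrinfo` instead of calling `dig`, so it treats any resolution failure (not only NXDOMAIN) as "does not exist".

```python
import socket


def resolves(hostname, resolver=socket.getaddrinfo):
    """Return True if the resolver returns at least one address for hostname.

    `resolver` is injectable (an assumption of this sketch) so the logic
    can be exercised without network access.
    """
    try:
        return len(resolver(hostname, None)) > 0
    except socket.gaierror:
        # Raised by getaddrinfo when the name cannot be resolved.
        return False


def www_variant_if_live(domain, resolver=socket.getaddrinfo):
    """Return 'www.<domain>' only if that name resolves, otherwise None."""
    candidate = "www." + domain
    return candidate if resolves(candidate, resolver) else None
```

Calling `www_variant_if_live("example.com")` would then yield the `www.` entry only when a lookup succeeds, which mirrors the "if true add www." step of the one-liner.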

funilrys commented 4 years ago

@spirillen our infrastructure already handles similar logic.

We generate them all (with and without www.) for each domain (not sub-domain), send them to PyFunceble for testing, and distribute the results.

It has been working properly for months 😁 Note that we test both: www.example.com if example.com is given, and vice-versa. 😁

If you use PyFunceble from the CLI, `--complement` does that. Here we don't use the CLI but the API, which gives us a lot of flexibility, but the result is the same.
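The "with and without www." generation described here could look roughly like the following. It is a minimal sketch under stated assumptions, not the infrastructure's actual code: `normalize`, `complement`, and `with_complements` are hypothetical names, and the "domain, not sub-domain" check is left out because doing it correctly needs a public suffix list.

```python
def normalize(hostname):
    """Lower-case the name and drop surrounding whitespace and a trailing dot."""
    return hostname.strip().lower().rstrip(".")


def complement(hostname):
    """Return the www. counterpart of a name, or the bare name if it has www."""
    name = normalize(hostname)
    if name.startswith("www."):
        return name[len("www."):]
    return "www." + name


def with_complements(hostnames):
    """Return every input name plus its complement, in order, without duplicates."""
    seen = set()
    result = []
    for raw in hostnames:
        name = normalize(raw)
        for candidate in (name, complement(name)):
            if candidate not in seen:
                seen.add(candidate)
                result.append(candidate)
    return result
```

Each name in the resulting list would then be handed to the tester, and only the ones found active would be distributed.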