Ultimate-Hosts-Blacklist / dev-center

The place to talk about our infrastructure or everything related to the Ultimate Hosts Blacklist project.
MIT License

whitelist evaluation #15

Closed dnmTX closed 5 years ago

dnmTX commented 5 years ago

@lightswitch05 @smed79 @funilrys @anudeepND @quidsup

As I'm a little hesitant to start removing domains I have no prior knowledge of, I decided to compare each entry from the whitelist against various lists to see if there are any matches. Lists that I used for the comparison: @lightswitch05 @anudeepND @StevenBlack @justdomains @quidsup

I'll be listing here the whitelist entry, the matching domains that were found, and the lists they were found in. Since the whitelist is kind of big (1000+), I'll be doing it in alphabetical order; once we're done with one letter, I'll move to the next one down. The point of this is for everyone here to review it and either agree or disagree before it gets removed. Unless commenting is necessary, use thumbs UP or DOWN just to keep it cleaner here. @funilrys you get the honor of removing or leaving the entries; once confirmed by you, I'll move to the next one. Thank you all for participating. Here you go:

A

ALL .digitaltrends.com
0.0.0.0 vstats.digitaltrends.com -stevenblack

ALL .gov
-complicated/work in progress

ALL .twimg.com
oem.twimg.com #Twitter possible Windows spying -quidsup

ALL .ubuntu.com
popcon.ubuntu.com #Ubuntu unapproved tracking -quidsup
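The comparison described above was done by hand. As a rough sketch only, it could be scripted along these lines. The function names and the in-memory sample data are hypothetical, and the matching rule (a blocked domain equals the whitelist entry or is a subdomain of it) is an assumption about what counts as a "match", not something the project defines.

```python
def parse_blocklist(lines):
    """Extract domains from hosts-style ("0.0.0.0 domain") or plain-domain lines."""
    domains = set()
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        # In hosts format the domain is the last field; plain lists have only the domain.
        domains.add(line.split()[-1].lower())
    return domains


def find_matches(whitelist_entry, blocklists):
    """Return {list_name: sorted matching domains} for one whitelist entry.

    Assumption: a blocked domain matches if it equals the entry
    or is a subdomain of it.
    """
    entry = whitelist_entry.lower().lstrip(".")
    matches = {}
    for name, domains in blocklists.items():
        hits = sorted(d for d in domains if d == entry or d.endswith("." + entry))
        if hits:
            matches[name] = hits
    return matches


# Illustrative sample data only, loosely based on the entries listed above.
blocklists = {
    "stevenblack": parse_blocklist(["0.0.0.0 vstats.digitaltrends.com", "0.0.0.0 oas.monster.com"]),
    "quidsup": parse_blocklist(["oem.twimg.com #Twitter possible Windows spying", "popcon.ubuntu.com"]),
}
print(find_matches("digitaltrends.com", blocklists))
```

With the sample data, only the `stevenblack` list reports a hit for `digitaltrends.com`, mirroring the first entry under "A".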

lightswitch05 commented 5 years ago

The point of this is everyone here to review it and either agree or disagree before it gets removed.

Just for clarity, this is a list of domains that are currently in the whitelist, and this review is for removing them from the whitelist. Right?

funilrys commented 5 years ago

Okay let's go, but first let me reexplain:

ALL .gov == .*\.gov$

that means that we match everything which ends with .gov :wink: So for example cmicapui.ce.gov.br is not excluded/whitelisted with ALL .gov :smile_cat:
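To make the `ALL .gov` rule concrete, here is a small check using the regular expression quoted above. The helper name `is_whitelisted` is made up for illustration; how the project's tooling actually applies the pattern is not shown in this thread.

```python
import re

# Pattern quoted above as the meaning of "ALL .gov"
ALL_GOV = re.compile(r".*\.gov$")


def is_whitelisted(domain):
    """True if the domain ends with ".gov" (hypothetical helper)."""
    return ALL_GOV.match(domain) is not None


print(is_whitelisted("example.gov"))         # ends with .gov
print(is_whitelisted("cmicapui.ce.gov.br"))  # ends with .br, so not covered
```

The second domain contains `.gov` but does not end with it, which is why the `$` anchor keeps it out of the whitelist.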

@lightswitch05 Yes this is a review of the whitelist :+1:

Side Note

If a domain is deleted, I think that it may be better to remove it from the core whitelist list and add it into a new (or an existent) category of https://github.com/Ultimate-Hosts-Blacklist/dev-center/tree/whitelisting/data

Keep up the good work :+1: Cheers, Nissar

dnmTX commented 5 years ago

OK, I made some adjustments to my first post. @funilrys

In the future we could then update that line with all subdomains which do not have to be whitelisted with something like REG ^(?!..?(vstats|subdomaintoexclude))..digitaltrends.com$ :+1:

This should apply to ALL .twimg.com and ALL .ubuntu.com as well.
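The quoted REG pattern looks like it lost some characters when the page was rendered, so the sketch below uses an assumed reconstruction with a negative lookahead. The exact pattern, the helper name, and the `subdomaintoexclude` placeholder are illustrative only and may differ from what was originally written.

```python
import re

# Assumed reconstruction: whitelist every subdomain of digitaltrends.com
# except names containing one of the excluded labels.
PATTERN = re.compile(r"^(?!.*\.?(vstats|subdomaintoexclude)).*\.digitaltrends\.com$")


def is_whitelisted(domain):
    """True if the domain is covered by the (assumed) whitelist rule."""
    return PATTERN.match(domain) is not None


print(is_whitelisted("www.digitaltrends.com"))
print(is_whitelisted("vstats.digitaltrends.com"))
```

Note that this lookahead rejects any name that contains `vstats` anywhere, not only the exact `vstats` label, so a stricter pattern may be preferable in practice.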

P.S. Remember, once you make the changes, let me know so I can move to the next letter down.

dnmTX commented 5 years ago

@lightswitch05

Just for clarity, this is a list of domains that are currently in the whitelist, and this review is for removing them from the whitelist. Right?

Yes, the whitelist is way too aggressive at the moment, so I'm just making it easier for us to see which ones need permanent removal.

funilrys commented 5 years ago

@dnmTX All checked in the list (on my previous comment) let's move on :+1:

funilrys commented 5 years ago

@lightswitch05 Actually the elements which were on the whitelist list were all completely bold lines as you can see in my previously linked commits.

dnmTX commented 5 years ago

B b.scorecardresearch.com

funilrys commented 5 years ago

Note: I edited your comment so that I can have checkboxes

funilrys commented 5 years ago

@dnmTX All :heavy_check_mark:, let's move on :+1:

dnmTX commented 5 years ago

C c.msn.com

...to be continued (first thing tomorrow)

dnmTX commented 5 years ago

@funilrys if a whitelist entry is, for example, cnn.com, is that an exact match or will it also match sub-domains, basically anything in front like mms.cnn.com? Just to narrow my search better and not waste time. P.S. Please don't get too technical, a basic explanation will suffice. Thanks.

funilrys commented 5 years ago

@dnmTX exactmatch.com and www.exactmatch.com.
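In other words, a plain entry covers the domain itself and its `www.` form, nothing else. A minimal sketch of that rule follows; the function name is hypothetical and this is not the project's actual code.

```python
def plain_entry_matches(entry, domain):
    """A plain whitelist entry matches only the domain itself and its www. form."""
    entry, domain = entry.lower(), domain.lower()
    return domain == entry or domain == "www." + entry


print(plain_entry_matches("cnn.com", "cnn.com"))
print(plain_entry_matches("cnn.com", "www.cnn.com"))
print(plain_entry_matches("cnn.com", "mms.cnn.com"))
```

So an entry such as cnn.com would not cover mms.cnn.com; other subdomains need an `ALL` or `REG` rule.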

funilrys commented 5 years ago

@dnmTX All :heavy_check_mark:, let's move on :+1:

dnmTX commented 5 years ago

D directvapplications.hb.omtrdc.net

funilrys commented 5 years ago

Let's ask for @xxcriticxx's opinion on that last one, express.co.uk, as it was his request: mitchellkrogza/Ultimate.Hosts.Blacklist#424 :+1:

@dnmTX Let's move on, I'll check that later. I'm not against the "fake news" extension of Steve but it is really based on opinion sometimes ...

dnmTX commented 5 years ago

G g.msn.com

dnmTX commented 5 years ago

H

hbogo.com
0.0.0.0 hbogo.com.112.207.net -lightswitch (just in case, could be a match)

I

ipinfo.io
ipinfo.io #IPinfo -quidsup

funilrys commented 5 years ago

:heavy_check_mark: Let's move on :+1:

dnmTX commented 5 years ago

@funilrys what about gekko.spiceworks.com?

funilrys commented 5 years ago

I'm not removing ipinfo.io. Let's move on to next letter :+1:

dnmTX commented 5 years ago

L l.betrad.com

funilrys commented 5 years ago

Some hosts files have a line like

127.0.0.1 localhost

that line removes it as we generate our own "header" :smile_cat:

funilrys commented 5 years ago

@dnmTX :heavy_check_mark:

dnmTX commented 5 years ago

Doing it by hand, that's why it's slow, but I don't want to miss something.

dnmTX commented 5 years ago

M

metrics.plex.tv
metrics.plex.tv #PleX -quidsup

funilrys commented 5 years ago

:heavy_check_mark:

dnmTX commented 5 years ago

Special request about the openload.co domain(s): if you can, expand the part after the dot to cover more possibilities. Any two-letter ending should be considered.
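One way to express this request is a REG-style rule that accepts any two-letter ending. The pattern below is only a suggestion under that assumption; it is not a rule taken from the whitelist, and the helper name is made up.

```python
import re

# Hypothetical rule: "openload." followed by exactly two lowercase letters
OPENLOAD = re.compile(r"^openload\.[a-z]{2}$")


def is_openload_variant(domain):
    """True for openload.<two letters>, e.g. openload.co (illustrative helper)."""
    return OPENLOAD.match(domain.lower()) is not None


print(is_openload_variant("openload.co"))
print(is_openload_variant("openload.pw"))
print(is_openload_variant("openload.com"))
```

Three-letter endings and multi-part endings such as `openload.co.uk` are deliberately not covered by this pattern.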

dnmTX commented 5 years ago

O

oas.monster.com
0.0.0.0 oas.monster.com -stevenblack

funilrys commented 5 years ago

@dnmTX :heavy_check_mark:

dnmTX commented 5 years ago

P pixel.facebook.com

dnmTX commented 5 years ago

@funilrys I'd need more info on REG ^ebay.(?:[a-z\.]+)$ to know what exactly is covered by it.

funilrys commented 5 years ago

@dnmTX :heavy_check_mark:

@dnmTX ==> All ebay.xyz domains :)
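To see what the ebay rule covers, the pattern can be tried against a few sample domains. The sketch uses the pattern exactly as it is displayed in this thread (rendering may have dropped a backslash from the original), and the sample domains are made up for illustration.

```python
import re

# Pattern as displayed in the thread; the dot after "ebay" is unescaped here.
EBAY = re.compile(r"^ebay.(?:[a-z\.]+)$")


def covered(domain):
    """True if the domain matches the ebay rule as displayed (illustrative helper)."""
    return EBAY.match(domain) is not None


for d in ["ebay.com", "ebay.co.uk", "www.ebay.com", "ebayx.com"]:
    print(d, covered(d))
```

As displayed, the rule covers ebay.com and ebay.co.uk but not www.ebay.com. Because the dot after `ebay` is unescaped, it also matches names like ebayx.com, which is probably broader than "all ebay.xyz domains" was meant to be.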

dnmTX commented 5 years ago

R rad.msn.com

dnmTX commented 5 years ago

@funilrys not much left and I'll finish them all tomorrow. When you have time, just make the changes. :+1:

funilrys commented 5 years ago

@dnmTX :heavy_check_mark:

dnmTX commented 5 years ago

Please report to mitchellkrogza/Ultimate.Hosts.Blacklist#400 for syndication.twitter.com

@lightswitch05 you have it in your lists. Does it really break the player?

dnmTX commented 5 years ago

V

v10.vortex-win.data.microsoft.com
0.0.0.0 v10.vortex-win.data.microsoft.com -anudeep,stevenblack

video-stats.l.google.com
-very suspicious (not present anywhere)

W

webcache.googleusercontent.com
0.0.0.0 webcache.googleusercontent.com.pathful.com -lightswitch (possible match)

That's it. Done and done :smile: I'll be monitoring the whitelisted.list in the future and if anything comes up I'll report it here :+1:

lightswitch05 commented 5 years ago

I use the Twitter website daily with syndication.twitter.com blocked and I have never had any issues. However, I don't ever log into my account; I just browse a couple of public profiles. I also don't use the app, so maybe I'm not an average user, but blocking this doesn't affect me.

dnmTX commented 5 years ago

Thanks @lightswitch05 . @funilrys :point_up:

funilrys commented 5 years ago

@dnmTX @lightswitch05 @smed79 Thanks for the inputs :+1:

Keep up the good work :rocket: :tada:

Cheers, Nissar

dnmTX commented 5 years ago

:hugs:

dnmTX commented 5 years ago

@funilrys I understand that all the changes we made here are not merged yet, but I did check the whitelisted.list in justdomains_ to see what was removed from the clean.list and these are the results:

google.maniyakat.cn
googlecentreservices.rockhillrealtytx.com
googledrivedocument.beechdrift.co.uk
googlegetmyphotos.pythonanywhere.com
googlegetmysyncphotos.pythonanywhere.com
googlegetphotos.pythonanywhere.com

None of those appear to be in the whitelist and, you know, they're all malicious and should be blocked. Please inspect. Thank you.

P.S. It looks to me like everything that starts with google... has been whitelisted.
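A check like this one could be automated by testing every removed domain against the whitelist rules and flagging the ones no rule explains. The sketch below assumes the three rule types discussed in this thread (`ALL`, `REG`, and plain entries) with the semantics explained earlier; the function names and the sample rule set are hypothetical and not the project's real whitelist or code.

```python
import re


def rule_to_regex(rule):
    """Translate one whitelist rule into a compiled regex (assumed semantics)."""
    rule = rule.strip()
    if rule.startswith("ALL "):
        # ALL .gov -> anything ending with .gov
        return re.compile(r".*" + re.escape(rule[4:].strip()) + r"$")
    if rule.startswith("REG "):
        # REG <pattern> -> the pattern as written
        return re.compile(rule[4:].strip())
    # Plain entry -> the domain itself and its www. form
    return re.compile(r"^(?:www\.)?" + re.escape(rule) + r"$")


def unexplained(removed_domains, rules):
    """Return removed domains that no whitelist rule accounts for."""
    compiled = [rule_to_regex(r) for r in rules]
    return [d for d in removed_domains if not any(c.search(d) for c in compiled)]


# Hypothetical rule set and sample of removed domains
rules = ["ALL .gov", "google.com", r"REG ^ebay.(?:[a-z\.]+)$"]
removed = ["www.google.com", "google.maniyakat.cn",
           "googlegetphotos.pythonanywhere.com", "usa.gov"]
print(unexplained(removed, rules))
```

Any domain printed here was removed without a matching rule, which is the symptom reported above for the google... domains.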

dnmTX commented 5 years ago

Just FYI: once the changes get merged, and since this thread has become very long, I'll open a new issue as a continuation of this one to report any further changes/adjustments that need to be made.

funilrys commented 5 years ago

@dnmTX Fixed. From now on, every new generation of the whitelisted.list will do things the right way.

dnmTX commented 5 years ago

@funilrys there is something wrong in the lightswitch05 repo since the last filtering:

INVALID\hosts -duplicates
INACTIVE\hosts -lots and lots of duplicates...

...not sure what else, but right now it's a mess.

funilrys commented 5 years ago

Same at @dead-hosts? If not, then it's because of Travis :thinking:

Will check that later.

dnmTX commented 5 years ago

Well... at @dead-hosts it's in filtering, so I don't know. I'm kind of focused on things here at the moment. I haven't even visited there lately.