StevenBlack / hosts

🔒 Consolidating and extending hosts files from several well-curated sources. Optionally pick extensions for porn, social media, and other categories.
MIT License
26.58k stars 2.21k forks

[tyzbit list] dead domains #537

Closed dnmTX closed 6 years ago

dnmTX commented 6 years ago

Since the list is short and has not been updated for a long time, I ran nslookup on each domain and pinged the resulting IPs (if that is not enough to curate it, let me know). The results show that most of the domains are dead. Before anything is removed, could somebody please double-check my findings? This is the list of the domains that are still ACTIVE; if I'm right, the rest can be removed to reduce clutter:

choice.microsoft.com cs1.wpc.v0cdn.net i1.services.social.microsoft.com pre.footprintpredict.com redir.metaservices.microsoft.com

CC @StevenBlack
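For anyone who wants to repeat this kind of check, here is a minimal sketch of the DNS half of it in Python. It only approximates the nslookup step (it does not ping anything), and a domain that resolves is not necessarily serving traffic. The function names `resolves` and `split_active` are made up for illustration and are not part of this repo.

```python
import socket
from typing import Callable, Iterable, List, Tuple


def resolves(domain: str) -> bool:
    """Return True if the system resolver returns at least one address."""
    try:
        return len(socket.getaddrinfo(domain, None)) > 0
    except (socket.gaierror, UnicodeError):
        # gaierror: the name did not resolve; UnicodeError: malformed name.
        return False


def split_active(
    domains: Iterable[str],
    check: Callable[[str], bool] = resolves,
) -> Tuple[List[str], List[str]]:
    """Partition domains into (active, dead) according to `check`."""
    active: List[str] = []
    dead: List[str] = []
    for domain in domains:
        (active if check(domain) else dead).append(domain)
    return active, dead
```

Calling `split_active(["choice.microsoft.com", "pre.footprintpredict.com"])` would use the local resolver, so the result depends on where and when it is run. Passing a different `check` makes it possible to test against a fixed answer or another resolver.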

welcome[bot] commented 6 years ago

Hello! Thank you for opening your first issue in this repo. It's people like you who make these host files better!

StevenBlack commented 6 years ago

Thank you @dnmTX. Bringing this to my attention is valuable.

I agree that this resource isn't actively curated anymore. @tyzbit do you plan on maintaining this list of domains?

tyzbit commented 6 years ago

No, I got tired of keeping up with Windows 10's constant cat-and-mouse game against its users and switched to Mint. The source I originally got the domains from is the Debloat Windows 10 code, which was updated in the last 10 months.

Personally, unless there's some negative impact I'd be for keeping the domains just in case.

Perhaps separate from this specific issue, my list can be updated and replaced using the domains in the repo I linked. I do recall an issue with Skype or something being blocked by the more aggressive "#extra" block, so some care would be needed.

funilrys commented 6 years ago

Dead domains can be found at https://github.com/dead-hosts/hosts_git_tyzbit/tree/master/output/domains 😉

dnmTX commented 6 years ago

OK, I made a list with only the active ones from the repo that @tyzbit linked. It looks like whoever made that script gave up too, so maybe a separate extension would be better, so that it can at least be revisited down the road and dead domains removed again. Up to you, @StevenBlack. Microsoft.txt

funilrys commented 6 years ago

Nice @dnmTX :+1:

The advantage of https://github.com/dead-hosts and https://github.com/funilrys/PyFunceble, in general, is that we continually test a list, which means we keep track of dead domains and retest them over time. This way, if one becomes active again, we move it into the active list :wink:

dnmTX commented 6 years ago

@funilrys I'm just trying to participate despite my limited knowledge, but @tyzbit's list hadn't been updated for a very long time and I had some extra time on my hands. I'm counting on you guys to review my work and make the final decision. I still use Windows, so it matters to me that this list is up to date.

funilrys commented 6 years ago

No problem, what you're doing is still great :+1:

Keep in mind that you can create and distribute a list of ACTIVE domains like we (with @mitchellkrogza) actually do on BaddBoyzHosts, but on the other hand, you should always fetch and format the latest upstream source and then test it, as it can be updated.

Also, I took a look at https://github.com/W4RH4WK/Debloat-Windows-10/blob/master/scripts/block-telemetry.ps1 and it seems like it has not been updated since 2 Oct 2017. How can it be more accurate than @tyzbit's list then? :thinking:

funilrys commented 6 years ago

And if we look at when a domain was last introduced, that was at https://github.com/W4RH4WK/Debloat-Windows-10/commit/9ed991fc61fbe55b10664c55c5288da985a49304#diff-fd53e923fd22cb902f42efe43bde69a6 on 15 May 2017, which gives us almost a year :thinking: It's up to you, Steven @StevenBlack, but just saying ...

dnmTX commented 6 years ago

Well, going by what @tyzbit said, he originally made his list from the script (a long time ago, I guess) and the script's list was updated over time, because if you compare the two, the script's list contains more entries. Also, when I ran nslookup, most of them pointed to Microsoft anyway. I can't say that it's super accurate, but at least it's something, and the fact that there are still active domains on that list means that we still get some protection. P.S. It's probably not a bad idea to keep at least the active ones for now, I guess.

funilrys commented 6 years ago

Okay, then @tyzbit or @StevenBlack can update that list by removing all the inactive domains listed at https://github.com/dead-hosts/hosts_git_tyzbit/blob/master/output/domains/INACTIVE/list or somewhere else ...
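As a rough illustration of that removal step, the sketch below drops hosts-file entries whose domain appears in a set of inactive domains and keeps comments and everything else untouched. It assumes one hostname per line in the usual `0.0.0.0 example.com` form; `drop_inactive` is a hypothetical helper, not something from this repo or from PyFunceble.

```python
from typing import Set


def drop_inactive(hosts_text: str, inactive: Set[str]) -> str:
    """Return hosts_text without the entries whose domain is in `inactive`."""
    kept = []
    for line in hosts_text.splitlines():
        # Ignore any trailing comment, then split "IP domain" into fields.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[1].lower() in inactive:
            continue  # inactive entry: skip it
        kept.append(line)  # comments, blank lines and live entries stay
    return "\n".join(kept)
```

The `inactive` set would be loaded from a plain list of domains, one per line, such as the INACTIVE list linked above.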

tyzbit commented 6 years ago

Actually, while we're thinking about it, storing my list where it is wasn't a great design in the first place. I'd be open to making it a small gist or something, handing it off to someone else and, at the same time, updating the list.

dnmTX commented 6 years ago

For reference, I found this list ( https://github.com/crazy-max/WindowsSpyBlocker/blob/master/data/hosts/win10/spy.txt ). It is much bigger than the two mentioned above and it says it was updated about 10 days ago. As I suspected, most of the domains are inactive (I ran nslookup on most of them to be sure). I'm not sure adding such a big list with so many inactive domains is a good idea, so I'll leave it to you guys to decide. Either way, it's a start towards improving @tyzbit's list.

tyzbit commented 6 years ago

I took the content of both lists (WindowsSpyBlocker and Debloat Windows 10), merged and deduped them, and checked each domain against Google and OpenDNS to get a list of currently active domains. I put the result at this gist and in theory you should be able to use https://gist.githubusercontent.com/tyzbit/83b3297cb8ce1e5af4c90d232b1f5886/raw to source it into the repo.
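The merge, dedupe and filter steps described here could be sketched roughly as follows. This is not the script that produced the gist. The resolver checks are passed in as plain callables, and treating a domain as active when any one of them answers is an assumption, since the comment does not say how the Google and OpenDNS results were combined. All function names are hypothetical.

```python
from typing import Callable, Iterable, List


def parse_domains(hosts_text: str) -> List[str]:
    """Extract the domain from each 'IP domain' line, skipping comments."""
    domains = []
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2:
            domains.append(fields[1].lower())
    return domains


def merge_dedupe(*lists: Iterable[str]) -> List[str]:
    """Concatenate the lists, keeping the first occurrence of each domain."""
    seen = set()
    merged = []
    for domains in lists:
        for domain in domains:
            if domain not in seen:
                seen.add(domain)
                merged.append(domain)
    return merged


def keep_active(
    domains: Iterable[str],
    checks: Iterable[Callable[[str], bool]],
) -> List[str]:
    """Keep domains for which at least one resolver check succeeds (assumed rule)."""
    checks = list(checks)
    return [d for d in domains if any(check(d) for check in checks)]
```

Each entry in `checks` would wrap a lookup against one resolver. The standard library cannot target a specific DNS server directly, so a real run would need a DNS library or an external tool for that part.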

Notes: I didn't include the "#extra" block from Debloat Windows 10, and I can't test these domains for compatibility since I don't have a Windows 10 box available. It looks like the resulting list also doesn't include any hosts in the dead hosts list. I did confirm that there are no hosts in my previous list that are still active but not included, so this does not remove any blocking besides inactive hosts.
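The two checks in these notes are set operations, and a sketch of them might look like the following. It assumes the dead-hosts list classifies every domain in the previous list, so that "not known dead" can stand in for "still active"; `coverage_report` and its keys are names made up for this example.

```python
from typing import Dict, Iterable, List


def coverage_report(
    previous: Iterable[str],
    new: Iterable[str],
    dead: Iterable[str],
) -> Dict[str, List[str]]:
    """Compare an old and a new blocklist against a list of known-dead domains."""
    previous_set, new_set, dead_set = set(previous), set(new), set(dead)
    return {
        # Domains in the new list that are known to be dead.
        "dead_in_new": sorted(new_set & dead_set),
        # Domains dropped from the old list that are not known to be dead.
        "dropped_but_alive": sorted((previous_set - new_set) - dead_set),
    }
```

If both lists in the report come back empty, the new list contains no known-dead hosts and loses no blocking other than inactive hosts, which is what the notes above describe.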

Can someone on Windows 10 test that this new hosts list doesn't break anything like Skype and Windows Update? Refer to #104 #145 for discussions about possible issues with my previous list.

StevenBlack commented 6 years ago

Thanks @tyzbit !

dnmTX commented 6 years ago

@tyzbit WindowsSpyBlocker has a separate list for blocking Windows Update ( https://github.com/crazy-max/WindowsSpyBlocker/blob/master/data/hosts/win10/update.txt ), so I think there shouldn't be any problems there. As for the rest, if there are any issues, the problematic domains can always be removed at a later date. Thanks for making the list.

StevenBlack commented 6 years ago

I'm really skittish about blocking anything from windowsupdate.com. This would rule out WindowsSpyBlocker as a source of hosts.

Here's what I'm thinking: add @tyzbit's Gist as a source now, and see what hackles this raises. In some sense, given all the platforms we support, and that apps like Skype support, the only way to find out is to try, and see if anyone complains.

dnmTX commented 6 years ago

@StevenBlack maybe you misunderstood my last post, but I meant the same thing (I was just reassuring @tyzbit that there shouldn't be any problems using Windows Update with his list). I'm a Windows user, and Windows Update is needed here and shouldn't be blocked in general. Yeah, let's give it a try and see what's what.

StevenBlack commented 6 years ago

Hmmm. One downside: Gist links to 'raw' reference commit hashes explicitly.

I'm going to go with this now, but this needs a better source URL solution soon... cc @tyzbit

StevenBlack commented 6 years ago

OK, live and online in Release 1.3.1.

Thanks for the input, everybody!

tyzbit commented 6 years ago

I was hoping https://gist.githubusercontent.com/tyzbit/83b3297cb8ce1e5af4c90d232b1f5886/raw would always point to master, since gists are just git repos with long names. A URL that references a specific commit hash would be https://gist.githubusercontent.com/tyzbit/83b3297cb8ce1e5af4c90d232b1f5886/raw/8907b7eb2645e34f29c81060228785c205e8b9e9/hosts, which is the current one. If the gist gets updated, my expectation is that the first URL should be updated. Is that not the case?

I'm aware that GitHub does do aggressive caching, though.
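To make the two URL shapes concrete, here is a small sketch that builds both from their parts. The user, gist ID, commit hash and filename are the ones quoted above; `gist_raw_url` is a hypothetical helper. Whether the unpinned form always serves the latest revision is the expectation stated in this comment (caching aside), not something the code can guarantee.

```python
from typing import Optional


def gist_raw_url(
    user: str,
    gist_id: str,
    commit: Optional[str] = None,
    filename: Optional[str] = None,
) -> str:
    """Build a raw gist URL, optionally pinned to a commit and/or a filename."""
    url = f"https://gist.githubusercontent.com/{user}/{gist_id}/raw"
    if commit:
        url += f"/{commit}"  # pins the URL to one specific revision
    if filename:
        url += f"/{filename}"
    return url


GIST_ID = "83b3297cb8ce1e5af4c90d232b1f5886"
unpinned = gist_raw_url("tyzbit", GIST_ID)
pinned = gist_raw_url(
    "tyzbit", GIST_ID, "8907b7eb2645e34f29c81060228785c205e8b9e9", "hosts"
)
```

Here `unpinned` is the short form intended as the source URL, and `pinned` is the commit-specific form that would need editing after every gist update.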

StevenBlack commented 6 years ago

I just tested this. Yup, you're right; the part including .../raw/ doesn't mutate. So this is fine.

StevenBlack commented 6 years ago

Closing this now. We'll revisit if anyone complains 😃