StevenBlack / hosts

🔒 Consolidating and extending hosts files from several well-curated sources. Optionally pick extensions for porn, social media, and other categories.
MIT License

List of Windows 10 tracking / telemetry / ads hosts? #155

Closed Gitoffthelawn closed 7 years ago

Gitoffthelawn commented 7 years ago

I want to do some research into this whole Windows 10 tracking / telemetry / ads fiasco.

To give me a head start, is there a list of known hosts used for those purposes?

IOW, not a list of hosts used by every advertising / tracking provider in the world: just a specific list of hosts used by Windows 10 for tracking / telemetry / ads.

berrythesoftwarecodeprogrammar commented 7 years ago

https://github.com/StevenBlack/hosts/issues/154#issuecomment-236422378

berrythesoftwarecodeprogrammar commented 7 years ago

I also just came across this, so I'm going to try it: https://www.safer-networking.org/spybot-anti-beacon/

berrythesoftwarecodeprogrammar commented 7 years ago

Seems like a good program; it added some extra protection. The only extra host it added was:

0.0.0.0 choice.microsoft.com.nstac.net

Gitoffthelawn commented 7 years ago

@berrythesoftwarecodeprogrammar Thanks so much Berry. Are there any curated lists that are being kept updated for this purpose? With the forced-updates design of Windows 10, MS can add/modify which hosts all Windows 10 systems connect to at any time (or even have some systems connect to certain hosts, while other systems connect to different hosts).

For the spybot anti beacon software you mentioned, did it just include that one host, or was that just the only host that was not included in this repo?

berrythesoftwarecodeprogrammar commented 7 years ago

@Gitoffthelawn That was just the only one not included in the other list I pasted, which was a list I think I extracted from the Destroy Windows Spying program, or something similar.

berrythesoftwarecodeprogrammar commented 7 years ago

There are just a variety of programs which aim to stop windows spying, especially W10 related stuff. A bunch of them bundle hosts entries with them and I put together the ones which I used. I trust the Spybot team, since I've used their software for a long time and I think they would have the most reliable and up to date list of hosts.

berrythesoftwarecodeprogrammar commented 7 years ago

Here are the previous links and more, for anyone who hasn't seen the other thread:

https://www.privacytools.io/#win10 -- Links to various tools and information about Windows 10 spying
https://fix10.isleaked.com/ -- Guide to disabling most of the bad features in Windows 10
https://fix10.isleaked.com/oldwindows.html -- Guide to removing bad updates from Windows 7/8
https://www.safer-networking.org/spybot-anti-beacon/ -- Software to stop telemetry in Windows 7/8/10
http://dws.wzor.net/ -- Software to stop and/or remove unwanted features of Windows 7/8/10
http://ultimateoutsider.com/downloads/ -- GWX Control Panel; Prevent Windows 7/8 from updating to 10

I use all of the software above since they each have their own special features.

berrythesoftwarecodeprogrammar commented 7 years ago

I use the tyzbit hosts file and I have these Microsoft/Windows related entries in my myhosts file:

0.0.0.0 a-0001.a-msedge.net
0.0.0.0 a.ads1.msn.com
0.0.0.0 a.ads2.msn.com
0.0.0.0 ad.doubleclick.net
0.0.0.0 adnexus.net
0.0.0.0 adnxs.com
0.0.0.0 ads.msn.com
0.0.0.0 ads1.msads.net
0.0.0.0 ads1.msn.com
0.0.0.0 az361816.vo.msecnd.net
0.0.0.0 az512334.vo.msecnd.net
0.0.0.0 ca.telemetry.microsoft.com
0.0.0.0 cache.datamart.windows.com
0.0.0.0 choice.microsoft.com
0.0.0.0 choice.microsoft.com.nsatc.net
0.0.0.0 choice.microsoft.com.nstac.net
0.0.0.0 compatexchange.cloudapp.net
0.0.0.0 corp.sts.microsoft.com
0.0.0.0 corpext.msitadfs.glbdns2.microsoft.com
0.0.0.0 cs1.wpc.v0cdn.net
0.0.0.0 db3wns2011111.wns.windows.com
0.0.0.0 df.telemetry.microsoft.com
0.0.0.0 diagnostics.support.microsoft.com
0.0.0.0 fe2.update.microsoft.com.akadns.net
0.0.0.0 fe3.delivery.dsp.mp.microsoft.com.nsatc.net
0.0.0.0 feedback.microsoft-hohm.com
0.0.0.0 feedback.search.microsoft.com
0.0.0.0 feedback.windows.com
0.0.0.0 i1.services.social.microsoft.com
0.0.0.0 i1.services.social.microsoft.com.nsatc.net
0.0.0.0 msnbot-207-46-194-33.search.msn.com
0.0.0.0 oca.telemetry.microsoft.com
0.0.0.0 oca.telemetry.microsoft.com.nsatc.net
0.0.0.0 pre.footprintpredict.com
0.0.0.0 preview.msn.com
0.0.0.0 rad.msn.com
0.0.0.0 redir.metaservices.microsoft.com
0.0.0.0 reports.wes.df.telemetry.microsoft.com
0.0.0.0 s0.2mdn.net
0.0.0.0 services.wes.df.telemetry.microsoft.com
0.0.0.0 settings-sandbox.data.microsoft.com
0.0.0.0 settings-win.data.microsoft.com
0.0.0.0 settings.data.microsof.com
0.0.0.0 sls.update.microsoft.com.akadns.net
0.0.0.0 spynet2.microsoft.com
0.0.0.0 spynetalt.microsoft.com
0.0.0.0 sqm.df.telemetry.microsoft.com
0.0.0.0 sqm.telemetry.microsoft.com
0.0.0.0 sqm.telemetry.microsoft.com.nsatc.net
0.0.0.0 ssw.live.com
0.0.0.0 statsfe1.ws.microsoft.com
0.0.0.0 statsfe2.update.microsoft.com.akadns.net
0.0.0.0 statsfe2.ws.microsoft.com
0.0.0.0 survey.watson.microsoft.com
0.0.0.0 telecommand.telemetry.microsoft.com
0.0.0.0 telecommand.telemetry.microsoft.com.nsatc.net
0.0.0.0 telemetry.appex.bing.net
0.0.0.0 telemetry.microsoft.com
0.0.0.0 telemetry.urs.microsoft.com
0.0.0.0 v10.vortex-win.data.microsoft.com
0.0.0.0 view.atdmt.com
0.0.0.0 vortex-sandbox.data.microsoft.com
0.0.0.0 vortex-win.data.microsoft.com
0.0.0.0 vortex.data.microsoft.com
0.0.0.0 watson.live.com
0.0.0.0 watson.microsoft.com
0.0.0.0 watson.ppe.telemetry.microsoft.com
0.0.0.0 watson.telemetry.microsoft.com
0.0.0.0 watson.telemetry.microsoft.com.nsatc.net
0.0.0.0 wes.df.telemetry.microsoft.com
0.0.0.0 win10.ipv6.microsoft.com

(Extracted from the hosts files which those programs install, since I use this script to create my hosts file and don't want to have to run external programs every time I update my hosts)
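For anyone who wants to pull entries like these out of a larger hosts file, here is a minimal Python sketch. It assumes the usual `<ip> <hostname>` layout with optional `#` comments. The function names and the keyword list are hypothetical choices for illustration, not part of this repo, and keyword matching is only a rough heuristic, not a definition of what counts as telemetry.

```python
def parse_hosts(text):
    """Return the hostnames from hosts-format text, skipping comments and blanks."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop full-line and trailing comments
        if not line:
            continue
        parts = line.split()
        # A hosts entry is "<ip> <hostname> [aliases...]"; keep the names only.
        if len(parts) >= 2:
            hosts.extend(parts[1:])
    return hosts


def only_matching(hosts, keywords=("telemetry", "watson", "vortex")):
    """Keep hostnames containing any of the (hypothetical) keywords."""
    return [h for h in hosts if any(k in h for k in keywords)]


sample = """\
0.0.0.0 ads.msn.com
0.0.0.0 oca.telemetry.microsoft.com  # example trailing comment
# a full-line comment
0.0.0.0 vortex.data.microsoft.com
"""

hosts = parse_hosts(sample)
print(only_matching(hosts))
```

The sample text is a three-entry excerpt of the list above; in practice you would read the text from your own hosts file instead.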

StevenBlack commented 7 years ago

Well, we already have someonewhocares.org.

Gitoffthelawn commented 7 years ago

@CHEF-KOCH Thanks, but that list contains much more than just MS Win10 stuff... well, at least I hope so! :wink:

ghost commented 7 years ago

Domains added by DisableWinTracking (the most popular Win 10 anti-spying script back then) prevented the store app from downloading updates, potentially no longer the case: https://github.com/10se1ucgo/DisableWinTracking

Atavic commented 7 years ago

I see it as a plus when something like the store doesn't work: it means that the block works.

FadeMind commented 7 years ago

@StevenBlack this issue can be closed cause

https://github.com/StevenBlack/hosts/commit/590eeac32ce1a2a6321ba577f732c6c017334827#diff-c36f927ba928cc2158b97e706cc80057R27503

Regards

Gitoffthelawn commented 7 years ago

@FadeMind That's a very useful diff, thank you!

But since it's a static diff, it doesn't provide a continually up-to-date source as the relevant hosts will undoubtedly change over time.
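One way to keep track of changes over time is to compare two snapshots of a hosts file and list what was added or removed. The sketch below assumes both snapshots are already available as text; the function names and the sample entries are made up for illustration.

```python
def hosts_in(text):
    """Collect the set of hostnames found in hosts-format text."""
    names = set()
    for line in text.splitlines():
        parts = line.split("#", 1)[0].split()  # ignore comments
        if len(parts) >= 2:
            names.update(parts[1:])  # skip the leading IP address
    return names


def diff_snapshots(old_text, new_text):
    """Return (added, removed) hostnames between two snapshots, sorted."""
    old, new = hosts_in(old_text), hosts_in(new_text)
    return sorted(new - old), sorted(old - new)


old_snapshot = "0.0.0.0 a.example\n0.0.0.0 b.example\n"
new_snapshot = "0.0.0.0 b.example\n0.0.0.0 c.example\n"

added, removed = diff_snapshots(old_snapshot, new_snapshot)
print("added:", added)
print("removed:", removed)
```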

FadeMind commented 7 years ago

@Gitoffthelawn see: https://github.com/StevenBlack/hosts/pull/185 and https://github.com/FadeMind/hosts.extras

StevenBlack commented 7 years ago

Thanks Tomasz @FadeMind.

Gitoffthelawn commented 7 years ago

@FadeMind Thank you Tomasz!

monstertruckpa commented 6 years ago

Hey Steven, how are you? Thanks for the hosts list, but I need it as a ".txt" file with a direct link, not as a "raw" version. Could you please publish it in text format so that I can add your hosts to "Hosts manager from abelhas"? It does not allow adding a raw list without a file extension.

monstertruckpa commented 6 years ago

CHEF-KOCH, I cannot do what you say. https://lut.im/vnr0ezcAW4/ClnsNV0e9CCMsYgT.png

monstertruckpa commented 6 years ago

http://raw.githubusercontent.com/StevenBlack/hosts/master/alternates/social/hosts

StevenBlack commented 6 years ago

Hi @monstertruckpa can you try using a non-Github mirror link – see the last column in the table in the readme.

http://sbc.io/hosts/hosts

Or, say,

http://sbc.io/hosts/alternates/social/hosts

monstertruckpa commented 6 years ago

@STEVENBLACK The secondary mirror filehost works perfectly and gave me no problems. Thanks Steven, I'm so happy with the abelhas program updater + your yappaplus hosts block list. YUPI!

ScriptTiger commented 6 years ago

I like your ideas, @CHEF-KOCH. Modern Windows versions come standard with the Windows Firewall, which can also block IP ranges/networks. I have a geoIP project I started for local resolutions using the GeoLite2 files from MaxMind, and this could easily be integrated with Steven Black's hosts files by first resolving all the NS records for a domain, and then resolving those IPs to networks/ASNs as you have described above. I see you don't yet support IPv6; I have also been slow to support it just because I have been a bit lazy to implement the calculations for it. But making a seamless process to join Steven Black's hosts files + GeoLite2 + Windows Firewall would be a great smart firewall that doesn't require any additional executables and can be solely scripted to do the resolutions from the hosts files and then configure the firewall. I realize this does nothing for your Linux project, but I just wanted you to know I have definitely been inspired by your comments.
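A rough Python sketch of the "IPs to networks" step might look like the following. It is not the geoIP project mentioned above: it assumes the hostnames have already been resolved, uses placeholder names and documentation-range addresses, and stands in a fixed /24 prefix for a real GeoLite2 network/ASN lookup. DNS resolution and the actual firewall configuration are left out.

```python
import ipaddress
from collections import defaultdict


def group_by_network(resolved, prefix_len=24):
    """Map each covering IPv4 network to the hostnames that resolved into it.

    `resolved` is a dict of hostname -> list of IP strings. The fixed
    prefix length is an assumption replacing a real network/ASN lookup.
    """
    groups = defaultdict(set)
    for host, ips in resolved.items():
        for ip in ips:
            addr = ipaddress.ip_address(ip)
            if addr.version != 4:
                continue  # IPv4 only, mirroring the limitation mentioned above
            net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
            groups[net].add(host)
    return dict(groups)


# Placeholder data: hostnames and addresses are made up (documentation ranges).
resolved = {
    "a.example": ["192.0.2.10", "192.0.2.77"],
    "b.example": ["192.0.2.200", "198.51.100.5"],
    "c.example": ["2001:db8::1"],
}

for net, hosts in sorted(group_by_network(resolved).items()):
    print(net, sorted(hosts))
```

Each resulting network could then be turned into one block rule, which keeps the rule count far below one rule per hostname.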

Atavic commented 6 years ago

Wait, you are setting rules to Firewall as ADMINISTRATOR.

Windows 10 by default can bypass certain FQDNs

That's because the OS has user profiles with more privileges than Admin (You).

Just use Process Explorer by Sysinternals and look at the properties of some svchost.exe instances. You'll see:

USERDOMAIN: NT AUTHORITY
USERNAME: LOCAL SERVICE

This user profile is the system itself and is greater than Admin.

Atavic commented 6 years ago

LOCAL SERVICE can override the Firewall rules set by ADMINISTRATOR

monstertruckpa commented 6 years ago

Thank You CHEFKOCH for your explanation. very helpful all what you tell us.

ScriptTiger commented 6 years ago

@Tobias-B-Besemer, maybe your crew might be interested in another project? A binary package to configure both the hosts file and Windows Firewall (see my post above). I have been dabbling in my geoIP project for a while just because I do have a day job and haven't gotten to completing the actual search script yet, but the core calculations for IPv4 are all there if you want to use my script as pseudo code for your project. Or you can wait for a while for me to draft up a script which actually does all that and then you can use it as pseudo code and port it to C#.

Tobias-B-Besemer commented 6 years ago

CC @D4rkCr0w, as he writes the code...

(For the others, we talk about: https://github.com/LV-Crew/HostsManager/ ) (Issue-Reports are welcome!)

StevenBlack commented 5 years ago

I'm not sure I understand, @maravento.

The whiteurls list will need modification to prefix 0.0.0.0 or 127.0.0.1 on each line, then you'll need to add its details to an update.json file, then generate your own hosts file using updatehosts.py.

Is that what you're asking?
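The first step, prefixing each line, can be sketched in Python as below. This assumes the input is a plain list with one domain per line; the function name and sample domains are hypothetical, and the update.json and updatehosts.py steps are not covered here.

```python
def to_hosts_format(lines, sink_ip="0.0.0.0"):
    """Prefix bare domains with a sink IP so they read as hosts entries."""
    out = []
    for raw in lines:
        entry = raw.strip()
        if not entry or entry.startswith("#"):
            out.append(entry)  # keep blank lines and comments unchanged
            continue
        if entry.split()[0] in ("0.0.0.0", "127.0.0.1"):
            out.append(entry)  # already in hosts format
            continue
        out.append(f"{sink_ip} {entry}")
    return out


domains = ["# my list", "one.example", "", "127.0.0.1 two.example"]
print("\n".join(to_hosts_format(domains)))
```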

ScriptTiger commented 5 years ago

There is no separate extension or specific data source for telemetry, they are all grouped together with malware, adware, and other generic unwanted players. You would have to go through and manually read the comments to separate out telemetry from everything else. Our sources are usually pretty well commented, so that is probably your best option.

ScriptTiger commented 5 years ago

If someone wants to take on a personal project, you could reach out to each of the data sources to see if they would be willing to use standardized tagging of some kind using comments to categorize each entry for why they curate each entry. If all of the data sources can agree on a standard, then the aggregate here can easily be filtered through a script to separate out categories further. There's nothing really we can do downstream once it gets to our end though and all of the data sources use their own disparate curation methods
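To make the idea concrete, here is a small Python sketch of what such downstream filtering could look like if the sources agreed on a tag. The `# [category]` comment convention is purely hypothetical; no data source uses it today as far as this thread shows, and the function name and sample entries are made up.

```python
import re
from collections import defaultdict

# Hypothetical convention: a trailing comment such as "# [telemetry]".
TAG = re.compile(r"#\s*\[(?P<tag>[a-z0-9_-]+)\]")


def split_by_tag(text, default="untagged"):
    """Group hostnames by the category tag found in each line's comment."""
    buckets = defaultdict(list)
    for line in text.splitlines():
        entry, _, comment = line.partition("#")
        parts = entry.split()
        if len(parts) < 2:
            continue  # blank line, or a comment with no hosts entry
        match = TAG.search("#" + comment)
        tag = match.group("tag") if match else default
        buckets[tag].append(parts[1])
    return dict(buckets)


tagged = """\
0.0.0.0 metrics.example  # [telemetry]
0.0.0.0 banner.example   # [ads]
0.0.0.0 stats.example    # [telemetry] second entry
0.0.0.0 other.example
"""

print(split_by_tag(tagged))
```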

Tobias-B-Besemer commented 5 years ago

There exists a PowerShell project https://github.com/W4RH4WK/Debloat-Windows-10 but there seem to be a lot of issues with it (https://github.com/Microsoft/MS-DOS/issues/395#issuecomment-478330019).

rautamiekka commented 5 years ago

There exists a PowerShell project https://github.com/W4RH4WK/Debloat-Windows-10 but there seem to be a lot of issues with it (Microsoft/MS-DOS#395 (comment)).

Don't use the one by W4RH4WK! It's too buggy to be functional; use https://github.com/Sycnex/Windows10Debloater instead.

CraigHead commented 4 years ago

Just an FYI, blocking watson.telemetry.microsoft.com caused XBox Update to fail to start the updating process with the latest Xbox OS patch. I'm leaving this comment here for posterity. Maybe it'll help someone else.

Gitoffthelawn commented 4 years ago

@CraigHead Thanks for posting Craig. That's real ugly behavior if MS is breaking things when telemetry is blocked.

ScriptTiger commented 4 years ago

Most of the mentioned domains are in no relationship with telemetry and added based on pure speculation as well as "guessing" e.g. the live.login domain.

Interestingly I was working on a Windows 10 machine and I accidentally clicked on some help dialog and Microsoft Edge popped open, which is not the default Web browser, and AVG actually popped up with a threat detection and made the following report:

  • AVG Real-time Shield Scan Report
  • This file is generated automatically
  • Started on: Thursday, February 6, 2020 4:16:13 PM

2/6/2020 6:09:19 PM https://login.live.com/login.srf? CENSORED - [L] URL:Phishing (0)

I censored the full URL from the report, but obviously the Avast team agrees that login.live.com is a threat indicator. Anybody else get any similar reports from other antivirus or similar software? I am by no means saying I agree with everything the Avast team has to say, but I will say there does seem to be a professional consensus around this particular topic. I'll also note that login.live.com is not currently on any of our blacklists.

From doing some quick research, it seems like login.live.com has been involved in several high-volume e-mail phishing scams. I cannot say if the domain itself is malicious, but I believe malicious players are using it to such a degree in phishing campaigns as to make the domain itself an indicator of possible threat, even though it may not be a threat itself.

ScriptTiger commented 4 years ago

I completely agree with everything you've said, @CHEF-KOCH, don't get me wrong. Obviously the highest profile domains are the ones that are going to get attacked the most. However, that varies slightly from this particular case of BEC, business e-mail compromise, which is currently the number one fraud threat for organizations. I am not talking about attacking Microsoft, I am talking about misrepresenting Microsoft domains to target Microsoft end users with phishing scams, which in turn degrades the reputation data of otherwise known good domains by associating them at high volume to threat actors.

So the real targets here are not the owners of the high-profile domains, but rather the end users, who are obviously many in number, which works well with cold e-mails and doesn't necessarily require any special spear phishing tactics to be effective. High-profile domains getting flagged as a result of this reputation data is just collateral damage, possibly even secondary targets, but they are not the primary targets or focus of these attacks.

I think the thing to remember for all of us, especially those security professionals among us, is that although we may have been trained to subconsciously detect phishing e-mails and spam at this point, this vector is still wildly successful and is still a serious threat to those some of us are charged to protect. I still feel perplexed every time a family member or friend comes to me and tells me how deep they got pulled into an e-mail scam of some kind. They just don't have the same mental filters of those technically and security-inclined, as such things are just not top of mind for the average person.

Obviously no matter how you take any of this it still may seem like a silly idea to flag such high-profile domains based solely on this reputation data, especially since the domain operators did nothing wrong and their domains pose absolutely no risk by themselves. However, most threat protection is automated to a high degree these days and even high-profile domains are not differentiated if the indicators are all there to raise alarm. And I am by no means making an excuse for the possible high number of false positives, but the systems are currently set up to respond to threats as rapidly as possible and protect the greatest number of people possible, which is obviously going to have many errors like any numbers game.

Domains owned by Netflix and PayPal also get flagged all the time for similar reasons. It's basically impossible to stop them from popping up on blacklists during sustained high-volume phishing campaigns or times where it's trending to misrepresent specific companies in such campaigns because of the sheer volume of data being generated, albeit possibly completely irrelevant data. For the most part, hackers are "monkey see, monkey do" and try copying attacks they read about in the news, so a lot of times you'll see Netflix scams trending for a while, PayPal scams, Microsoft scams, etc. Again, I am not making any excuses, just putting it out there that these things will inevitably happen until our current implementations of machine learning and artificial intelligence can discern such situations on their own.