Closed XhmikosR closed 4 years ago
Hello! Thank you for opening your first issue in this repo. It’s people like you who make these host files better!
2 ==> #1165
1 ==> It comes from the readme_template. It's probably because you're not reading in UTF-8
?
I got frustrated with this and just started my own repo, but it's good to see some work being done on this again.
@funilrys about the second point, there's still at least one place where we end up with backslashes, see https://github.com/StevenBlack/hosts/pull/1165#issuecomment-590111613
About the UTF issue, I didn't do anything myself, I just ran the scripts.
@StevenBlack there's still an issue with backslashes, see my comment above https://github.com/StevenBlack/hosts/issues/1166#issuecomment-590247036
BTW what are the exact scripts/commands you run @StevenBlack to generate the files? I'm asking because whenever I try it on Windows I get too many changes in each file.
@XhmikosR I haven't used Windows in at least 10 years. I'm on macOS and Ubuntu.
I run makeHosts.py which generates all the variants, in turn.
What do you mean, "I get too many changes in each file"?
@StevenBlack understandable, but please re-open the issue until we manage to fix everything; we are so close :)
C:\Users\xmr\Desktop\hosts>git status
On branch master
Your branch is up to date with 'origin/master'.
nothing to commit, working tree clean
C:\Users\xmr\Desktop\hosts>python makeHosts.py
Updating source data\adaway.org from https://raw.githubusercontent.com/AdAway/adaway.github.io/master/hosts.txt
Updating source data\add.2o7Net from https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.2o7Net/hosts
Updating source data\add.Dead from https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.Dead/hosts
Updating source data\add.Risk from https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.Risk/hosts
Updating source data\add.Spam from https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.Spam/hosts
Updating source data\Badd-Boyz-Hosts from https://raw.githubusercontent.com/mitchellkrogza/Badd-Boyz-Hosts/master/hosts
Updating source data\hostsVN from https://raw.githubusercontent.com/bigdargon/hostsVN/master/option/hosts-VN
Updating source data\KADhosts from https://raw.githubusercontent.com/PolishFiltersTeam/KADhosts/master/KADhosts_without_controversies.txt
Updating source data\malwaredomainlist.com from https://www.malwaredomainlist.com/hostslist/hosts.txt
Updating source data\mvps.org from https://winhelp2002.mvps.org/hosts.txt
Updating source data\someonewhocares.org from https://someonewhocares.org/hosts/zero/hosts
Updating source data\StevenBlack from https://raw.githubusercontent.com/StevenBlack/hosts/master/data/StevenBlack/hosts
Updating source data\tiuxo from https://raw.githubusercontent.com/tiuxo/hosts/master/ads
Updating source data\UncheckyAds from https://raw.githubusercontent.com/FadeMind/hosts.extras/master/UncheckyAds/hosts
Updating source data\yoyo.org from https://pgl.yoyo.org/adservers/serverlist.php?hostformat=hosts&mimetype=plaintext&useip=0.0.0.0
Updating source extensions\fakenews from https://raw.githubusercontent.com/marktron/fakenews/master/fakenews
Updating source extensions\gambling from https://raw.githubusercontent.com/Sinfonietta/hostfiles/master/gambling-hosts
Updating source extensions\porn\clefspeare13 from https://raw.githubusercontent.com/Clefspeare13/pornhosts/master/0.0.0.0/hosts
Updating source extensions\porn\sinfonietta from https://raw.githubusercontent.com/Sinfonietta/hostfiles/master/pornography-hosts
Updating source extensions\porn\sinfonietta-snuff from https://raw.githubusercontent.com/Sinfonietta/hostfiles/master/snuff-hosts
Updating source extensions\porn\tiuxo from https://raw.githubusercontent.com/tiuxo/hosts/master/porn
Updating source extensions\social\sinfonietta from https://raw.githubusercontent.com/Sinfonietta/hostfiles/master/social-hosts
Updating source extensions\social\tiuxo from https://raw.githubusercontent.com/tiuxo/hosts/master/social
==>fe00::0 ip6-localnet<==
==>ff00::0 ip6-mcastprefix<==
==>ff02::2 ip6-allrouters<==
==>ff02::3 ip6-allhosts<==
Success! The hosts file has been saved in folder alternates/gambling
It contains 54,006 unique entries.
==>fe00::0 ip6-localnet<==
==>ff00::0 ip6-mcastprefix<==
==>ff02::2 ip6-allrouters<==
==>ff02::3 ip6-allhosts<==
Success! The hosts file has been saved in folder alternates/porn
It contains 67,654 unique entries.
==>fe00::0 ip6-localnet<==
==>ff00::0 ip6-mcastprefix<==
==>ff02::2 ip6-allrouters<==
==>ff02::3 ip6-allhosts<==
Success! The hosts file has been saved in folder alternates/social
It contains 54,153 unique entries.
==>fe00::0 ip6-localnet<==
==>ff00::0 ip6-mcastprefix<==
==>ff02::2 ip6-allrouters<==
==>ff02::3 ip6-allhosts<==
Success! The hosts file has been saved in folder alternates/fakenews
It contains 52,627 unique entries.
==>fe00::0 ip6-localnet<==
==>ff00::0 ip6-mcastprefix<==
==>ff02::2 ip6-allrouters<==
==>ff02::3 ip6-allhosts<==
Success! The hosts file has been saved in folder alternates/fakenews-gambling
It contains 54,948 unique entries.
==>fe00::0 ip6-localnet<==
==>ff00::0 ip6-mcastprefix<==
==>ff02::2 ip6-allrouters<==
==>ff02::3 ip6-allhosts<==
Success! The hosts file has been saved in folder alternates/fakenews-porn
It contains 68,596 unique entries.
==>fe00::0 ip6-localnet<==
==>ff00::0 ip6-mcastprefix<==
==>ff02::2 ip6-allrouters<==
==>ff02::3 ip6-allhosts<==
Success! The hosts file has been saved in folder alternates/fakenews-social
It contains 55,095 unique entries.
==>fe00::0 ip6-localnet<==
==>ff00::0 ip6-mcastprefix<==
==>ff02::2 ip6-allrouters<==
==>ff02::3 ip6-allhosts<==
Success! The hosts file has been saved in folder alternates/gambling-porn
It contains 69,975 unique entries.
==>fe00::0 ip6-localnet<==
==>ff00::0 ip6-mcastprefix<==
==>ff02::2 ip6-allrouters<==
==>ff02::3 ip6-allhosts<==
Success! The hosts file has been saved in folder alternates/gambling-social
It contains 56,474 unique entries.
==>fe00::0 ip6-localnet<==
==>ff00::0 ip6-mcastprefix<==
==>ff02::2 ip6-allrouters<==
==>ff02::3 ip6-allhosts<==
Success! The hosts file has been saved in folder alternates/porn-social
It contains 70,121 unique entries.
==>fe00::0 ip6-localnet<==
==>ff00::0 ip6-mcastprefix<==
==>ff02::2 ip6-allrouters<==
==>ff02::3 ip6-allhosts<==
Success! The hosts file has been saved in folder alternates/fakenews-gambling-porn
It contains 70,917 unique entries.
==>fe00::0 ip6-localnet<==
==>ff00::0 ip6-mcastprefix<==
==>ff02::2 ip6-allrouters<==
==>ff02::3 ip6-allhosts<==
Success! The hosts file has been saved in folder alternates/fakenews-gambling-social
It contains 57,416 unique entries.
==>fe00::0 ip6-localnet<==
==>ff00::0 ip6-mcastprefix<==
==>ff02::2 ip6-allrouters<==
==>ff02::3 ip6-allhosts<==
Success! The hosts file has been saved in folder alternates/fakenews-porn-social
It contains 71,063 unique entries.
==>fe00::0 ip6-localnet<==
==>ff00::0 ip6-mcastprefix<==
==>ff02::2 ip6-allrouters<==
==>ff02::3 ip6-allhosts<==
Success! The hosts file has been saved in folder alternates/gambling-porn-social
It contains 72,442 unique entries.
==>fe00::0 ip6-localnet<==
==>ff00::0 ip6-mcastprefix<==
==>ff02::2 ip6-allrouters<==
==>ff02::3 ip6-allhosts<==
Success! The hosts file has been saved in folder alternates/fakenews-gambling-porn-social
It contains 73,384 unique entries.
==>fe00::0 ip6-localnet<==
==>ff00::0 ip6-mcastprefix<==
==>ff02::2 ip6-allrouters<==
==>ff02::3 ip6-allhosts<==
Success! The hosts file has been saved in folder
It contains 51,685 unique entries.
which results in a diff like this:
alternates/fakenews-gambling-porn-social/hosts | 80242 +++++++++---------
alternates/fakenews-gambling-porn-social/readme.md | 184 +-
alternates/fakenews-gambling-porn/hosts | 80242 +++++++++---------
alternates/fakenews-gambling-porn/readme.md | 180 +-
alternates/fakenews-gambling-social/hosts | 80216 +++++++++---------
alternates/fakenews-gambling-social/readme.md | 176 +-
alternates/fakenews-gambling/hosts | 80216 +++++++++---------
alternates/fakenews-gambling/readme.md | 172 +-
alternates/fakenews-porn-social/hosts | 80244 ++++++++++---------
alternates/fakenews-porn-social/readme.md | 182 +-
alternates/fakenews-porn/hosts | 80244 ++++++++++---------
alternates/fakenews-porn/readme.md | 178 +-
alternates/fakenews-social/hosts | 80216 +++++++++---------
alternates/fakenews-social/readme.md | 174 +-
alternates/fakenews/hosts | 80216 +++++++++---------
alternates/fakenews/readme.md | 170 +-
alternates/gambling-porn-social/hosts | 80242 +++++++++---------
alternates/gambling-porn-social/readme.md | 182 +-
alternates/gambling-porn/hosts | 80242 +++++++++---------
alternates/gambling-porn/readme.md | 178 +-
alternates/gambling-social/hosts | 80216 +++++++++---------
alternates/gambling-social/readme.md | 174 +-
alternates/gambling/hosts | 80216 +++++++++---------
alternates/gambling/readme.md | 170 +-
alternates/porn-social/hosts | 80240 +++++++++---------
alternates/porn-social/readme.md | 180 +-
alternates/porn/hosts | 80240 +++++++++---------
alternates/porn/readme.md | 176 +-
alternates/social/hosts | 80216 +++++++++---------
alternates/social/readme.md | 172 +-
data/Badd-Boyz-Hosts/hosts | 2 +-
data/KADhosts/hosts | 6 +-
data/adaway.org/hosts | 4 +-
data/hostsVN/hosts | 2 +-
data/someonewhocares.org/hosts | 5 +-
data/yoyo.org/hosts | 2 +-
hosts | 80216 +++++++++---------
readme.md | 168 +-
readmeData.json | 2 +-
39 files changed, 643446 insertions(+), 643057 deletions(-)
The changes are not in the line endings; they are in the way the entries are sorted, which I can see in the readme too:
Host file source | Description | Home page | Raw hosts | Update frequency | License | Issues
-----------------|-------------|:---------:|:---------:|:----------------:|:-------:|:------:
Steven Black's ad-hoc list | Additional sketch domains as I come across them. |[link](https://github.com/StevenBlack/hosts/blob/master/data/StevenBlack/hosts) | [raw](https://raw.githubusercontent.com/StevenBlack/hosts/master/data/StevenBlack/hosts) | occasionally | MIT | [issues](https://github.com/StevenBlack/hosts/issues)
Malware Domain List | Malware Domain List is a non-commercial community project. |[link](https://www.malwaredomainlist.com/) | [raw](https://www.malwaredomainlist.com/hostslist/hosts.txt) | weekly | 'can be used for free by anyone' | [issues](https://www.malwaredomainlist.com/contact.php)
add.Dead | Dead sites based on [hostsfile.org](http://www.hostsfile.org/hosts.html) content. |[link](https://github.com/FadeMind/hosts.extras) | [raw](https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.Dead/hosts) | occasionally | GPLv3+ | [issues](https://github.com/FadeMind/hosts.extras/issues)
hostsVN | Hosts block ads of Vietnamese |[link](https://github.com/bigdargon/hostsVN) | [raw](https://raw.githubusercontent.com/bigdargon/hostsVN/master/option/hosts-VN) | occasionally | MIT | [issues](https://github.com/bigdargon/hostsVN/issues)
add.Spam | Spam sites based on [hostsfile.org](http://www.hostsfile.org/hosts.html) content. |[link](https://github.com/FadeMind/hosts.extras) | [raw](https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.Spam/hosts) | occasionally | GPLv3+ | [issues](https://github.com/FadeMind/hosts.extras/issues)
Dan Pollock – [someonewhocares](https://someonewhocares.org) | How to make the internet not suck (as much). |[link](https://someonewhocares.org/hosts/) | [raw](https://someonewhocares.org/hosts/zero/hosts) | frequently | non-commercial with attribution | [issues](hosts@someonewhocares.org)
MVPS hosts file | The purpose of this site is to provide the user with a high quality custom HOSTS file. |[link](http://winhelp2002.mvps.org/) | [raw](http://winhelp2002.mvps.org/hosts.txt) | monthly | CC BY-NC-SA 4.0 | [issues](mailto:winhelp2002@gmail.com)
yoyo.org | Blocking with ad server and tracking server hostnames. |[link](https://pgl.yoyo.org/adservers/) | [raw](https://pgl.yoyo.org/adservers/serverlist.php?hostformat=hosts&mimetype=plaintext&useip=0.0.0.0) | frequently | | [issues](mailto:pgl@yoyo.org)
Mitchell Krog's - Badd Boyz Hosts | Sketchy domains and Bad Referrers from my Nginx and Apache Bad Bot and Spam Referrer Blockers |[link](https://github.com/mitchellkrogza/Badd-Boyz-Hosts) | [raw](https://raw.githubusercontent.com/mitchellkrogza/Badd-Boyz-Hosts/master/hosts) | weekly | MIT | [issues](https://github.com/mitchellkrogza/Badd-Boyz-Hosts/issues)
UncheckyAds | Windows installers ads sources sites based on https://unchecky.com/ content. |[link](https://github.com/FadeMind/hosts.extras) | [raw](https://raw.githubusercontent.com/FadeMind/hosts.extras/master/UncheckyAds/hosts) | occasionally | | [issues](https://github.com/FadeMind/hosts.extras/issues)
add.2o7Net | 2o7Net tracking sites based on [hostsfile.org](http://www.hostsfile.org/hosts.html) content. |[link](https://github.com/FadeMind/hosts.extras) | [raw](https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.2o7Net/hosts) | occasionally | GPLv3+ | [issues](https://github.com/FadeMind/hosts.extras/issues)
KADhosts | Fraud/adware/scam websites. |[link](https://kadantiscam.netlify.com) | [raw](https://raw.githubusercontent.com/PolishFiltersTeam/KADhosts/master/KADhosts_without_controversies.txt) | frequently | CC BY-SA 4.0 | [issues](https://github.com/PolishFiltersTeam/KADhosts/issues)
AdAway | AdAway is an open source ad blocker for Android using the hosts file. |[link](https://adaway.org/) | [raw](https://raw.githubusercontent.com/AdAway/adaway.github.io/master/hosts.txt) | occasionally | CC BY 3.0 | [issues](https://github.com/AdAway/AdAway/issues)
add.Risk | Risk content sites based on [hostsfile.org](http://www.hostsfile.org/hosts.html) content. |[link](https://github.com/FadeMind/hosts.extras) | [raw](https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.Risk/hosts) | occasionally | GPLv3+ | [issues](https://github.com/FadeMind/hosts.extras/issues)
Tiuxo hostlist - ads | Categorized hosts files for DNS based content blocking |[link](https://github.com/tiuxo/hosts) | [raw](https://raw.githubusercontent.com/tiuxo/hosts/master/ads) | occasional | CC BY 4.0 | [issues](https://github.com/tiuxo/hosts/issues)
becomes:
Host file source | Description | Home page | Raw hosts | Update frequency | License | Issues
-----------------|-------------|:---------:|:---------:|:----------------:|:-------:|:------:
AdAway | AdAway is an open source ad blocker for Android using the hosts file. |[link](https://adaway.org/) | [raw](https://raw.githubusercontent.com/AdAway/adaway.github.io/master/hosts.txt) | occasionally | CC BY 3.0 | [issues](https://github.com/AdAway/AdAway/issues)
add.2o7Net | 2o7Net tracking sites based on [hostsfile.org](http://www.hostsfile.org/hosts.html) content. |[link](https://github.com/FadeMind/hosts.extras) | [raw](https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.2o7Net/hosts) | occasionally | GPLv3+ | [issues](https://github.com/FadeMind/hosts.extras/issues)
add.Dead | Dead sites based on [hostsfile.org](http://www.hostsfile.org/hosts.html) content. |[link](https://github.com/FadeMind/hosts.extras) | [raw](https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.Dead/hosts) | occasionally | GPLv3+ | [issues](https://github.com/FadeMind/hosts.extras/issues)
add.Risk | Risk content sites based on [hostsfile.org](http://www.hostsfile.org/hosts.html) content. |[link](https://github.com/FadeMind/hosts.extras) | [raw](https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.Risk/hosts) | occasionally | GPLv3+ | [issues](https://github.com/FadeMind/hosts.extras/issues)
add.Spam | Spam sites based on [hostsfile.org](http://www.hostsfile.org/hosts.html) content. |[link](https://github.com/FadeMind/hosts.extras) | [raw](https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.Spam/hosts) | occasionally | GPLv3+ | [issues](https://github.com/FadeMind/hosts.extras/issues)
Mitchell Krog's - Badd Boyz Hosts | Sketchy domains and Bad Referrers from my Nginx and Apache Bad Bot and Spam Referrer Blockers |[link](https://github.com/mitchellkrogza/Badd-Boyz-Hosts) | [raw](https://raw.githubusercontent.com/mitchellkrogza/Badd-Boyz-Hosts/master/hosts) | weekly | MIT | [issues](https://github.com/mitchellkrogza/Badd-Boyz-Hosts/issues)
hostsVN | Hosts block ads of Vietnamese |[link](https://github.com/bigdargon/hostsVN) | [raw](https://raw.githubusercontent.com/bigdargon/hostsVN/master/option/hosts-VN) | occasionally | MIT | [issues](https://github.com/bigdargon/hostsVN/issues)
KADhosts | Fraud/adware/scam websites. |[link](https://kadantiscam.netlify.com) | [raw](https://raw.githubusercontent.com/PolishFiltersTeam/KADhosts/master/KADhosts_without_controversies.txt) | frequently | CC BY-SA 4.0 | [issues](https://github.com/PolishFiltersTeam/KADhosts/issues)
Malware Domain List | Malware Domain List is a non-commercial community project. |[link](https://www.malwaredomainlist.com/) | [raw](https://www.malwaredomainlist.com/hostslist/hosts.txt) | weekly | 'can be used for free by anyone' | [issues](https://www.malwaredomainlist.com/contact.php)
MVPS hosts file | The purpose of this site is to provide the user with a high quality custom HOSTS file. |[link](https://winhelp2002.mvps.org/) | [raw](https://winhelp2002.mvps.org/hosts.txt) | monthly | CC BY-NC-SA 4.0 | [issues](mailto:winhelp2002@gmail.com)
Dan Pollock – [someonewhocares](https://someonewhocares.org) | How to make the internet not suck (as much). |[link](https://someonewhocares.org/hosts/) | [raw](https://someonewhocares.org/hosts/zero/hosts) | frequently | non-commercial with attribution | [issues](hosts@someonewhocares.org)
Steven Black's ad-hoc list | Additional sketch domains as I come across them. |[link](https://github.com/StevenBlack/hosts/blob/master/data/StevenBlack/hosts) | [raw](https://raw.githubusercontent.com/StevenBlack/hosts/master/data/StevenBlack/hosts) | occasionally | MIT | [issues](https://github.com/StevenBlack/hosts/issues)
Tiuxo hostlist - ads | Categorized hosts files for DNS based content blocking |[link](https://github.com/tiuxo/hosts) | [raw](https://raw.githubusercontent.com/tiuxo/hosts/master/ads) | occasional | CC BY 4.0 | [issues](https://github.com/tiuxo/hosts/issues)
UncheckyAds | Windows installers ads sources sites based on https://unchecky.com/ content. |[link](https://github.com/FadeMind/hosts.extras) | [raw](https://raw.githubusercontent.com/FadeMind/hosts.extras/master/UncheckyAds/hosts) | occasionally | | [issues](https://github.com/FadeMind/hosts.extras/issues)
yoyo.org | Blocking with ad server and tracking server hostnames. |[link](https://pgl.yoyo.org/adservers/) | [raw](https://pgl.yoyo.org/adservers/serverlist.php?hostformat=hosts&mimetype=plaintext&useip=0.0.0.0) | frequently | | [issues](mailto:pgl@yoyo.org)
@XhmikosR please explain the problem here. I don't understand. That diff is perfectly normal. What do you expect? All hosts files, all readme files, get re-generated. This is 100% by design.
The order of sources listed is not deterministic; it never was. WTF cares? I certainly don't. readmeData.json is just a JSON structure and everything comes from that.
When you push a patch which updates the data, not every line changes. On Windows all lines change, not because of line endings, but because of the order in which the folders are traversed and thus the order in which the data are processed and output. You can see this in the readme part I pasted above.
I'm not saying it matters, it just doesn't make any sense, though.
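For readers following along, the traversal-order point can be illustrated with a minimal sketch. It assumes the generator collects source folders by listing a directory; `list_source_dirs` is a hypothetical helper, not a function from this repo. Directory listing order depends on the OS and filesystem, so sorting explicitly gives the same order everywhere:

```python
import os
import tempfile


def list_source_dirs(root):
    """Return sub-directory names of root in a platform-independent order."""
    names = [entry.name for entry in os.scandir(root) if entry.is_dir()]
    # Sort explicitly: the order returned by the OS is not guaranteed.
    # Case-insensitive key, with the raw name as a tie-breaker.
    return sorted(names, key=lambda name: (name.lower(), name))


# Throwaway folders named after a few of the sources listed above.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["StevenBlack", "adaway.org", "KADhosts", "yoyo.org", "add.Dead"]:
        os.mkdir(os.path.join(tmp, name))
    ordered = list_source_dirs(tmp)

print(ordered)
# ['adaway.org', 'add.Dead', 'KADhosts', 'StevenBlack', 'yoyo.org']
```

The case-insensitive key is a choice made here because it matches the order seen in the second readme table; any fixed key would make the output stable across platforms.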
Look, about Windows... I don't mean to be unkind in any way, but people who use Windows have 99 other problems.
This repo is meant to be a sysadmin thing. It makes hosts files. Honestly, I don't care what diffs Windows users get as long as the hosts files generate properly.
You know what curation is, in practice? Curation means, saying "no".
I don't care about this.
@funilrys it seems the README Unicode issue is back (or was never fixed completely) 🙁
For example:
-**Windows XP**: Start → Run → `cmd`
+**Windows XP**: Start → Run → `cmd`
-* [ViHoMa](https://github.com/cmabad/ViHoMa) is a Visual Hosts file Manager, written in Java, by Christian Martínez. Check it out!
+* [ViHoMa](https://github.com/cmabad/ViHoMa) is a Visual Hosts file Manager, written in Java, by Christian MartÃnez. Check it out!
-* [Blocking ads and malwares with unbound](https://deadc0de.re/articles/unbound-blocking-ads.html "Blocking ads and malwares with unbound") – [Unbound](https://www.unbound.net/ "Unbound is a validating, recursive, and caching DNS resolver.") is a validating, recursive, and caching DNS resolver.
+* [Blocking ads and malwares with unbound](https://deadc0de.re/articles/unbound-blocking-ads.html "Blocking ads and malwares with unbound") – [Unbound](https://www.unbound.net/ "Unbound is a validating, recursive, and caching DNS resolver.") is a validating, recursive, and caching DNS resolver.
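The "MartÃnez" line above is what UTF-8 text looks like when it is decoded as cp1252, a common default on Windows when `open()` is called without an `encoding` argument. The sketch below reproduces the effect on a throwaway file; the file name and `read_text` helper are hypothetical, and whether the repo's scripts read the template this way is an assumption:

```python
import os
import tempfile


def read_text(path, encoding):
    """Read a whole text file using an explicit encoding."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()


# A small UTF-8 sample standing in for the readme template.
sample = "Start → Run, by Christian Martínez"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "readme_template_sample.md")  # hypothetical name
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)

    good = read_text(path, "utf-8")      # round-trips unchanged
    garbled = read_text(path, "cp1252")  # same bytes, wrong decoder

print(good)
print(garbled)
```

Passing `encoding="utf-8"` on every read and write of the template and generated readme files would avoid depending on the platform default.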
Honestly, Windows is such a shitshow. I know this doesn't help this issue; sometimes I just need to vent. @XhmikosR @funilrys
There's only one last issue on Windows after #1296 is merged.
readmeData.json still contains 2 backslashes at the end of the location strings, for example:
{
"fakenews-gambling-porn": {
"location": "alternates/fakenews-gambling-porn\\",
}
}
I tried to fix it without success so far. Maybe @funilrys you have some idea.
That being said, finally everything is the same on Windows after #1296. 🙂
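One possible approach to the trailing backslashes, sketched under assumptions: the location string is built with the platform path separator, and forward slashes are the wanted form in readmeData.json. `normalize_location` is a hypothetical helper, not part of the repo. Note that `json.dumps` writes a single backslash as `\\`, which is why the file shows two:

```python
import json


def normalize_location(path):
    """Return path with every backslash replaced by a forward slash."""
    return path.replace("\\", "/")


# As built on Windows: each "\\" in this literal is one real backslash.
windows_style = "alternates\\fakenews-gambling-porn\\"

entry = {
    "fakenews-gambling-porn": {"location": normalize_location(windows_style)}
}
serialized = json.dumps(entry)
print(serialized)
# {"fakenews-gambling-porn": {"location": "alternates/fakenews-gambling-porn/"}}
```

Normalizing just before the data is written keeps the rest of the script free to use native paths.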
After #1157 lands, one of the issues I face on Windows will be fixed. This leaves us with two more issues I've noticed so far:
Christian Martínez becomes Christian MartÃnez