Open vitorsr opened 10 months ago
Hello! Thank you for opening your first issue in this repo. It’s people like you who make these host files better!
Thank you for this, Vitor (@vitorsr).
This source offers interesting granularity.
What does everybody think?
Ideally, smaller sources such as gambling could be incorporated directly (after sanitization and validation) into the gambling extension or into the @Sinfonietta upstream. This would match what NextDNS gambling filtering did before they internalized their filter metadata sources.
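To make "sanitization and validation" a bit more concrete, here is a minimal Python sketch of what that step could look like. It is only an illustration: the function names, the validation rules (lowercase letter-digit-hyphen labels, at least two labels, no bare IPv4 addresses), and the `0.0.0.0` sink are my assumptions, not how this repo or UT1 actually processes entries.

```python
import re

# Assumed rule: a label is 1-63 chars of a-z, 0-9 or "-", not starting or ending with "-".
_LABEL = re.compile(r"^(?!-)[a-z0-9-]{1,63}(?<!-)$")


def sanitize_domain(line):
    """Return a normalized domain, or None if the line is not a plausible hostname."""
    # Drop inline comments, surrounding whitespace, case, and a trailing root dot.
    candidate = line.split("#", 1)[0].strip().lower().rstrip(".")
    if not candidate or len(candidate) > 253:
        return None
    labels = candidate.split(".")
    if len(labels) < 2:
        return None  # single-label names such as "localhost"
    if not all(_LABEL.match(label) for label in labels):
        return None
    if labels[-1].isdigit():
        return None  # all-numeric last label, e.g. a bare IPv4 address
    return candidate


def to_hosts_lines(raw_lines, sink="0.0.0.0"):
    """Convert raw domain lines to de-duplicated hosts-file lines, keeping input order."""
    seen = set()
    out = []
    for line in raw_lines:
        domain = sanitize_domain(line)
        if domain and domain not in seen:
            seen.add(domain)
            out.append(f"{sink} {domain}")
    return out
```

Feeding it the lines of a category's domain list would give hosts-style lines that are ready for review before merging.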
Others, such as adult, are numerous but can still be useful given some aggregation logic. For example, a website ranking data set such as Tranco [1] can be used to keep only the top n most-accessed domains (which were therefore alive for at least part of the ranking's data collection period).
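A rough sketch of that aggregation idea, assuming the ranking is available as `rank,domain` CSV rows sorted by rank with no header (which is how the Tranco download is laid out, as far as I know). The function names and the exact-match rule are hypothetical. Blocklist entries that are subdomains would need to be mapped to their registrable domain first, which this sketch does not do.

```python
import csv


def load_top_n(tranco_rows, n):
    """Collect the domains ranked 1..n from 'rank,domain' CSV rows."""
    top = set()
    for row in csv.reader(tranco_rows):
        if len(row) != 2:
            continue  # skip blank or malformed rows
        rank, domain = row
        if int(rank) > n:
            break  # assumes rows are sorted by ascending rank
        top.add(domain.strip().lower())
    return top


def keep_ranked(blocklist_domains, top_domains):
    """Keep only blocklist entries that also appear in the ranked set (exact match)."""
    return [d for d in blocklist_domains if d in top_domains]
```

For example, `keep_ranked(adult_domains, load_top_n(open("top-1m.csv"), 100000))` would shrink a very large list to the entries that also appear in the top 100,000. Both `adult_domains` and the file name are placeholders.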
In line with consolidating reputable hosts files, consider adding (some) UT1 blocklists.
Useful sources to consider include the gambling list.
Others, such as bank, could serve as an allowlist upstream.
Still others, such as adult (previously discussed in #830, and now at 4 million entries), are too numerous to be used as a primary source.
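As a sketch of how a category like bank might act as an allowlist, the snippet below drops any blocklist entry that equals an allowlisted domain or is a subdomain of one. The function names and the parent-domain matching rule are assumptions for illustration only. A real pipeline might prefer exact matches.

```python
def is_allowlisted(domain, allowlist):
    """True if the domain or any of its parent domains is in the allowlist set."""
    labels = domain.split(".")
    # For a.b.c this checks a.b.c, then b.c, then c.
    return any(".".join(labels[i:]) in allowlist for i in range(len(labels)))


def apply_allowlist(blocklist, allowlist):
    """Return blocklist entries not covered by the allowlist, keeping input order."""
    allowed = set(allowlist)
    return [d for d in blocklist if not is_allowlisted(d, allowed)]
```

With this rule, allowlisting `bank.example` also protects `login.bank.example`, which seems like the safer default for banking domains.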
The lists are available at:
https://dsi.ut-capitole.fr/blacklists/
As an example, this is a GitHub mirror that NextDNS previously used for gambling blocklisting:
https://github.com/nextdns/metadata/blob/7b24bdcf774f1234f9cd5489e95bf054d51de780/parentalcontrol/categories/gambling.json#L4