StevenBlack / hosts

🔒 Consolidating and extending hosts files from several well-curated sources. Optionally pick extensions for porn, social media, and other categories.
MIT License

Add UT1 blocklists #2562

Open vitorsr opened 4 months ago

vitorsr commented 4 months ago

In line with consolidating reputable hosts files, consider adding (some) UT1 blocklists.

Useful sources to consider include the gambling list.

Others, such as bank, could serve as an upstream allowlist.

Still others, such as adult (previously discussed in #830, now with 4 million entries), are too large to be used as a primary source.

The lists are available at:

https://dsi.ut-capitole.fr/blacklists/

As an example, this is a GitHub mirror that NextDNS previously used for gambling blocklisting:

https://github.com/nextdns/metadata/blob/7b24bdcf774f1234f9cd5489e95bf054d51de780/parentalcontrol/categories/gambling.json#L4

StevenBlack commented 4 months ago

Thank you for this, Vitor @vitorsr.

This source offers interesting granularity.

What does everybody think?

vitorsr commented 4 months ago

Ideally, smaller sources such as gambling could be incorporated directly (after sanitization and validation) into the gambling extension or into the @Sinfonietta upstream. This would match how NextDNS filtered gambling before they internalized their filter metadata sources.
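
To make "sanitization and validation" concrete, here is a minimal sketch in Python. It assumes the UT1 category file is a plain list with one entry per line, possibly mixed with IP addresses and malformed names, and that the output should use the `0.0.0.0 domain` hosts format. The function names `is_valid_domain` and `to_hosts_lines` are hypothetical and not part of this repo's tooling.

```python
import re

# Assumption: a valid label is 1-63 characters of a-z, 0-9, or hyphen,
# and does not start or end with a hyphen.
_LABEL = re.compile(r"^(?!-)[a-z0-9-]{1,63}(?<!-)$")


def is_valid_domain(name: str) -> bool:
    """Return True if `name` looks like a multi-label hostname."""
    if len(name) > 253 or "." not in name:
        return False
    labels = name.split(".")
    # An all-numeric last label means an IPv4 address, not a hostname.
    if labels[-1].isdigit():
        return False
    return all(_LABEL.match(label) for label in labels)


def to_hosts_lines(raw_lines):
    """Normalize, validate, deduplicate, and emit sorted hosts entries."""
    seen = set()
    for line in raw_lines:
        # Drop inline comments, whitespace, case, and a trailing root dot.
        entry = line.split("#", 1)[0].strip().lower().rstrip(".")
        if entry and is_valid_domain(entry):
            seen.add(entry)
    return [f"0.0.0.0 {domain}" for domain in sorted(seen)]
```

Rejecting IP addresses and underscore names up front is a design choice for this sketch; a real pipeline might also subtract an allowlist before emitting entries.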

Others, such as adult, are large but can still be useful given some aggregation logic. For example, a website ranking data set such as Tranco [1] can be used to keep only the top n most-accessed domains (which were therefore alive for at least part of the ranking's data collection period).

[1] https://tranco-list.eu/
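
The ranking-based aggregation described above could look roughly like the following sketch. It assumes the ranking is available as CSV text with `rank,domain` rows, and that a blocklist entry should be kept when it, or any parent domain of it, appears in the top n. The names `load_top_n` and `filter_by_ranking` are hypothetical, and the cutoff n is left to the maintainer.

```python
import csv
import io


def load_top_n(ranking_csv_text, n):
    """Collect the domains ranked 1..n from `rank,domain` CSV text."""
    top = set()
    for row in csv.reader(io.StringIO(ranking_csv_text)):
        if len(row) != 2:
            continue  # skip blank or malformed rows
        rank, domain = row
        if rank.isdigit() and int(rank) <= n:
            top.add(domain.strip().lower())
    return top


def filter_by_ranking(blocklist, top_domains):
    """Keep blocklist entries whose name or a parent domain is ranked."""
    kept = []
    for domain in blocklist:
        labels = domain.lower().split(".")
        # Every suffix with at least two labels, e.g. for www.a.com:
        # {"www.a.com", "a.com"}. Bare TLDs are never considered.
        suffixes = {".".join(labels[i:]) for i in range(len(labels) - 1)}
        if suffixes & top_domains:
            kept.append(domain)
    return kept
```

Matching on parent domains is an assumption made here because ranking lists tend to record registrable domains while blocklists often carry subdomains; exact matching would be stricter and keep fewer entries.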