collinbarrett / FilterLists

:shield: The independent, comprehensive directory of filter and host lists for advertisements, trackers, malware, and annoyances.
https://filterlists.com
MIT License
1.36k stars 117 forks

automated mirror links #598

Open DandelionSprout opened 6 years ago

DandelionSprout commented 6 years ago

Given some threads that I participated in last week (especially https://github.com/NanoAdblocker/NanoCore/issues/220), I have become aware of GitCDN and GitHack. My knowledge of those websites is not intimate, but as I understand it, they are used to fetch raw files from GitHub repos upon request (and GitHack can fetch GitLab and Bitbucket files as well).

While the technical gains from using them seem pretty small to me, they can under certain circumstances be used to bypass ISP censorship, as well as overzealous network admins at schools and workplaces.

https://greasyfork.org/scripts/373361-github-gitcdn-button taught me that, for GitCDN at least, the URL conversion is very easy: e.g. https://raw.githubusercontent.com/DandelionSprout/adfilt/master/NorwegianList.txt is converted into https://gitcdn.xyz/repo/DandelionSprout/adfilt/master/NorwegianList.txt. I haven't tested with GitHack yet, but I presume it works much the same way.
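To illustrate how small that conversion is, here is a minimal Python sketch. The function name `raw_to_gitcdn` is made up for this example, and the rule it applies (swap the host and prepend `/repo`) is inferred only from the single URL pair above, so it may not cover every case GitCDN handles.

```python
from urllib.parse import urlsplit

RAW_HOST = "raw.githubusercontent.com"


def raw_to_gitcdn(raw_url: str) -> str:
    """Rewrite a raw.githubusercontent.com URL into its GitCDN form.

    Assumption: GitCDN mirrors the same /user/repo/branch/path layout
    under the /repo prefix, as in the example URL pair above.
    """
    parts = urlsplit(raw_url)
    if parts.netloc != RAW_HOST:
        raise ValueError(f"not a {RAW_HOST} URL: {raw_url}")
    # parts.path already starts with "/", e.g. "/user/repo/branch/file.txt"
    return f"https://gitcdn.xyz/repo{parts.path}"


print(raw_to_gitcdn(
    "https://raw.githubusercontent.com/DandelionSprout/adfilt/master/NorwegianList.txt"
))
```

URLs from any other host are rejected rather than guessed at, since only the GitHub raw format is known from this thread.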

I imagine this could be implemented on FilterLists.com by adding some extra buttons to the Subscribe and View tabs(?), ordered as e.g. View 1 - GitCDN - GitHack - View 2 - View 3.

And seeing that GitCDN and GitHack would be applicable to more than 500 lists, it'd be easier to generate the links automatically on FilterLists.com's server side than to add them all to FilterList.json manually.

KonoromiHimaries commented 6 years ago

Statically looks good too: https://www.staticaly.com/

collinbarrett commented 6 years ago

I think we can implement a data-driven way to surface all GitHub mirrors. List of mirrors to support below, let me know if there are more we should add. I'll begin looking into this.

List of Git mirrors to support:

KonoromiHimaries commented 6 years ago

Maybe jsDelivr: https://github.com/NanoAdblocker/NanoCore/issues/220#issuecomment-430668167

Atavic commented 5 years ago

I did a search for common lists like Fanboy and the CDN list provided by collinbarrett above.

I found some Fanboy lists on RawGit, whose homepage now says:

> RawGit has reached the end of its useful life

The same page offers the following alternatives:

* jsDelivr
* GitHub Pages
* [CodeSandbox](https://codesandbox.io/)
* [unpkg](https://unpkg.com/)

indolering commented 5 years ago

jsDelivr has mirrors in mainland China, which is essential since China basically introduces latency to all network requests outside of China. Cloudflare and others may have fixed their issues with Chinese connection points, but according to their website, "jsDelivr is the only public CDN with a valid ICP license issued by the Chinese government".

AFAIK, GitHack and GitCDN are pet projects; GitHack specifically states not to use it if 100% uptime is needed. Statically looks more professional and is operated by a company, but has only been around since October.

TBF, I'm biased to jsDelivr because I helped orchestrate some interactions between them and RawGit a few years ago. They are awesome people!

indolering commented 5 years ago

Not sure if uBlock or others support SRI, but it would be good if SHA-256 hashes were used to protect these resources.
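For reference, an SRI-style integrity value is just the base64-encoded digest with an algorithm prefix. A minimal Python sketch of computing one for a downloaded list follows; the function name `sri_sha256` is hypothetical, and nothing here implies that any blocker actually checks such a value.

```python
import base64
import hashlib


def sri_sha256(content: bytes) -> str:
    """Return an SRI-style integrity string ("sha256-<base64 digest>")."""
    digest = hashlib.sha256(content).digest()
    return "sha256-" + base64.b64encode(digest).decode("ascii")


# Example with placeholder bytes standing in for a fetched filter list.
print(sri_sha256(b"! Title: Example list\n||example.com^\n"))
```

A client would compare this string against a value published by the list maintainer and reject the download on mismatch.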

Atavic commented 5 years ago

Declined, see: https://github.com/gorhill/uBlock/issues/1743

ghost commented 5 years ago

@indolering & any reading this ~

Hello, as I'm sure you're aware, an Internet Content Provider license regime does one thing only: whatever the heck they want. Back to that, though. ICPs are mainland China based providers, which is an important point for various reasons, two of which matter here. First, in order to obtain such a license, a company must be based in mainland China - not Hong Kong, due to lax regulations for foreign companies. Second, geographical proximity helps ensure easier connections - whether from the mainland, Hong Kong, Korea, Singapore, etc.

Now, one may ask: if a provider isn't licensed to operate within mainland China, then how, exactly, do they even exist to provide content? Many providers based in the U.S., the E.U., etc. still operate (provide services) within mainland China via numerous loopholes in the licensing structure; 'partnering' with Chinese firms is the one generally used. Furthermore, being deemed an ICP by the relevant provincial authorities doesn't guarantee 'license' to operate (provide services) in mainland China, no matter the company. Even jsDelivr's actual CDN, Quantil, isn't based in mainland China - instead they run cable from HK to the four ISPs on the mainland. Why? HK is an autonomous zone. Places like this exist the world over, with greater or lesser legal autonomy.

So, while an ICP license technically provides legal cover to operate in mainland China, it never guarantees any sort of right to do so, whether the company is based in or out of mainland China. To anyone who's interested in that, the reasons why, etc., I ask that you research it, as this post is already far too oversimplified. The issue of censorship is one I'll let be, since no matter where any of us live in the world, and no matter what the government under which we live labels itself, none of us receives unfiltered information.

indolering commented 5 years ago

@DNSCrypt-Lists I'm actually fairly ill-informed on the matter: I remember that jsDelivr's early killer feature was working well in China compared to cdnjs (which is operated by Cloudflare). The current marketing speak is a bit hazy and it looks like Cloudflare partnered with Baidu or something ... hence why I posted the raw marketing claim and put it in quotes 😅.

heydojo commented 4 years ago

Did somebody say ICP?

Anyway, I think just fixing the viewUrl, viewUrlMirror1 and viewUrlMirror2 links in https://github.com/collinbarrett/FilterLists/blob/master/data/FilterList.json and adding viewUrlMirror3 for some entries (so that more CDN mirrors can be added) is the way to go.

Importantly, I would like to add that all raw.githubusercontent.com links need to go. GitHub isn't a CDN. There are CDNs which are faster and more suitable for the purpose of delivering filter lists. They absolutely should be used instead.

Choosing a primary CDN to replace the raw.githubusercontent.com links probably needs to happen. I recommend jsDelivr for that.

> List of Git mirrors to support:
>
> * [ ] [GitCDN](https://gitcdn.xyz/)
> * [ ] [GitHack](https://raw.githack.com/)
> * [ ] [Statically](https://www.staticaly.com/)
> * [ ] [jsDelivr](https://www.jsdelivr.com/)

Not sure about Statically, but the rest appear good from my limited interactions thus far.

From what I can tell, an alternative to my suggestion is to add the GitHub user name, repo name, and file path to FilterList.json and then generate the CDN links automatically.
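A rough Python sketch of that alternative is below. The `GitHubFile` type and the `mirror_urls` function are hypothetical names, and the `branch` field is an extra assumption (the CDN URLs need a ref). The GitCDN pattern comes from earlier in this thread; the GitHack, Statically, and jsDelivr patterns are what those services document as far as I know and should be verified before use.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class GitHubFile:
    """Hypothetical structured fields that could replace a hard-coded viewUrl."""
    user: str
    repo: str
    branch: str  # assumption: a branch or tag must be stored alongside the path
    path: str


def mirror_urls(f: GitHubFile) -> Dict[str, str]:
    """Derive the raw URL and candidate CDN mirror URLs from one record."""
    base = f"{f.user}/{f.repo}"
    return {
        "raw": f"https://raw.githubusercontent.com/{base}/{f.branch}/{f.path}",
        "GitCDN": f"https://gitcdn.xyz/repo/{base}/{f.branch}/{f.path}",
        "GitHack": f"https://raw.githack.com/{base}/{f.branch}/{f.path}",
        "Statically": f"https://cdn.statically.io/gh/{base}/{f.branch}/{f.path}",
        "jsDelivr": f"https://cdn.jsdelivr.net/gh/{base}@{f.branch}/{f.path}",
    }


example = GitHubFile("DandelionSprout", "adfilt", "master", "NorwegianList.txt")
for name, url in mirror_urls(example).items():
    print(f"{name}: {url}")
```

Storing the parts once and deriving every link means a new mirror only needs one more line in `mirror_urls`, not another viewUrlMirrorN field on hundreds of entries.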

Voltairine-de-Cleyre commented 4 years ago

> I think we can implement a data-driven way to surface all GitHub mirrors. List of mirrors to support below, let me know if there are more we should add. I'll begin looking into this.
>
> List of Git mirrors to support:

https://statically.io/ is the service, and they use Fastly. Links are easy there too, and it's the service (for now) that I'm using for my 3rd-party Urlhaus Filters Mirror links. They are mostly listed at https://iosprivacy.com/mirror (I don't include Bind and still need to add a couple of online-only lists, but you'll get the idea). Also, Curben uses my direct mirror links via GitLab on the main repo because the CDNs lag too far behind. The ones that reflect changes instantly don't give the cache benefits for software such as uBlock Origin, and generally lead directly back to the initial list regardless. Curben's main repo is here: https://gitlab.com/curben/urlhaus-filter, and on it are his links to my GitLab mirror. So I just wanted to make that small correction for the Statically URL and show some examples of it and other services (as seen on either Curben's or my repo).