paulgb / BarbBlock

Chrome extension which blocks requests to sites which have used legal threats to remove themselves from other blacklists.
https://ssl.bblck.me
MIT License
639 stars 22 forks

Add other domains owned by admiral? #4

Open KeenRivals opened 7 years ago

KeenRivals commented 7 years ago

Many other domains were found that are owned by Admiral and point to the same IP as #1. There's a list at https://pgl.yoyo.org/adservers/admiral-domains.txt

anon182739 commented 7 years ago

I got it within a few dozen refreshes of the script at the time, probably within a 15-minute interval or so. Certainly nowhere near 100.

Wild guess: they were banning VPN exit nodes left and right and decided to stop it

If you outright blocked the Admiral servers, some Admiral-protected sites didn't serve any content; others still did. It depended on the bootstrapper that was used.

How many different bootstrapper scripts are there? What did you find out about them? What's the point in blocking the servers if that just kills the site too?

tofof commented 7 years ago

How many different bootstrapper scripts are there?

I have no idea. I know Admiral offers several different tiers/configurations of services, perhaps they're related to those, but that's just speculation.

What did you find out about them?

Nothing helpful that I can recall; the few notes I took during the poking I was doing mostly have to do with killing the inline script after it was loaded, rather than the bootstrapper.

What's the point in blocking the servers if that just kills the site too?

That's what caused me to quit working on this at the time.

anon182739 commented 7 years ago

How do they inject the ads again? Could you just block the re-injected images or similar?

tofof commented 7 years ago

I don't think Admiral directly injects ads except when it's configured to serve 'less-intrusive' ones to replace the ads that it's protecting. All of my efforts were toward the script that takes over and blanks the page when it detects an adblocker.

I'd like to show you an example, but right now I'm actually having trouble getting Admiral to function, i.e. to actually annoy me with a popup and blank the screen. I have a feeling there's a hack somewhere I'm forgetting to disable.

anon182739 commented 7 years ago

The scripts have some kind of optimization layer, do you know what it is? Pagespeed supposedly only removes whitespace and the usual, but it's a pagespeed URL.

https://js.intercomcdn.com/frame.3db8f7f5.js
!function(e){function t(r){if(n[r])return n[r].exports;var o=n[r]={exports:{},id:r,loaded:!1};return e[r].call(o.exports,o,o.exports,t),o.loaded=!0,o.exports}var n={};return t.m=e,t.c=n,t.p="https://js.intercomcdn.com/",t(0)}(function(e){

https://js.intercomcdn.com/shim.cada1405.js
!function(n){function t(o){if(e[o])return e[o].exports;var r=e[o]={exports:{},id:o,loaded:!1};return n[o].call(r.exports,r,r.exports,t),r.loaded=!0,r.exports}var e={};return t.m=n,t.c=e,t.p="https://js.intercomcdn.com/",t(0)}({0:function(n,t,e){

They begin in the same way.

tofof commented 7 years ago

No, I don't know what it is, but working on reversing minified code is honestly way out of my specialty.

anon182739 commented 7 years ago

https://wordpress.org/plugins/admiral-adblock-suite/
public static function doesURIContainRandomFilename($uri, $seeds)
Does what it says on the tin if you're interested.

anon182739 commented 7 years ago

It also has other interesting stuff and gives away some information about how it's structured internally. owlsr.us is used as a gateway; it might have been specially registered for that purpose.

Wordpress Plugin
Easy install. No JS code required. Proxy requests to minimize risk of adblocker intervention.

Custom Integration
Proxy Requests through your own domain to minimize risk of adblocker intervention.

Also, there are lots of API endpoints we can use to identify domains.
http://staging.owlsr.us/js?p=asd
http://owlsr.us/record

tofof commented 7 years ago

@anon182739: thanks again for your script, it's very helpful. I've been playing with it for a bit now, and I notice a couple things.

Obvious statement: unfortunately, your script is limited by the threatcrowd data. Less-obvious: there are domains that are Admiral that threatcrowd completely lacks.

For example: https://otx.alienvault.com/indicator/hostname/2znp09oa.com shows profitrumour.com among the related domains, which is an Admiral domain.

~But unfortunately, threatcrowd's picture of that is a big zilch.~ Edit: I (foolishly) retyped the url and had missed the u, but it's still a big zilch: profitrumor

anon182739 commented 7 years ago

Yes, it's a shame. What services are there to get all domains with a given nameserver? Because they all share the same 4 nameservers.

anon182739 commented 7 years ago

https://otx.alienvault.com/otxapi/indicator/hostname/whois/2znp09oa.com Here, just parse this JSON and use as starting domains for the script, it doesn't matter that they're not in threatcrowd. (1 domain per line, no http before)

tofof commented 7 years ago

Yeah, just was looking at that myself, looks like it's possible to at least walk that api's space as well. I'll see about expanding it if you don't beat me to it; I'm not going to get to it before tomorrow evening at the earliest.

anon182739 commented 7 years ago

@tofof No need for walking, since you can do queries on DNS name servers. grep -P '"domain": "[^"]+' --only-matching | cut -c 12-
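The grep/cut pipeline above pulls every `"domain": "..."` value out of the JSON text. A rough Python equivalent is sketched below. It assumes only that the response is valid JSON with string values stored under keys named `domain` somewhere in it; the exact layout of the OTX response is not shown in this thread, so the sample input and the `extract_domains` helper are made up for illustration.

```python
import json


def extract_domains(json_text):
    """Collect every string stored under a "domain" key, at any nesting depth."""
    found = []

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "domain" and isinstance(value, str):
                    found.append(value)
                else:
                    walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(json.loads(json_text))
    # Deduplicate while keeping first-seen order.
    return list(dict.fromkeys(found))


# Hypothetical sample; not real API output.
sample = (
    '{"related": [{"domain": "example-a.com"}, '
    '{"domain": "example-b.com"}, {"domain": "example-a.com"}]}'
)
print("\n".join(extract_domains(sample)))
```

The output is one domain per line with no scheme prefix, which is the input format asked for earlier in the thread.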

anon182739 commented 7 years ago

https://pastebin.com/CYvL1GyJ 20 more

anon182739 commented 7 years ago

I strongly believe these ones are the only ones that are active in the wild: https://pastebin.com/Bu2gFH9J I ran a script that created 600 accounts and got the script URLs; these are the only ones in the list. The least common one (jadeitite.com) is present 8 times, the most common one (82o9v830.com) is present 20 times.
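The per-domain counts mentioned above can be produced by tallying hostnames across the collected script URLs. A minimal sketch, assuming the URLs have already been gathered into a list; the `tally_script_hosts` helper and the URL paths are invented for illustration, and only the two hostnames come from this thread.

```python
from collections import Counter
from urllib.parse import urlparse


def tally_script_hosts(script_urls):
    """Count how often each hostname appears among the collected script URLs."""
    hosts = (urlparse(url).hostname for url in script_urls)
    # Skip entries that have no parseable hostname.
    return Counter(host for host in hosts if host)


# Hypothetical URLs; the paths are made up.
urls = [
    "https://82o9v830.com/a.js",
    "https://82o9v830.com/b.js",
    "https://jadeitite.com/c.js",
]
counts = tally_script_hosts(urls)
for host, n in counts.most_common():
    print(host, n)
```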

paulgb commented 7 years ago

One thing on the topic of unminifying code, etc.: In order for this project to be bulletproof should it end up in court, I can't include any links obtained that way. Since a judge is unlikely to be technically adept enough to understand nuance in this area, I'd rather keep a good distance from anything that could be made to sound like reverse engineering to someone non-technical.

The network-based approaches are defensible though, so as long as we stick to that route we'll be fine.

anon182739 commented 7 years ago

Clean room reverse engineering is legal. So by analogy, you should be able to share domains you got from observing the script's behavior, but not by reverse engineering them.

If you're worried about the legal aspects, keep in mind that the CFAA is very broad. Bulk registering accounts could be illegal in theory, depending on the definition of "authorization".
(a) Whoever—
(2) intentionally accesses a computer without authorization or exceeds authorized access, and thereby obtains—
(C) information from any protected computer;

paulgb commented 7 years ago

It's not just the CFAA that I want to steer clear from, it's also DMCA 1201 which was the basis of Admiral's takedown against EasyList.

In general I guess my stance is: given that there are many ways of obtaining the list, we should do it in the way that is least likely to be misunderstood by a judge.

anon182739 commented 7 years ago

Are you worried about takedowns or legal responsibility?
If the former, use a git provider based outside of the US (bitbucket is australian, launchpad is british, osdn is japanese, ow2 consortium is french, self-hosted gitlab onion is in the jurisdiction of anonymous proxy).
If the latter, use a throwaway account and Tor.

paulgb commented 7 years ago

But that misses the goal of the project entirely! I'm not trying to evade any laws, quite the opposite. I'm trying to show that Admiral's interpretation of the law is incorrect, that it wouldn't hold up in court, and that they know that.

anon182739 commented 7 years ago

If you're not violating any laws, avoiding takedowns is just a convenience thing. DMCA takedown notices are for copyright infringement; circumvention of technical protections as defined in 17 USC § 1201 isn't copyright infringement.

tofof commented 7 years ago

I agree with @paulgb that Admiral's interpretation of the DMCA - that an item in a list constitutes a 'copyright circumvention mechanism' - is incorrect, and should be challenged. I further agree that the DMCA takedown process is not an appropriate remedy should a circumvention mechanism actually exist - it's instead for direct infringement. I understood that to be the primary reason for the creation of this project: to invite such a takedown, challenge it, and disprove this legal theory.

Expanding the list to include related Admiral domains that appear to function identically to the DMCA-takedown'd one seems in line with that goal. Particularly for a list created by crawling publicly accessible networks and observing the content that Admiral willingly serves up (its landing page image).

@anon182739 seems focused on the feasibility of maintaining such a list in a hostile environment; throwaways, tor, non-US providers all work toward that goal. That's a potentially valuable process too, but quite in contradiction with the stated goals of this project.

paulgb commented 7 years ago

I agree with that interpretation @anon182739 but Admiral's stance is that it does and that's how we got here.

@tofof exactly

tofof commented 7 years ago

On the topic of reversing minified code - as far as I understand, none of that has been done in generating any of the list to this point. The list and addendum that @anon182739 has linked to is built from walking the recorded observations on threatcrowd - starting from a known Admiral domain, examining what IPs it was hosted on and what other domains were hosted on those IPs, and recursing, then examining for each domain its current public home page to see if it identifies itself as part of the Admiral protection scheme.
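The walk described above can be sketched as a breadth-first search. This is only an illustration of the idea, not the script that was actually used: the three lookup callables (`ips_for_domain`, `domains_for_ip`, `is_admiral`) are hypothetical stand-ins for whatever passive-DNS source and home-page check are available, and the data below is invented. One deliberate difference from the description is that this sketch checks each domain before expanding it, so that a shared-hosting IP does not pull in chains of unrelated sites.

```python
from collections import deque


def crawl_related(seed, ips_for_domain, domains_for_ip, is_admiral):
    """Breadth-first walk: domain -> IPs -> co-hosted domains, keeping confirmed ones."""
    seen_domains = {seed}
    seen_ips = set()
    confirmed = []
    queue = deque([seed])
    while queue:
        domain = queue.popleft()
        if not is_admiral(domain):
            continue  # do not expand through domains that fail the check
        confirmed.append(domain)
        for ip in ips_for_domain(domain):
            if ip in seen_ips:
                continue
            seen_ips.add(ip)
            for other in domains_for_ip(ip):
                if other not in seen_domains:
                    seen_domains.add(other)
                    queue.append(other)
    return confirmed


# Invented data using reserved example names and documentation IP ranges.
dns = {
    "a.example": ["192.0.2.1"],
    "b.example": ["192.0.2.1", "192.0.2.2"],
    "c.example": ["192.0.2.2"],
    "shop.example": ["192.0.2.2"],
}
hosted = {
    "192.0.2.1": ["a.example", "b.example"],
    "192.0.2.2": ["b.example", "c.example", "shop.example"],
}
self_identified = {"a.example", "b.example", "c.example"}

result = crawl_related(
    "a.example",
    lambda d: dns.get(d, []),
    lambda ip: hosted.get(ip, []),
    lambda d: d in self_identified,
)
print(result)
```

Passing the lookups in as functions keeps the traversal independent of any particular data source, so the same loop works whether the observations come from threatcrowd, OTX, or a local file.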

Note that the content that Admiral domains serve (attached below) explicitly covers such a use: "if you arrived here on accident and are not looking for information about this domain, feel free to hit back in your browser or close the tab." In other words, when we purposefully visit the domain and are exactly looking for information about it, that image is meant to be our answer. It spells out what type of content ("Javascript, HTML, CSS, video and images") is served and for what nominal purpose ("to control access to copyrighted content ... and understand how visitors are accessing their copyrighted content") it does so. Whether this is an honest summary of Admiral's use of these domains is another matter.

My mentions of reversing minified code were with respect to observing the behavior of the Admiral scripts themselves, and in historical context where I was examining those scripts in support of an entirely different project (Reek's Anti-Adblock) - which I linked to in my first post here. Also note that minification is not for the purposes of obscuring or defeating an observer; if it was, you wouldn't leave function names like "hasDisabledAdBlocker" intact. Minification is simply to reduce the size of the file that needs to be transmitted, so that content loads faster.

Public-facing content Admiral willingly serves to all visitors of its domains: [image attachment]

tofof commented 7 years ago

[moved to its own post for clarity and importance]

At any rate, neither knowledge nor use of Admiral's scripts was involved in creating the list and addendum thus far. In fact, I would say that the algorithm that produced the list and addendum thus far is superior in that respect to the unknown provenance of the items from pull request #8. I would actually recommend that @paulgb replace that list with the ones derived here, and will shortly create a pull to do so that he can merge if he agrees.

paulgb commented 7 years ago

Yes, I would accept that PR. I suspect that most of the entries are duplicates anyway and it would be good to have the record of provenance.

(Replying by email to tofof's comment of Aug 15, 2017, 3:13 PM, above.)

anon182739 commented 7 years ago

@tofof No, the addendum (https://pastebin.com/CYvL1GyJ and https://pastebin.com/Bu2gFH9J) isn't from threatcrowd, it's from using a script to register accounts and get the script domain they use.

I ran a script that created 600 accounts and got the script URLs; these are the only ones in the list. The least common one (jadeitite.com) is present 8 times, the most common one (82o9v830.com) is present 20 times.

anon182739 commented 7 years ago

@anon182739 seems focused on the feasibility of maintaining such a list in a hostile environment; throwaways, tor, non-US providers all work toward that goal. That's a potentially valuable process too, but quite in contradiction with the stated goals of this project.

That's true. I was linked here from elsewhere and didn't read the project goals. I'm mostly interested in the technical side of things and what actions give the end result of a more complete list.

anon182739 commented 7 years ago

Is there any other place where admiral is being discussed? This seems to be the only active github issue about it, are there any active threads elsewhere?
https://github.com/anon182739/admiraljs - 206 admiral JS files from different domains, might be useful. They're all identical save for the domain name referenced, so a regex should block them.
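One way to check the "identical save for the domain name" observation before relying on a single pattern is to replace each file's own domain with a placeholder and compare hashes. A minimal sketch, assuming each script is stored alongside the domain it was fetched from; the helper names and the script bodies below are made up and are not taken from the linked repository.

```python
import hashlib


def fingerprint(js_source, served_from):
    """Hash a script after replacing its own domain with a fixed placeholder."""
    normalized = js_source.replace(served_from, "<DOMAIN>")
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def all_identical(files):
    """files maps domain -> script text; True if one fingerprint covers them all."""
    return len({fingerprint(text, domain) for domain, text in files.items()}) <= 1


# Made-up script bodies for illustration only.
files = {
    "a.example": 'load("https://a.example/record");',
    "b.example": 'load("https://b.example/record");',
}
print(all_identical(files))
```

If the fingerprints do differ, grouping files by fingerprint would show how many distinct variants exist.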

unicorntaco commented 6 years ago

It's trivial to observe hundreds of Admiral domains, they probably number in the thousands.

If someone actually wants to make a serious attempt,

Could not this higher level principle be applied:

This exceedingly eccentric blog suggests corralling the evil via ASN blocking. Guilt by association seems like a grand idea, but implementing it is not in my wheelhouse.

Nano Adblocker, a fork of uBlock Origin, seems to have some more userscripty type powers.

@tofof @paulgb @anon182739

TNW9imKLC3fv commented 5 years ago

This was released on 2018-11-01 and updated the day after, but hasn't been updated since then: https://github.com/jkrejcha/AdmiraList. This is currently being updated: https://github.com/jerryn70/GoodbyeAds/issues/6. Though again, it's another solution that relies on a manually updated list of domains, which the spam companies seem to be deliberately bypassing by registering a new domain every day. The ASN blocking might be worth looking into, or something that automatically queries the "private" WHOIS databases along the lines of Sci-Hub.

(Using NoScript, I saw Issu using a creepy-sounding domain "shallowsmile.com" and found this thread by doing a DuckDuckGo search for said domain with the quotation marks to force searches to find that exact phrase)

anon182739 commented 5 years ago

There is no need to block any ASNs. It is sufficient to register a few hundred accounts with fake information and see which domains are being offered.