uBlockOrigin / uAssets

Resources for uBlock Origin, uMatrix: static filter lists, ready-to-use rulesets, etc.
GNU General Public License v3.0

Net Scan: 4TB All, 40TB BAB&app_vars #4268

Closed jspenguin2017 closed 5 years ago

jspenguin2017 commented 5 years ago

20 packages scanned, 1 package processed, about 50,000+ packages expected per month. About 80 GB data scanned out of about 200+ TB expected per month.

Hard anti-adblock:

```
# BAB
http://f4.motogon.ru/motokross/mxgp-2018/msg717172/
https://dc-chronicle.com/2018/08/01/robert-mueller-refers-democrat-tony-podesta-criminal-charges/
https://www.sofiotheque.info/2018/02/telecharger-50-dossiers-de-maladies.html
# Other
http://www.kizi.cm/ # Comes up every time
```

Annoyance:

```
# anOptions
http://allamericansthings.com/2018/11/prototyping-the-betentacled-inflatable-soft-robots-of-zero-gee/
http://thejewishvoice.com/2016/03/16/intel-acquires-replay-technologies-to-up-its-sports-game/
https://usa.watchpro.com/millennials-reason-behind-luxury-spending-boost-china/
http://www.ebizlatam.com/tag/gerardo-coronel/
http://www.fagagnaonline.com/
https://2spendless.com/?tag=forever
https://centrafriqueactu.com/2018/11/01/congo-la-gouvernance-forestiere-au-coeur-dun-forum-regional-a-brazzaville/
https://www.betglob.pl/tag/33
```
jspenguin2017 commented 5 years ago

This scanner is just an initial prototype that I hacked together within 30 minutes; I'll write proper automation scripts later.

okiehsch commented 5 years ago

I can't consistently reproduce the BAB entries, never had that issue.

okiehsch commented 5 years ago

ebizlatam.com is fixed in the regional list.

okiehsch commented 5 years ago

@jspenguin2017 let's use this issue as reference for your future scan results.

jspenguin2017 commented 5 years ago

OK, I'll close this for now, I'll reopen when new data become available.

The scanner currently flags BAB, anOptions, and app_vars; I definitely want to add more. Once I start spinning up clusters to boost the scanning speed, it'll emit thousands and thousands of lines of results. I'm thinking about making the scanner emit filter rules directly; honestly, I don't want to manually verify every result.
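
For readers wondering what fingerprint-based flagging might look like, here is a minimal sketch. It is not the actual scanner: the regular expressions below are illustrative guesses at what a BAB, anOptions, or app_vars page could contain, and `scan_page` and `report_lines` are hypothetical names.

```python
import re

# Hypothetical fingerprints: illustrative guesses, not the scanner's real rules.
FINGERPRINTS = {
    "BlockAdBlock": re.compile(r"blockadblock", re.IGNORECASE),
    "anOptions": re.compile(r"\banOptions\b"),
    "app_vars": re.compile(r"\bapp_vars\b"),
}


def scan_page(html):
    """Return the sorted labels of every fingerprint found in the page source."""
    return sorted(
        label for label, pattern in FINGERPRINTS.items() if pattern.search(html)
    )


def report_lines(url, html):
    """Format one 'Possible <label>: <url>' line per matched fingerprint."""
    return [f"Possible {label}: {url}" for label in scan_page(html)]
```

A page that matches nothing produces no lines, so only flagged URLs would end up in the report. Every hit still needs a manual check, because a substring match cannot tell whether the script actually runs.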

jspenguin2017 commented 5 years ago

1~5

Possible BlockAdBlock: https://audienciapixelada.com/videos/270
Possible BlockAdBlock: https://bestmobs.co/g-tide-e90/
Possible BlockAdBlock: http://bihar.result91.com/akupatna/mba-2nd-sem-exam-result-2018/88158
Possible BlockAdBlock: http://www.breakingworldnewstoday.com/2013/05/1568.html
Possible BlockAdBlock: https://business.dailynews.us.com/2018/05/we-still-don-understand-social-security.html

Please collapse (hide as outdated) comments that are handled.

jspenguin2017 commented 5 years ago

6~10

Possible BlockAdBlock: https://www.ccvgaming.com/forum/awards.php?s=a92de730df0b9b0dfae5bf3a3a8ea49a
Possible BlockAdBlock: https://www.centergeek.it/miui-si-aggiorna-alla-versione-2-3-16/
Possible BlockAdBlock: http://choone.com/?view=ads&catid=7&subcatid=113&cityid=372&lang=fa
Possible BlockAdBlock: http://freeclassifiedads.qtellads.com/index.php?view=post&postevent=&cityid=160&lang=en&catid=&subcatid=
Possible BlockAdBlock: https://infodifesa.it/corte-dei-conti-illegittima-ritenuta-sulla-pensione-a-generale-della-guardia-di-finanza/
jspenguin2017 commented 5 years ago

11~15

Possible BlockAdBlock: https://www.isolaillyon.it/2014/06/17/le-bizzarre-avventure-di-jojo-stardust-crusaders-parte-iii.html
Possible BlockAdBlock: https://library.avsim.net/esearch.php?DLID=160208&Name=&FileName=&Author=&CatID=
Possible BlockAdBlock: https://www.magazine-advertisements.com/braun-tassimo-coffee-maker.html
Possible BlockAdBlock: https://www.marmitedumonde.com/tag/set-de-couteaux-a-pizza
Possible BlockAdBlock: http://multicoinfaucet.cf/doge/?cc=reCaptcha
jspenguin2017 commented 5 years ago

16~20

Possible BlockAdBlock: http://netherlands.marcyads.com/0/posts/194-Sports-and-Fitness-/4234--Fitness-for-Sale/3436-buy-phentermine.html
Possible BlockAdBlock: https://passivenation.com/Thread-07-10-2018-socks-5-4--4394
Possible BlockAdBlock: https://pornslay.com/blowjobs-fotos/
Possible BlockAdBlock: http://www.queen68.gr/2016/12/blog-post_84.html
Possible BlockAdBlock: http://www.savemydinar.com/2016/11/cbk-kuwait-enjoy-10-discount-on-crumbs.html
jspenguin2017 commented 5 years ago

21~26 (last BAB batch from this scan)

Possible BlockAdBlock: https://sindibad.tn/petites-annonces/worgl/2760854
Possible BlockAdBlock: https://softwaregiveaways.net/2017/07/30/zortam-mp3-media-studio-pro-22-45/
Possible BlockAdBlock: https://www.updatesmarugujarat.in/2017/10/pgvcl-list-of-candidates-called-for.html
Possible BlockAdBlock: https://uvtattooideas.club/om-symbol-neck-tattoo/
Possible BlockAdBlock: http://webcamendirect.net/webcam/808-southaven-stateline-west-of-market-place.html

Possible BlockAdBlock: https://www.yellowpagesgoesgreen.org/Bethlehem-NH/Seafood
jspenguin2017 commented 5 years ago

These are from 20 packages, and there are over 50,000 left. At this pace, there'll be over 2,500 times more... I need to think of a better strategy.
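
The extrapolation above can be written out as a small calculation. This is only a sketch: it assumes the hit rate stays constant across packages, which may well not hold, and it takes 26 as the hit count because the batches above are numbered 1~26. The function name `extrapolate` is hypothetical.

```python
def extrapolate(hits_so_far, packages_scanned, packages_expected):
    """Linearly scale the hits seen so far to the expected number of packages."""
    return hits_so_far * packages_expected / packages_scanned


scale = 50_000 / 20                           # how many times more data is expected
projected_hits = extrapolate(26, 20, 50_000)  # assumes a constant hit rate
print(scale, projected_hits)
```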

okiehsch commented 5 years ago

Well, at least one site includes multiple anti-adblock scripts. I will fix such sites separately and reference this issue.

http://multicoinfaucet.cf/doge/?cc=reCaptcha

jspenguin2017 commented 5 years ago
https://www.dearjulius.com/2018/11/3-nutrition-lessons-to-steal-from.html
https://mobil.igpure.com/tools
https://www.marketmovers.it/2009/10/provvedimento-finarte-casa-daste-spa.html
http://www.mrhowtosay.com/view/pol/rus/2008942

Popup?

http://paytoshi.in/?cc=reCaptcha
jspenguin2017 commented 5 years ago

I scanned 30 more packages but only found 4 BAB. Looks like my initial estimate was way off.

Also, I don't know whether the domains will repeat in later packages. I need to update my scanner to keep track of it. I also want my scanner to be able to handle AdFly, IL, and more.

On top of this, I need to find a way to drain the backlog, so I'm shutting down my server for now.
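
Keeping track of domains that have already been reported could be sketched roughly like this. It is an assumption about how such tracking might work, not the scanner's actual code; `new_hosts` is a hypothetical helper that compares full hostnames only.

```python
from urllib.parse import urlsplit


def new_hosts(urls, seen):
    """Return the URLs whose hostname is not yet in `seen`, recording each new hostname."""
    fresh = []
    for url in urls:
        host = urlsplit(url).hostname
        if host is None or host in seen:
            continue  # unparsable URL, or hostname already reported
        seen.add(host)
        fresh.append(url)
    return fresh
```

Sharing one `seen` set across all packages means a hostname such as usa.watchpro.com is reported once, no matter how many of its pages get flagged later.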

okiehsch commented 5 years ago

http://paytoshi.in/?cc=reCaptcha I can reproduce a popup, no anti-adblock message on my end.

jspenguin2017 commented 5 years ago

Yes, only popup. Closing for now, I need some time to think about the next step.

smed79 commented 5 years ago

> no anti-adblock message on my end.

Go to http://paytoshi.in/?cc=reCaptcha Click "You are Human". The page will redirect you to http://payco.xyz/FButmCYnOojY ==> anti adblock

okiehsch commented 5 years ago

I worked through the first 100 lines of net-scan-backlog/anOptions0.txt and net-scan-backlog/anOptions1.txt.

2spendless.com
allamericansthings.com
bookwonderland.com
dailynexus.com
centrafriqueactu.com
ebizlatam.com
en-contact.com
homasg.com
kizi.cm
lentes-et-poux.fr
moonbunnycafe.com
salient.org.nz
socialbookmarkings.us
softmaker.kz
thejewishvoice.com
usa.watchpro.com
vodoleyworld.ru

are either already fixed, parked domains, or sites where I can't reproduce the anti-adblock message or can't connect. currentaffairs.gktoday.in uses admiral.

I will add the rest.

jspenguin2017 commented 5 years ago

18 out of 200, that's not an acceptable ratio. I'll definitely need to find a way to automate it, at least partially.

jspenguin2017 commented 5 years ago

Annoyance

https://www.airsoft-milsim-news.com/sightmark-mini-shot-m-spec-fms-red-dot-sight/
https://anglais-pdf.com/exercice-grammaire-anglais-muchmany/
http://blog.j172.tw/%E7%B4%B3%E5%A3%AB%E7%9A%84%E5%BF%83%CB%99%E7%97%9E%E5%AD%90%E7%9A%84%E6%83%85/%E7%95%B6%E9%A1%98%E6%9C%9B%E5%AF%A6%E7%8F%BE%E4%BA%86/
https://chelorg.com/2018/11/05/tusk-has-arrived-to-warsaw-for-questioning-about-the-financial-pyramid/
jspenguin2017 commented 5 years ago

Annoyance

http://www.cogitoergo.it/junk-silver-quantita-totali-coniate-negli-usa-dalla-fine-del-%E2%80%99700-ad-oggi/
https://www.der-bank-blog.de/ratgeber/sparbrief-zinsen-erklaert-wer-kann-davon-profitieren/2442/?utm_source=bankblog
https://www.designtagebuch.de/t-online-portal-erneuert/
http://dialogos.ba/2018/05/25/ibn-taymiyyina-promisljanja-o-zekatu-placanju-u-naturi-ili-novcu-i-sl/

Hard Anti-adblock

https://currentaffairs.gktoday.in/union-govt-approves-a-project-national-optical-fibre-network-nofn-project-0420135910.html
jspenguin2017 commented 5 years ago

Annoyance

https://earmilk.com/2017/08/02/river-returns-with-kitsune-on-midnight-x/
http://www.ebizlatam.com/tag/paula-garces/
http://ecijabpeinfo.com/?tag=puente-diciembre
okiehsch commented 5 years ago

I already added currentaffairs.gktoday.in, like I said yesterday:

> currentaffairs.gktoday.in uses admiral

jspenguin2017 commented 5 years ago

Oops, looks like I forgot to update my filters...

okiehsch commented 5 years ago

https://github.com/uBlockOrigin/uAssets/issues/4268#issuecomment-444748012 ebizlatam is fixed in the regional list.

jspenguin2017 commented 5 years ago

OK, added to global whitelist, it won't show up again.
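
A global whitelist check of this kind might look like the sketch below. This is an assumption about the approach, not the real implementation: `is_whitelisted` is a hypothetical helper that treats a whitelisted domain as also covering its subdomains.

```python
from urllib.parse import urlsplit


def is_whitelisted(url, whitelist):
    """True if the URL's hostname equals a whitelisted domain or is a subdomain of one."""
    host = urlsplit(url).hostname or ""
    return any(host == domain or host.endswith("." + domain) for domain in whitelist)
```

With `{"ebizlatam.com"}` as the whitelist, both `http://www.ebizlatam.com/tag/paula-garces/` and `http://www.ebizlatam.com/tag/gerardo-coronel/` would be dropped before reaching the report.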

mapx- commented 5 years ago

removed duplicate https://github.com/uBlockOrigin/uAssets/commit/bb1b3e4d2194addb7ebe89ce50c65229f0917d61

okiehsch commented 5 years ago

I removed https://centrafriqueactu.com/2018/10/31/gouvernance-forestiere-le-premier-forum-du-cv4c-souvre-a-brazzaville from https://github.com/uBlockOrigin/uAssets/issues/4268#issuecomment-445389074, already fixed in annoyances.

jspenguin2017 commented 5 years ago

Eh... My automation scripts still need improvements... I'll have them load the annoyance filters as well.

okiehsch commented 5 years ago

der-bank-blog.de has to go in uBO-filters; the message reappears after every page load.

okiehsch commented 5 years ago

Same with dialogos.ba.

jspenguin2017 commented 5 years ago

Hard (NSFW)

http://adultscribe.com/live-cams/camsoda/the%20horny%20hostel%20-%20bedroom

Hard

https://www.amni8.com/2018/06/samsung-galaxy-j3-2018.html
https://www.baby-magnum.review/kokoro-connect-review-summer-2012-alur-cerita-yang-keren/
https://bestsellerforaday.com/black-wedding-dresses-meaning/
# Comes up every time
https://br.nacaodamusica.com/posts/dashboard-confessional-we-fight/
jspenguin2017 commented 5 years ago

Hard + Popup + NSFW

http://gaypornwave.com/dmitry-osten-andro-maas/

Hard

https://homedesignnearme.com/25148/small-apartment-interior-design-mumbai/
https://jalasenastri.com/tag/ngipri-atau-ngepet/
# Comes up every time, may need 1~2 refreshes
http://www.hagalil.com/2015/03/muenchen-tipps-2/
# Comes up every time
https://komionline.ru/news/21328
jspenguin2017 commented 5 years ago

Hard

http://lemontreetours.com/chaise-haute-bar-pas-cher/
https://luxurydreamhomes.net/tag/acrylic-vs-enameled-steel-bathtubs/
http://majorpress.net/be-well-f-i-t-flow-restorative-yoga/4/
http://meraswasthyameriaawaz.org/how-to-make-a-wonderful-birthday-card/how-to-make-a-wonderful-birthday-card-new-lovely-birthday-card-collage-tellmeladwp/
http://miltonfriedmancores.org/college-resume-essay-examples/college-resume-essay-examples-15-fresh-resume-portfolio-website-template-for-resume-best-college-resume/
jspenguin2017 commented 5 years ago

Hard

https://minutolivre.com/globo-escolhe-miguel-falabella-para-comentar-o-oscar-2017/
https://www.novahomeimprovementsmd.com/tag/antique-white-kitchen-cabinets-for-sale/
# Comes up every time
https://www.onlineclassnotes.com/

Annoyance

http://mywrestling.com.pl/author/wojtekczentorycki/
https://ocen-piwo.pl/Casablanca_Premium_Beer%2C1%2C928
jspenguin2017 commented 5 years ago

Hard

http://www.piazzagallura.org/santa-teresa-gallura-ot-officineperegrine-teatro-presenta-il-progetto-i-racconti-del-cuscino/
https://rumahfit.com/taman-sempit-dengan-sayuran-di-dalam-pot/
http://tattoostyles.us/tag/tattoo-ideas-unusual
http://www.vivianeaudet.com/cr-z-%E4%B8%AD%E5%8F%A4/
# Comes up every time
http://www.orologidapolso.info/contatti/
jspenguin2017 commented 5 years ago

--- anOptions backlog ---

Hard

https://www.elquintobeatle.com/tag/su-nombre-real-es-otro/
https://www.en.magicgameworld.com/quake-champions-how-to-find-lore-scrolls/

Annoyance

https://www.essentialhomme.fr/star-wars-the-last-jedi-nouvelle-bande-annonce-devoilee/
https://www.euribor.com.es/2018/11/05/una-sala-de-28-magistrados-del-supremo-delibera-en-estos-momentos-quien-debe-pagar-el-impuesto-sobre-hipotecas/
https://gptimes.de/fallout-76-geschichten-aus-den-huegeln-von-west-virginia-die-begegnung-in-flatwoods/
jspenguin2017 commented 5 years ago

Annoyance

https://gripped.com/
https://healthyfitcare.com/tag/receding-gums-cure/
https://konserwatyzm.pl/niemieccy-lewacy-wybieraja-sie-do-warszawy-na-marsz-niepodleglosci/
https://www.leblogducinema.com/tag/alice-ferney/

Hard

https://www.ilprocidano.it/prestigiacomo-vivara/immagine-092/
jspenguin2017 commented 5 years ago

Annoyance

http://www.midcenturyhome.com/3-eichler-renovations-that-will-leave-you-speechless/
https://www.mrfreeat33.com/resources/
https://www.myandroidsolutions.com/category/solution/
https://pbn.com/titleist-doesnt-find-lewd-golf-parody-funny/

Hard

# Comes up every time
https://www.pacbiztimes.com/section/economy/east-ventura-county/page/22/
jspenguin2017 commented 5 years ago

Hard

# Comes up every time
https://www.pepakura.eu/tag/terriermon/

Annoyance

https://www.riprovaci.it/migliori-offerte-amazon-19022018/
http://salient.org.nz/2015/07/te-kunenga-mai-i-waiteika/
# Need to refresh once to trigger
https://www.pszone.fr/tag/just-dance-2015
# Need to refresh once to trigger
https://www.professionaljeweller.com/cornwall-jeweller-fleeced-of-stock-worth-40k-by-man-who-jumped-counter/
jspenguin2017 commented 5 years ago

Hard

https://sheetmusic-free.com/jingle-bells-flute-sheet-music-christmas/
https://thejoe.it/ru/2018/05/07/server-dlna-per-linux-quale-scegliere/

Annoyance

http://www.sevillaactualidad.com/tag/evasion/
# Need to refresh once to trigger
http://www.thehoya.com/concert-review-old-crow-medicine-show-anthem/
# Need to refresh once to trigger
https://www.timingpolitico.com.ar/francia-belgica-mundial-rusia-2018/
jspenguin2017 commented 5 years ago

Hard

# Need to refresh once to trigger
https://www.tuost.com/ezine/tag/soluzioni-quotdiane/

Annoyance

http://woodworkingbenchvisemadeinusa.us/furniture-building-plan/
# Need to refresh once to trigger
https://usa.watchpro.com/lifestyle-brand-skagen-introduces-newest-addition-smartwatch-portfolio/
jspenguin2017 commented 5 years ago

That's the last one! All backlog drained!

okiehsch commented 5 years ago

usa.watchpro.com is already part of annoyances. On ocen-piwo.pl, I get the annoying message even if I disable uBO.

jspenguin2017 commented 5 years ago

OK, looks like I somehow missed it. OK, pushed an entry to quick reports tracker for further observation.

jspenguin2017 commented 5 years ago

I now have fingerprinting rules for AdFly, Admiral, anOptions, app_vars, BAB, FAB, and IL. What else should I add?

Also closing for now, until more results become available.

okiehsch commented 5 years ago

AdFly is not really necessary: they all redirect through one domain. That domain changes from time to time; for example, right now it's briskgram.net.

ay.gy/1mG1hI
adf.ly/yiRpG
activeation.com/AZrX
.....

These are the first links I found by googling, so I have no clue where you will end up if you don't abort the redirection. I only tested that they all point to briskgram.net.

jspenguin2017 commented 5 years ago

OK, thanks for the info, I updated my scanner.

Also I forgot to attach IAM role to my server again... I have the attention span of a goldfish...

Is it just me, or are T2 servers a scam? They're worse than T3 in every single way and cost more. Did I miss something? Also, the network for t2.micro is capped at something like 30 MiB/s...

jspenguin2017 commented 5 years ago

Also I'm scanning pre-processed archives, so I don't need to worry about redirections.

okiehsch commented 5 years ago

I mentioned it for the benefit of everybody who reads this thread; I had some complaints from people who followed links and then got malware alert messages. Maybe we should put a disclaimer "follow links at your own risk" in the readme. :wink: