blocklistproject / Lists

Primary Block Lists
The Unlicense

[Add request] 614 + 760 p*rn domains to be added to blocklist #688

Closed ghost closed 1 year ago

ghost commented 2 years ago

URL you wish to be added: see attachment

Why you believe this should be added: To improve the blocklistproject's p*rn list

Add to list: All URLs in the two .txt files from the attachment.

Other info you think we should know: There are two lists: one with www and the other without. Both lists have been sorted alphabetically and duplicates were removed. Trailing slashes and http(s) prefixes were removed too, so the .txt files contain only clean URLs. All URLs were tested for reachability after updating the Pi-hole blocklist and should still be reachable, so please block them as soon as possible. The fail.txt file contains 614 URLs and the fail_www.txt file 760. None of the sites in these two files were gathered from other blocklists. Attachments: fail.txt, fail_www.txt
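
The clean-up described in this request (strip the scheme and trailing slashes, sort, drop duplicates) could be sketched roughly as follows. `normalize_urls` is a hypothetical name, not a script from this thread, and it assumes one URL per line on stdin.

```shell
# Hypothetical helper, not taken from the thread.
# Reads one URL per line on stdin and prints cleaned, sorted, unique entries.
normalize_urls() {
  # 1) drop a leading http:// or https://
  # 2) drop any trailing slashes
  # 3) sort bytewise and remove duplicates
  sed -E -e 's|^https?://||' -e 's|/+$||' | LC_ALL=C sort -u
}
```

Something like `normalize_urls < raw.txt > fail.txt` would then produce the kind of list attached here, where `raw.txt` is an assumed input file name.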

spirillen commented 2 years ago

Just a random pick from the last line... https://mypdns.org/my-privacy-dns/porn-records/-/issues/6242 zoozhamster.com


This is not blame and claim; it is a pointer to why we might all benefit from working together rather than going "my list is bigger than yours", "yes, but mine has fewer FPs". Why not unite as one?

FYI, I'm ripping the list (unlicensed) and having our bot work on it.

Unfortunately I can't tell the bot to credit you for your work, but as it was given to an unlicensed project I'm not obligated to.

But for all those our bot rates as porn, I thank you for publishing and sharing your lists.

ghost commented 2 years ago

@spirillen Normally all sites should have p*rnographic content, more severe than just adult-related things. Once the bot has completed its tests and checks, please tell me how accurate my list was (it should be very accurate). I will wait for your feedback. Thanks for taking action so quickly.

spirillen commented 2 years ago

@spirillen Normally all sites should have p*rnographic content, more severe than just adult-related things. Once the bot has completed its tests and checks, please tell me how accurate my list was (it should be very accurate). I will wait for your feedback. Thanks for taking action so quickly.

Unfortunately I don't have such data, but maybe Dante (https://mypdns.org/dante) can give you some numbers, as he owns the bot.

PS: I haven't cross-checked, but I believe I have marked some of your domains as unconfirmed. That means anything that did not return HTTP code 200, or where the screenshot was of no use.

ghost commented 2 years ago

@spirillen It would be nice to get some feedback. I've checked all sites myself and I can confirm that every site has p*rnographic content or was made to guide people to sites or to 'directions' where such content is shown. This time I have not put any Twitter or Reddit sites in the files, since I know these cannot be used for the blocklist. That's why I hope you will put every single URL into the p*rnblocklist. I'll wait for the feedback and see what I can do better next time.

spirillen commented 2 years ago

@TruthfullEdward You did notice that I do not have any power in this repository, and that I copied your suggested lists into my own project?

In that case I'm getting confused by the comment

That's why I hope you will put every single URL into the p*rnblocklist

I should also inform you that any further discussion regarding my project and my ripping of your lists should be taken to https://mypdns.org/MypDNS/support/-/wikis/. It is not fair in any way to do that on other people's platforms.

ghost commented 2 years ago

@spirillen Sorry for the confusion. I meant of course that I hoped every single URL would be put in the blocklist of the blocklistproject. Whenever I said 'your blocklist' I meant the blocklist of the blocklistproject. If I can contribute to your MypDNS, you can always ask. Happy to help.

spirillen commented 2 years ago

... If I could contribute to your MypDNS, you can always ask. Happy to help.

You can just sign up, and start using the add-on :smiley: But as said, we take that talk on mypdns, as there are other things moving like a moderator tool by Dante @cat or hit me up on https://matrix.to/#/@spirillen:anontier.nl

spirillen commented 2 years ago

Try this one guys...

curl -sSL -q 'https://raw.githubusercontent.com/blocklistproject/Lists/master/porn.txt' -o - | grep -vE '.*\.blogspot\.(ae|al|am|ba|bg|ca|be|ch|cl|co.at|co.il|co.id|co.uk|co.nz|co.ke|co.za|com.ar|com.au|com.br|com.by|com.co|com.cy|com.ee|com.eg|com.es|com.mt|com.ng|com.tr|cz|com.uy|de|fi|dk|fr|gr|hk|hr|hu|ie|in|is|jp|kr|it|li|lt|lu|md|mk|mx|my|nl|no|pe|qa|pt|ro|rs|ru|se|sg|si|sk|sn|tw|ug)$' | wc -l

500227

this is less faulty than your current

curl -sSL -q 'https://raw.githubusercontent.com/blocklistproject/Lists/master/porn.txt' -o - | wc -l

1908849

as it removes all the fucked dead google domain that redirects to the big fuck your privacy.com
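
The same idea can be sketched as a reusable filter. This is only an illustration under assumptions: `drop_blogspot_cctld` is a hypothetical name, and the pattern is a looser shortcut (any two-letter suffix, optionally preceded by `co.` or `com.`) rather than the explicit list used in the command above, so it may match more than that list does.

```shell
# Hypothetical helper, not taken from the thread.
# Reads a list on stdin and drops entries whose host ends in a blogspot
# country domain such as .blogspot.de, .blogspot.co.id or .blogspot.com.ar.
# Assumes every line ends with the host name (plain or hosts-file format).
# Entries on .blogspot.com itself are kept.
drop_blogspot_cctld() {
  grep -vE '\.blogspot\.(com?\.)?[a-z]{2}$'
}
```

Piping a downloaded copy of the list through `drop_blogspot_cctld | wc -l` would give a count comparable to the one shown above, though not necessarily identical because the pattern differs.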

ghost commented 2 years ago

@spirillen What do you mean f*** privacy? How come these top-level-domains are so different from the simple .com ? And why should this be changed in the blocklist?

spirillen commented 2 years ago

@spirillen What do you mean f*** privacy?

Blogspot.$tld is google right?

How come these top-level-domains are so different from the simple .com ? And why should this be changed in the blocklist?

I'm giving you a bit of homework here, that might be the best way to make you understand this.

You know what an HTTP response code is? 2xx and 3xx

The exercise is now to find out what happens when you try to access a blogspot.$tld that does not end in com, e.g. hotbutts.blogspot.co.id. You can do this with curl.

spirillen commented 2 years ago
curl -I hotbutts.blogspot.co.id -x socks5h://127.0.0.1:9050
HTTP/1.1 302 Moved Temporarily
Location: http://hotbutts.blogspot.com/

301 Moved Permanently This and all future requests should be directed to the given

302 Found (Previously "Moved temporarily") Tells the client to look at (browse to) another URL

As you can see, we get a new location ending in .com. This means every single non-.com blogspot domain is 100% toxic waste.
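
A rough sketch of how that check could be scripted: the function below reads response headers, such as the `curl -I` output above, and prints the redirect target when the status is one of the redirect codes. `redirect_target` is a hypothetical name and the parsing is deliberately simple (first status line only, first `Location` header only).

```shell
# Hypothetical helper, not taken from the thread.
# Reads HTTP response headers on stdin and prints the Location target
# if the status code on the first line is 301, 302, 307 or 308.
# Prints nothing for any other status.
redirect_target() {
  tr -d '\r' | awk '
    NR == 1 { redirect = ($2 ~ /^(301|302|307|308)$/) }
    redirect && tolower($1) == "location:" { print $2; exit }
  '
}
```

Used as `curl -sI hotbutts.blogspot.co.id | redirect_target`, it should print the `.com` address whenever the server answers with a redirect like the one shown above.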

ghost commented 2 years ago

@spirillen So basically, if we block the .co.id page or put it in our blocklist, we actually block the .com page, but the .co.id still remains. Or am I seeing this wrong? And could those extended top-level domains be used to make pages that slip through filtering? (Not my intention, just a question, because I've encountered several of them when trying to find addresses for add requests.)

spirillen commented 2 years ago

You can safely remove all the non-.com blogspot domains, as they are all redirecting to .blogspot.com

The only reason they do not reply with HTTP 302 is to add load to other search engines, as if they could not figure out how to make a rule to ditch that.

but the .co.id still remains

And? It is redirecting to .com, so why waste the space and break things with dumb toxic records? If a domain redirects with 301, 302, 307 or 308, you won't see anything on that domain, only the target.

I would do sed -i -E '/.*\.blogspot\.(ae|al|am|ba|bg|ca|be|ch|cl|co.at|co.il|co.id|co.uk|co.nz|co.ke|co.za|com.ar|com.au|com.br|com.by|com.co|com.cy|com.ee|com.eg|com.es|com.mt|com.ng|com.tr|cz|com.uy|de|fi|dk|fr|gr|hk|hr|hu|ie|in|is|jp|kr|it|li|lt|lu|md|mk|mx|my|nl|no|pe|qa|pt|ro|rs|ru|se|sg|si|sk|sn|tw|ug)$/d' porn.txt

Done: https://github.com/blocklistproject/Lists/pull/695

ghost commented 2 years ago

@spirillen And how come I can't put any URL's longer than sampleUrl.com in the blocking list? For example: https://www.reddit.com/r/Ryan/. Why can't such specific pages be blocked?

spirillen commented 2 years ago

@spirillen And how come I can't put any URL's longer than sampleUrl.com in the blocking list? For example: https://www.reddit.com/r/Ryan/. Why can't such specific pages be blocked?

???

What, where, when??? what is the topic in your question? If it is regarding the browser add-on you have to ask Dante here https://mypdns.org/infrastructure/mypdns-report

ghost commented 2 years ago

@spirillen No, in the first add request I sent, I added four similar URLs. Apparently they couldn't be added. Schermafbeelding 2022-03-27 180525

spirillen commented 2 years ago

@spirillen No, in the first add request I sent, I added four similar URLs. Apparently they couldn't be added. Schermafbeelding 2022-03-27 180525

Ahh ok, that's not me it is @thomasmerz you have to ask

thomasmerz commented 2 years ago

The screenshot should already explain WHY some "domains" couldn't be added.

ghost commented 2 years ago

@thomasmerz Well, yes host-name format can't be filtered. But I still do not get why URL's like https://www.reddit.com/r/Ryan/ can't be blocked by adding it to the blocklist. Cause access is denied for every URL in the list, so why can't these be in that list too?

thomasmerz commented 2 years ago

Pi-hole works only on DNS level, not on URL level. From the official Pi-hole documentation:

The Pi-hole® is a DNS sinkhole that protects your devices from unwanted content, without installing any client-side software.
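
To illustrate why a path such as /r/Ryan/ cannot be targeted at DNS level: the resolver only ever sees the host name, so every URL on the same host looks identical to it. The sketch below is a hypothetical helper (`url_host` is not a Pi-hole command) that reduces a URL to the part a DNS query would carry.

```shell
# Hypothetical helper, not part of Pi-hole.
# Prints only the host name of a URL, i.e. the part that appears in a DNS query.
url_host() {
  printf '%s\n' "$1" | sed -E \
    -e 's|^[a-zA-Z][a-zA-Z0-9+.-]*://||' \
    -e 's|[/?#].*$||' \
    -e 's|^[^@]*@||' \
    -e 's|:[0-9]+$||'
}
```

`url_host 'https://www.reddit.com/r/Ryan/'` prints `www.reddit.com`, and so does any other Reddit URL, which is why a DNS sinkhole can only block or allow the whole host.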

thomasmerz commented 2 years ago

I also have no clue how to prevent my kids from watching porn on Reddit without filtering all of Reddit (which has some good subreddits too) 🤷🏻‍♂️😱🥺

spirillen commented 2 years ago

I also have no clue how to prevent my kids from watching porn on Reddit without filtering all of Reddit (which has some good subreddits too) 🤷🏻‍♂️😱🥺

Squid or ublock origin have a few rules for squid here https://mypdns.org/my-external-stuff/ublock-origin-rules

ghost commented 2 years ago

@spirillen Did Dante already pass the URL's through his filter? Can you please share the results?

thomasmerz commented 2 years ago

Squid or ublock origin have a few rules for squid here https://mypdns.org/my-external-stuff/ublock-origin-rules

UBO is no solution because the kids already know how to change some settings or pause UBO.
The only thing that will work is a transparent proxy (squid) that can't be bypassed - but wait… I've seen them already using some VPN-browser-extensions 😞

I doubt that I really can hold them off without talking to them WHY they shouldn't consume this 🤷🏻‍♂️

spirillen commented 2 years ago

I doubt that I really can hold them off without talking to them WHY they shouldn't consume this

Exactly my point on the NSFW lists: teach your kids about sex and how it is good, and that the porn sites are a 4 position for the sake of recording the boring sex. Sex is good, sex is necessary, be open about it. My kids have got open and honest responses to their questions since kindergarten. I have never had negative results from it... And how was it again when you became a teenager... Search: xxx click.. click.. splash splash??

spirillen commented 2 years ago

@spirillen Did Dante already pass the URL's through his filter? Can you please share the results?

Yes, all have been run through the filters. For the results you need to either search through the API or ask Dante. I know you keep trying to make me run mypdns support here; I'm not going to.

ghost commented 2 years ago

@spirillen And is there an easier way than trying to contact Dante through Matrix?

ghost commented 2 years ago

@spirillen And is there an easier way than trying to contact Dante through Matrix? And do you have any idea when this list will be added since it has already been processed by the bot?

spirillen commented 2 years ago

@spirillen And is there an easier way than trying to contact Dante through Matrix?

nope

spirillen commented 2 years ago

And do you have any idea when this list will be added

No clue. I have nothing to do with this project apart from harvesting your commits, as you "failed" to add them to my project :wink:

ghost commented 2 years ago

@spirillen Don't you use my list for your project as well? Because I would allow you to do so.

spirillen commented 2 years ago

@spirillen Don't you use my list for your project as well? Because I would allow you to do so.

Do you hold a list on your own as well?

ghost commented 2 years ago

@spirillen Umm, I make the lists myself, but don't make them on my Github. Just upload them as .txt. So dunno if that's what you meant. Oh and I can't get on the mypdns website for contact (so can't contact you or Dante). @spirillen But I was able to invite you on Nheko to my room.

ghost commented 2 years ago

@spirillen Please write me back as soon as possible via Nheko. Roomname where you were invited to : Github and mypdns: blocklist.

ghost commented 2 years ago

@blocklistproject Dear people of the blocklistproject, please add this list to your blocklist asap! I want to add a new list soon, but can only do that after updating my Pi-hole blocklist with these new URLs (blocklistproject p*rn list). The list I gave you was sorted and should be ready to add. Please close the issue afterwards.

ghost commented 2 years ago

@spirillen Nheko is down cause my VM crashed. Please contact me here when your internet is back.

gap579137 commented 1 year ago

These have been added