ccd0 / 4chan-x

Adds various features to anonymous imageboards.
https://www.4chan-x.net/

Feature request ($500 donation offered) #3232

Open nobody98765 opened 2 years ago

nobody98765 commented 2 years ago

A system whereby an MD5 and General filter list can be downloaded from a public site on a periodicity defined by the user.

Rather than relying only on a local list, provide an option to link to a curated list at a URL, in a similar way as is done with the lists for adblockers.

There would be two lists, each with its own URL and periodicity: one for the image MD5s and another for the filter list with the regular expressions. Maybe a URL could be specified for each type of filter. MD5 would be a start.

For example, a link to a pastebin URL with raw text containing MD5s could be entered with a periodicity of one minute, so that every minute the list is checked against the current version and, if it's different, the new version is pulled down and appended to the current local list.

https://pastebin.com/raw/dnLdZZ58

Ideally there would be a way of checking whether the list has changed, for example by MD5ing the whole list. Otherwise it would be wasteful to pull down the list every time even if it hasn't changed. If not, so be it: it's just text, so it wouldn't be much in terms of bandwidth even with thousands of users.
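A minimal sketch of the change check described above, written in Python rather than in the extension's own code: hash the downloaded text, skip the update when the digest matches the previous one, and otherwise append only the unseen entries to the local list. The helper name `merge_remote_list` and the one-entry-per-line format are assumptions for illustration, not part of 4chan X.

```python
import hashlib


def merge_remote_list(local, remote_text, last_digest=None):
    """Merge a downloaded filter list into the local one.

    Returns (merged_list, digest, changed). Hypothetical helper: assumes
    one filter entry per line in the remote text.
    """
    # Digest of the whole downloaded list, used only for change detection.
    digest = hashlib.md5(remote_text.encode("utf-8")).hexdigest()
    if digest == last_digest:
        # Same content as last time: leave the local list untouched.
        return list(local), digest, False

    merged = list(local)
    seen = set(local)
    for line in remote_text.splitlines():
        entry = line.strip()
        # Skip blank lines and entries that are already present.
        if entry and entry not in seen:
            merged.append(entry)
            seen.add(entry)
    return merged, digest, True
```

Hashing the text still requires downloading it first. Where the host supports it, an HTTP conditional request (`If-None-Match` with the previous `ETag`) can avoid the download when nothing has changed.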

I know that some of this can be done in a very kludgy way with scripts pulling down or curling the file, putting it in a directory, and then reloading it. But I think a more elegant solution would be to build it into the 4chan X system.

4chanX has helped me a lot already, so I'd like to donate to the dev team: $200 when starting and $300 on completion. Thoughts?

nobody98765 commented 2 years ago

Mentioning @ccd0 for visibility - happy to know if this is not a likely thing. Appreciate you taking the time to review.

nobody98765 commented 2 years ago

mentioning @nstepien @zixaphir @aeosynth @seaweedchan

aeosynth commented 2 years ago

is the linked pastebin updated regularly, are there multiple regularly updated lists?

nobody98765 commented 2 years ago

> is the linked pastebin updated regularly, are there multiple regularly updated lists?

Multiple lists, but only one per "type" of filter. For example, one list for MD5 hashes and one separate URL for "General" filters. Each could have its own update frequency at the front end, regardless of how often it gets updated at the URL. The user can enter their own URLs to pick up the filters from; it doesn't have to be an "official" list.

For example I've set up two examples to test.

In the examples above pastebin filters would be updated an average of about once a day over a typical week.

Regardless of whether the URL has a new list or not, the front-end (4chanX) could refresh the lists periodically according to a user-defined frequency, let's say every 10 mins, 1 hour, 12 hours, or 24 hours. Each list (MD5/General) could have a separate update frequency, or there could be only one setting for the frequency for all lists to be updated.

The user would still be able to have a LOCAL set of MD5 and General filters, in addition to the ones at the URLs they specify.
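The settings described above could be modelled roughly as follows. This is only a sketch under stated assumptions: the `RemoteList` class, its field names, and the `example.com` URL in the usage note are hypothetical, and nothing here reflects how 4chan X actually stores its settings.

```python
from dataclasses import dataclass, field


@dataclass
class RemoteList:
    """Hypothetical settings entry for one remote filter list."""
    url: str
    interval_seconds: int          # user-defined refresh frequency
    last_fetched: float = 0.0      # Unix timestamp of the last refresh
    entries: list = field(default_factory=list)

    def is_due(self, now):
        # Refresh on the user's schedule, whether or not the URL changed.
        return now - self.last_fetched >= self.interval_seconds


def effective_filters(local_entries, remote):
    """Local filters first, then remote ones, with duplicates dropped."""
    # dict.fromkeys keeps the first occurrence of each entry, in order.
    return list(dict.fromkeys([*local_entries, *remote.entries]))
```

For instance, `RemoteList(url="https://example.com/md5.txt", interval_seconds=600)` would describe an MD5 list refreshed every 10 minutes, while a General list could carry a different interval. The local filters stay separate and are only combined with the remote ones when the filters are applied.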

Thanks @aeosynth - I mean it about the donation - if it's not wanted/needed then pick a charity and I'll donate to that. 4chanX has made browsing /b/ usable again already.

aeosynth commented 2 years ago

i can do this, but will only start if @ccd0 responds positively (last commit jul 8 2021, has he abandoned?), if you can find another maintained fork that would accept this, or if you would be willing to maintain your own fork.

nobody98765 commented 2 years ago

> i can do this, but will only start if @ccd0 responds positively (last commit jul 8 2021, has he abandoned?), if you can find another maintained fork that would accept this, or if you would be willing to maintain your own fork.

@ccd0 - what can I say or do to convince you? This would help a lot, as it would allow shared filter lists between users, increasing the uptake and usefulness. Happy to donate to you and @aeosynth for the work.

nobody98765 commented 2 years ago

Another scenario is that I set up a fork and it's linked to from the 4chan-x.net website pages as an alternative fork. Who is the owner/editor for those?

aeosynth commented 2 years ago

i assume @ccd0 owns that page. are you willing to create and maintain your fork if it's not linked?

nobody98765 commented 2 years ago

GitHub fork yes, but I'd really like it linked from the main page at the very least, so it's findable/usable by other people. Any idea if @ccd0 is AWOL or not interested in this? No drama either way.

aeosynth commented 2 years ago

i have no idea what @ccd0 is up to. why don't you make the fork and we can continue discussion there.

nobody98765 commented 2 years ago

I have no idea how to fork, not having used git for code. If you talk me through it I can do it, but otherwise I'd likely !!* it up and cause problems. Are there any alternatives to me forking the repo? Could you fork it and I'll take ownership somehow?

aeosynth commented 2 years ago

email or add me on discord, contact info on my profile

nobody98765 commented 2 years ago

On Saturday, February 19th, 2022 at 05:26, gir489 wrote:

> @nobody98765 I know this is a shot in the dark, but do you have any known regexes for posts that are spammed? The only one I know of so far and have countered is the gay discord chat spam: /^gay chat:[\S\s]+$/

For the BBC spam, yes: I've got a long list of regexes and MD5s. Not much for the other spam.
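The regex quoted in the comment above can be tried against sample post text. The sketch below uses Python's `re` module purely for illustration; the `is_filtered` helper and the sample strings are assumptions, and how 4chan X evaluates its filters internally is not shown here.

```python
import re

# The pattern quoted above, written as a Python regex. [\S\s] matches any
# character including newlines, so the filter covers multi-line posts.
SPAM_PATTERN = re.compile(r"^gay chat:[\S\s]+$")


def is_filtered(comment, patterns):
    """Return True if any pattern matches the comment text."""
    return any(p.search(comment) for p in patterns)
```

Because the pattern is anchored with `^`, it only matches posts that begin with the phrase. A post that merely mentions it mid-sentence is left alone.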

alidan commented 2 years ago

in all honesty I don't think that this offers much of a use case. realistically you have what you want to block; on fast moving boards what you want to block would come up and be there too often to really make this viable, and on slow moving boards, outside of spam, it would be pointless to check up. any system would be fairly easily exploitable as well.

personally I have a dedicated button on my mouse to "open in new tab and hide" so I don't endlessly revisit the same thread over and over again. I believe the default for hiding threads is shift + left click, though this may only work from the catalogue. otherwise you can do what I do and set the board to show newest first, and anything annoying will just fall further and further as time goes on.

aeosynth commented 2 years ago

honestly probably could be done if it was trained on the posts you hide

daa5933 commented 2 years ago

This is a losing battle unfortunately. Shills have started modifying image MD5's before posting. Recently I noticed that I will see the same image spammed several days in a row but when I search the MD5 in the archives it does not show up. Shills have figured out that it is more difficult to track and report their spam this way.
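To illustrate why exact-match MD5 filters are easy to evade: changing even a single byte of a file produces an unrelated digest. The byte strings below are stand-ins chosen for the example, not real image data.

```python
import hashlib


def md5_hex(data):
    """Hex MD5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


image = bytes(range(256)) * 4      # stand-in for an image file's bytes
modified = image + b"\x00"         # same content plus one trailing byte

# The two digests differ, so a filter keyed on the first no longer matches.
digests_differ = md5_hex(image) != md5_hex(modified)
```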

aeosynth commented 2 years ago

that was something i thought about implementing myself, for avatar posting :3

alidan commented 2 years ago

> This is a losing battle unfortunately. Shills have started modifying image MD5's before posting. Recently I noticed that I will see the same image spammed several days in a row but when I search the MD5 in the archives it does not show up. Shills have figured out that it is more difficult to track and report their spam this way.

a lot of the archives have perpetually broken search functionality, so take that into consideration, as it's more likely than people spending more effort.

aeosynth commented 2 years ago

> This is a losing battle unfortunately. Shills have started modifying image MD5's before posting. Recently I noticed that I will see the same image spammed several days in a row but when I search the MD5 in the archives it does not show up. Shills have figured out that it is more difficult to track and report their spam this way.

the way to win this battle is to move beyond md5 style hashing and use perceptual hashing, like reverse image searchers do
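As a rough sketch of the idea, here is an average hash, one of the simplest perceptual hashes. It is not necessarily what any reverse image search service uses. The code assumes the image has already been downscaled to a small grayscale grid; the function names are illustrative.

```python
def average_hash(pixels):
    """Average hash of a small grayscale grid (rows of 0-255 values).

    Assumes the image was already downscaled and converted to grayscale.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        # One bit per pixel: 1 if brighter than the mean, else 0.
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")
```

Unlike MD5, small edits to the pixels tend to leave most bits unchanged, so near-duplicates can be matched by treating hashes within some Hamming distance as the same image. The threshold would need tuning and is not specified here.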