Tatoeba / tatoeba2

Tatoeba is a platform whose purpose is to create a collaborative and open dataset of sentences and their translations.
https://tatoeba.org
GNU Affero General Public License v3.0

Make it so Admins can remove lists by spammers #3134

Open ckjpn opened 4 months ago

ckjpn commented 4 months ago

Make it so Admins can delete empty lists by spammers, since often the list name itself is spam.

Perhaps this is not a priority to do, but it might help let people know that admins monitor incoming spam and take care of it, and help keep things clean and neat.

Notes

Example Screenshot

[Screenshot 2024-07-28 at 8 36 46]

jiru commented 3 months ago

Private lists are excluded from the weekly export, while unlisted lists are included.

jiru commented 2 months ago

> If you are worried that an Admin might accidentally delete a "good" list, perhaps the lists could be "unlisted" or set to the current "private" setting. Maybe the "private" setting means that it's not even included in the exported data. I'm not sure.

I was about to say yes, better allow admins to "set private" rather than remove the list, but this creates another problem: once the list is set to private, even admins cannot see it any more, so there is no interface available for them to further interact with the list. So if an admin accidentally sets a list to "private", she cannot revert that action.

I think a reasonable solution is to allow admins to delete lists that are empty. @ckjpn, what do you think?

ckjpn commented 2 months ago

> I think a reasonable solution is to allow admins to delete lists that are empty. @ckjpn, what do you think?

I agree.

Perhaps it would even be possible to have spammers' empty lists automatically deleted. Even if a member got accidentally set to spammer temporarily, losing an empty list wouldn't be a big loss.

ckjpn commented 2 months ago

Note that many of the lists at the top of "browse by list" (https://tatoeba.org/en/sentences_lists/index) are now such lists. All the ones in the screenshot with zero (0) entries are such lists.

If it's not easy to make it so admins can remove these lists via a web interface, maybe someone with direct access to the database could either delete them or set them to "private", which would hide them from people using the website.

This would (1) make the site look better to visitors and (2) perhaps make it less likely that others would do the same thing (reference: https://en.wikipedia.org/wiki/Broken_windows_theory).

https://tatoeba.org/en/sentences_lists/index

[Screenshot 2024-09-25 at 16 17 17]

cblanken commented 1 month ago

It probably makes sense to hide any empty lists on the public lists page anyway. That would be easy to implement and a good first step I think.

> Perhaps it would even be possible to have spammers' empty lists automatically deleted. Even if a member got accidentally set to spammer temporarily, losing an empty list wouldn't be a big loss.

Is there any reason it wouldn't be safe to have a cleanup job delete any lists with zero sentences that haven't been updated in, for example, a year? Empty and idle lists are just a waste of space, right? Even if they aren't from a spammer. That way the database is kept cleaner without as much manual intervention from the admins.
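A minimal sketch of what such a cleanup job could look like, using Python's built-in sqlite3 and a toy in-memory table. The `numberOfSentences` column name matches the query jiru posts later in this thread; the `modified` last-update column, the function name, and the 365-day default are assumptions made for illustration only, not a description of the actual Tatoeba schema or codebase.

```python
import sqlite3
from datetime import datetime, timedelta


def delete_stale_empty_lists(conn, now, max_idle_days=365):
    """Delete lists that have no sentences and were last modified before the cutoff."""
    cutoff = (now - timedelta(days=max_idle_days)).strftime("%Y-%m-%d %H:%M:%S")
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "DELETE FROM sentences_lists"
            " WHERE numberOfSentences = 0 AND modified < ?",
            (cutoff,),
        )
    return cur.rowcount


# Toy in-memory table; `modified` is a hypothetical last-update column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sentences_lists (
        id INTEGER PRIMARY KEY,
        numberOfSentences INTEGER,
        modified TEXT
    );
    INSERT INTO sentences_lists VALUES
        (1, 0, '2022-05-01 00:00:00'),  -- empty and idle: deleted
        (2, 0, '2024-10-01 00:00:00'),  -- empty but recent: kept
        (3, 5, '2020-01-01 00:00:00');  -- old but not empty: kept
""")

deleted = delete_stale_empty_lists(conn, now=datetime(2024, 11, 1))
print(deleted, "list(s) deleted")
```

Passing `now` in explicitly keeps the cutoff easy to test. Note that a job like this removes every idle empty list regardless of who owns it, so it does not target spam specifically.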

jiru commented 1 month ago

> maybe someone with direct access to the database could

For starters I unlisted all visible and empty lists created by suspended users. For reference the query I used was

UPDATE sentences_lists
SET visibility = 'unlisted'
WHERE user_id IN (SELECT id FROM users WHERE role = 'spammer')
  AND numberOfSentences = 0
  AND visibility IN ('listed', 'public');

41 lists were affected.
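One way to double-check a bulk update like this is to count the matching rows first and then apply the change inside a transaction. The sketch below only illustrates that idea, using Python's built-in sqlite3 and a toy in-memory schema. The table and column names are taken from the query above; the sample rows and the row-count check are assumptions, not a description of how the update was actually run.

```python
import sqlite3

# Toy in-memory stand-in for the two tables the query touches.
# Column names come from the query above; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, role TEXT);
    CREATE TABLE sentences_lists (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        numberOfSentences INTEGER,
        visibility TEXT
    );
    INSERT INTO users VALUES (1, 'spammer'), (2, 'contributor');
    INSERT INTO sentences_lists VALUES
        (10, 1, 0, 'public'),   -- empty spam list: should be unlisted
        (11, 1, 3, 'public'),   -- spammer's list with sentences: untouched
        (12, 2, 0, 'public'),   -- empty list by a regular member: untouched
        (13, 1, 0, 'private');  -- already hidden: untouched
""")

CONDITION = """
    user_id IN (SELECT id FROM users WHERE role = 'spammer')
    AND numberOfSentences = 0
    AND visibility IN ('listed', 'public')
"""

# Dry run: count the rows the UPDATE would touch.
(expected,) = conn.execute(
    "SELECT COUNT(*) FROM sentences_lists WHERE " + CONDITION
).fetchone()

# Apply the change in a transaction; an exception rolls it back.
with conn:
    cur = conn.execute(
        "UPDATE sentences_lists SET visibility = 'unlisted' WHERE " + CONDITION
    )
    if cur.rowcount != expected:
        raise RuntimeError("affected rows differ from the dry run")

print(expected, "list(s) unlisted")
```

Sharing one `CONDITION` string between the count and the update keeps the two statements from drifting apart.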

jiru commented 1 month ago

@cblanken Can you clarify how your suggestion of removing empty lists after a year helps solve the problem of getting rid of spam lists?

cblanken commented 1 month ago

> @cblanken Can you clarify how your suggestion of removing empty lists after a year helps solve the problem of getting rid of spam lists?

It wouldn't per se; it would just reduce the maintenance burden on admins. Although, if there were only 41 lists anyway, I suppose it doesn't make much difference.