Closed 130s closed 6 years ago
As @tfoote mentioned before, we can only manually revert/delete 1-by-1? Ok I'll start working on some..
Spammers are viciously changing titles of pages, that may not be captured by looking at revision history? E.g. Now http://wiki.ros.org/ROS/Tutorials is gone. And instead this page is there.
I'm failing to revert the page name.
Ouch. I see multiple pages per second.
@tfoote: is there some admin interface for deleting all contributions from certain users on the wiki?
It might make sense if an admin could make the wiki read-only for a little bit. There are pages that have been reverted and then re-spammed.
I just changed the captcha to a random string. All non-whitelisted users have to enter that string (which is unknown to anyone). Hopefully that will stop further edits. I changed it before but a human can always get the answer right which was what happened.
If you want to help cleaning up the spam but your username is not on the list please post it here and I am happy to add it.
Some newly created pages on the recently-changed page do not seem to exist, but they still linger on the list. E.g. this page. This makes it harder to spot the pages that need to be deleted.
I just wrote a script which moves pages with the substring "quickbooks" to a different location in the FS.
@dirk-thomas that seems to be working great.
The ROS/Tutorials page is now renamed, as I mentioned in https://github.com/ros-infrastructure/roswiki/issues/139#issuecomment-170606737 (and I no longer see the pulldown menu, so I can't try renaming it back).
I have to grep through the moved pages and find the few which had content before.
http://wiki.ros.org/ROS/Tutorials seems to be the only renamed page. I restored it. As far as I can see it should be clean again now. If you still see any spam content please comment here and feel free to clean it up.
Thanks for all your help!
We still have to set a viable captcha again... @tfoote Any idea?
Just for the record: I ran this script to move spam pages to a different location:
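import os
import re

# Assumes the current directory is the wiki's pages directory.
for name in os.listdir('.'):
    # Skip pages whose name does not look like spam.
    if not re.search('quickbook', name, re.IGNORECASE):
        continue
    print(name)
    # Move the page out of the live tree instead of deleting it.
    os.rename(name, '../pages_deleted/' + name)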
And then searched through these to find pages with real content:
import os
import re

for name in os.listdir('.'):
    # Only consider pages that have at least two revisions.
    if not os.path.exists(name + '/revisions/00000002'):
        continue
    with open(name + '/revisions/00000001', 'r') as h:
        content = h.read()
    # Skip pages whose first revision was already spam.
    if re.search('quickbook', content, re.IGNORECASE):
        continue
    print(name)
Good to see it's fixed!
Also, just for the record, I made a Selenium script that takes a list of links or articles to delete (well, written inside the script itself, sorry), opens the delete page for each one, and clicks on delete (adding the comment "spam").
https://gist.github.com/awesomebytes/7dcc241eecc322bb7dcd
Maybe it can inspire some other tool in the future.
It seems like we're up against a real person since they were able to break 3 updated textchas in a row quickly.
We probably need to switch to use an external authentication source. The wiki does not have good user controls that can function at our scale.
Hi @tfoote, my username on the wiki.ros.org is the same as my github username. Thanks for responding so quickly.
@moriarty you should be able to edit now.
Looks like I'm a little late to the party, but let me know if there's anything I can do.
My user is HunterAllen
A quick Google search shows that a large number of other wikis are experiencing the same troubles.
Within the past few minutes the same type of pages are showing up on the @jenkinsci wiki https://wiki.jenkins-ci.org/pages/listpages-dirview.action?key=JENKINS . They're coming from a username which was spamming GitHub issues a few months ago and has since been shut down.
Is there any more data/information to take a more direct, grey hat approach?
@tfoote, I just tried again to update the page; I didn't realise that MoinMoin would be case-sensitive. My username is actually Moriarty - sorry.
As for your comment on the external auth source, I'm not familiar with MoinMoin but I took a look into the options available for it. In the meantime it may be worth adding "quickbook" to the BadContent page, and perhaps "1800". https://moinmo.in/HelpOnSpam#BadContent_.2F_LocalBadContent
Could we just spam filter anything containing "Quickbook pro tech support phone number"? Looking at the Jenkins wiki link above and the posts in our wiki, that seems to be the intersection between all the posts, since the phone numbers and names are inconsistent.
@allenh1 I've added you. If you can propose some additions to the BadContent regexes that would match, we could add them to the site. Though reviewing the titles, they're really good at misspelling things or adding extra punctuation such as "s.u.p.p.o.r.t". And I do see some non-quickbooks things too.
The current regex is at http://wiki.ros.org/BadContent
Could I be added as well? wiki username: GvdHoorn. Thanks.
@gavanderhoorn added
Is there anything that I can do? My username is AkifHacinecipoglu
After trying to submit all changes to my package wiki entry, I recognized the whitelist thing ;-) Could you please add me to the list? Username: ChristophRoesmann Thank you very much!
@tfoote What about a [.,_]? pattern? That would check for commas, underscores and periods. We would put that between all letters of the word. That would at least lessen the amount of spam... Also, I should note that this was a suggestion from @BPHays
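To make the suggestion concrete, here is a minimal sketch of how such a pattern could be generated. The function name `obfuscation_tolerant` is made up for illustration, and whether the resulting expression can be pasted into BadContent as-is is an assumption that would need checking against the MoinMoin docs.

```python
import re


def obfuscation_tolerant(word, separators='.,_'):
    """Build a regex that allows one optional separator between letters."""
    sep = '[' + re.escape(separators) + ']?'
    return sep.join(re.escape(ch) for ch in word)


# Matches "support", "s.u.p.p.o.r.t", "S_U,PPORT", and similar variants.
SUPPORT = re.compile(obfuscation_tolerant('support'), re.IGNORECASE)

if __name__ == '__main__':
    for title in ['Quickbook s.u.p.p.o.r.t number', 'Tech support', 'sup port']:
        print(title, bool(SUPPORT.search(title)))
```

Note that this only tolerates a single separator between letters and does not cover spaces or misspellings, so it would lessen the spam rather than stop it.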
@akifh @croesmann added
@allenh1 yes, the trick is to write the regex so that it does not have too many false positives or negatives. If you can propose some full regexes we can review them.
Thanks for help cleaning up the spam. We have turned on recaptcha for the site. It should be back and operational now.
The diff is here: https://gist.github.com/tfoote/675b98df53369e199dea
It is based on https://codereview.appspot.com/70400043 and uses https://pypi.python.org/pypi/recaptcha-client/1.0.6
If you see any issues please comment back here.
The spammers are back and are beating the recaptcha apparently?
@monusharma I can't find a user 'Monu Sharma'. What's your username on the wiki?
@tfoote it seems so, though it does not make sense. Maybe we are facing real people spamming by hand. We may try to blacklist words like "quickbooks", phone number syntax etc. for a period. Or put such edits on moderator approval if MoinMoin has such a feature.
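As a rough illustration of what "phone number syntax" could mean here, this is a sketch of a pattern for US toll-free numbers written with common separators. The list of prefixes and separators is my own assumption, not something taken from the actual spam or from the wiki's BadContent page, and it would need tuning to avoid false positives.

```python
import re

# Assumed shape: leading 1, a toll-free prefix, then 3 + 4 digits,
# with optional whitespace, dots, underscores or dashes in between.
TOLL_FREE = re.compile(
    r'\b1[\s._-]*8(?:00|33|44|55|66|77|88)[\s._-]*\d{3}[\s._-]*\d{4}\b'
)

if __name__ == '__main__':
    samples = [
        'Quickbook support 1-800-555-0199',
        'call 1.844.555.0199 now',
        'released 2018-10-17',
    ]
    for text in samples:
        print(text, bool(TOLL_FREE.search(text)))
```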
Requesting access. My wiki username is KevinHallenbeck.
Added KevinHallenbeck.
@akifh we've added some blacklisting
I would like to be able to edit the wiki as well, my user is JavierVGomez thanks!
Request to edit the wiki. User name is AlexGoins.
Request to edit the wiki. User name is mallasrikanth
@jvgomez @akgoins @srikanthmalla You should have access now.
Hi! I am trying to upload my ROS package, ardrone2islab. I am at the step of creating a wiki page on ROS.org. I cannot open a new account because I do not know what to fill in the TextCha box. Please let me know what I should do next! Thanks
Request to edit the wiki. User name is Sunhine
@Suhine Added.
@tn0432 I'm sorry, at the moment I don't have a good workaround for you for creating a new user. We're trying to find a better user management solution.
@tfoote Hello! So do you know when I am able to make my ROS wiki? Thx
Request to edit the wiki please, username jcerruti
Request to edit the wiki. User name is MathieuLabbe
@adamantivm @matlabbe added.
@tfoote Heya Tully, could I get access to create wiki pages? IanMcMahon is my username :) Thanks!
@rethink-imcmahon added
I'd like to edit the wiki. Could you please add me too? My user is BenceMagyar.
@bmagyar added
Update: 2018-10-17
This issue has maxed out the number of comments that can be added to a GitHub issue. We have created a new issue #258 where you can request to be added for access to the wiki.
If you would like access to edit the wiki please comment below with your ROS wiki username.
After you have been added you may want to unsubscribe from this ticket as there will be many updates. Use the button on the right sidebar to unsubscribe from future notifications.
UPDATE 6/7/2016: For the record, the title of this ticket was originally "ROS wiki is experiencing vast amount of spams". I'm glad that this ticket serves the purpose in many ways.
http://wiki.ros.org/RecentChanges
There are easily more than 100 spam edits, which is beyond what can be reverted manually.