speedydeletion / wikiproc

processing tools for the wiki

Graph timeline of deletion #4

Closed h4ck3rm1k3 closed 6 years ago

h4ck3rm1k3 commented 6 years ago

Based on the edit log, we want to track the lifetime of articles. For each article that gets deleted, we want to know how long it lived: when it was created, how many times it was recreated, and whether it was blocked.
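As a rough sketch of what tracking lifetimes could look like, the snippet below pairs each creation with the next deletion of the same title. The `(title, action, timestamp)` event tuples and the `pair_lifetimes` name are hypothetical; the real edit log would need to be converted into something like this first.

```python
from collections import defaultdict
from datetime import datetime

def pair_lifetimes(events):
    """Pair each 'create' with the next 'delete' for the same title.

    `events` is an iterable of (title, action, timestamp) tuples. This
    format is an assumption for illustration, not the wiki's log schema.
    """
    open_since = {}            # title -> creation time of the live copy
    spans = defaultdict(list)  # title -> [(created, deleted), ...]
    for title, action, ts in sorted(events, key=lambda e: e[2]):
        if action == "create":
            # Ignore a duplicate create while the article is still live.
            open_since.setdefault(title, ts)
        elif action == "delete" and title in open_since:
            spans[title].append((open_since.pop(title), ts))
    return dict(spans)

events = [
    ("Example", "create", datetime(2012, 1, 1)),
    ("Example", "delete", datetime(2012, 1, 8)),
    ("Example", "create", datetime(2012, 2, 1)),
    ("Example", "delete", datetime(2012, 2, 2)),
]
spans = pair_lifetimes(events)
recreations = len(spans["Example"]) - 1  # lives after the first one
```

Deletions with no known creation are skipped here, so a title whose history is incomplete simply yields fewer spans.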

leucosticte commented 6 years ago

When you say "blocked" do you mean protected from recreation?

With some articles, it might not be possible to get this data. For example, if there's an article that was deleted thrice and is still deleted, we won't know when the re-creations happened, because that data is in the archive table, which isn't open to the public.

What form do you want this data in, by the way? XML? Database table? Something else?

h4ck3rm1k3 commented 6 years ago

How about (Article, Creation Time, Deletion Time) in a CSV file? Basically, we want to know whether the articles we are interested in are short-lived or long-lived. The database dumps can be used to extract long-lived articles, and I would assume that the better articles live longer.
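A minimal sketch of reading such a file and computing lifetimes might look like this. The column names, the ISO 8601 timestamp format, and the `lifetime_days` helper are assumptions for illustration; nothing here is fixed yet.

```python
import csv
import io
from datetime import datetime

def lifetime_days(csv_text):
    """Return [(article, lifetime in days), ...] from CSV text.

    Assumes a header row of article,creation_time,deletion_time and
    ISO 8601 timestamps without a timezone suffix.
    """
    result = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        created = datetime.fromisoformat(row["creation_time"])
        deleted = datetime.fromisoformat(row["deletion_time"])
        days = (deleted - created).total_seconds() / 86400
        result.append((row["article"], days))
    return result

sample = (
    "article,creation_time,deletion_time\n"
    "Example,2012-01-01T00:00:00,2012-01-08T12:00:00\n"
    "Example,2012-02-01T00:00:00,2012-02-02T00:00:00\n"
)
rows = lifetime_days(sample)
```

One row per creation/deletion pair means a recreated article just appears more than once, so the recreation count falls out of the same file.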

leucosticte commented 6 years ago

A common scenario would be that a decently-written article on a non-notable topic would stay around for maybe a week while it's going through the proposed deletion or AfD processes. An example would be the Michael Anissimov article. https://en.wikipedia.org/w/index.php?title=Special:Log/delete&page=Michael_Anissimov

These days, a lot of times when a sysop doesn't feel like nominating an article for AfD, he'll just misuse one of the speedy deletion criteria to get rid of it. That way, he doesn't have to write up a deletion justification; he can just select a boilerplate explanation from a drop-down menu and be done. In a small minority of cases, a user will object to this and get the article undeleted, and then it'll go to AfD. But because that happens so seldom, and because there's not really any punishment for misusing the speedy deletion criteria, there's a lot of incentive to just do a unilateral deletion.

One user, JzG, finally irritated people enough that he had to go away for a while, but then he came back and now, from what I hear, is doing the same stuff as before. He's a crusader against "paid advocacy" or whatever, so he speedily deleted, for example, this article as unambiguous advertising: https://en.wikipedia.org/w/index.php?title=Special:Log&page=Guarantee+Security+Life+Insurance+Company

If an article lasts a week, though, it could still be missed by the dump, depending on the timing of when the dump was done.
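To make the timing issue concrete, here is a small sketch that checks whether any dump was taken while an article existed. The `captured_by_dump` name and the monthly dump dates are made up for illustration; the actual dump schedule would have to be looked up.

```python
from datetime import datetime

def captured_by_dump(created, deleted, dump_times):
    """True if at least one dump was taken while the article existed."""
    return any(created <= d < deleted for d in dump_times)

# Hypothetical dump schedule: first of each month.
dumps = [datetime(2012, 1, 1), datetime(2012, 2, 1)]

# Lives for a week between two dumps, so no dump ever sees it.
missed = captured_by_dump(datetime(2012, 1, 10), datetime(2012, 1, 17), dumps)

# Lives for a week that happens to straddle a dump date.
caught = captured_by_dump(datetime(2011, 12, 28), datetime(2012, 1, 4), dumps)
```

Both articles live exactly seven days; only the one that straddles a dump date shows up in it.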

h4ck3rm1k3 commented 6 years ago

OK, so let's put this on hold and look at the real-time monitoring. I will close this for now.