mapbox / gabbar

Guarding OpenStreetMap from harmful edits using machine learning

Understanding validation and vandalism detection work on Wikipedia #13

Closed: bkowshik closed this issue 7 years ago

bkowshik commented 7 years ago

NOTE: This is a work in progress. Posting here to start discussion around the topic.


Wikimedia uses Artificial Intelligence for the following broad categories:

On Wikipedia there are 160k edits, 50k new articles, and 1,400 new editors every day. The goal is to split these 160k edits into:

  1. Probably OK, almost certainly not vandalism
  2. Needs manual review, might possibly be vandalism
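
To make the split concrete, here is a minimal triage sketch in Python. It assumes each edit already carries a damaging probability from some classifier; the field name `damaging_probability` and the 0.5 threshold are illustrative, not Wikimedia's actual pipeline:

```python
# Minimal triage sketch: split edits into the two buckets above using a
# hypothetical per-edit damaging probability. Threshold and field names are
# illustrative only.
def triage(edits, threshold=0.5):
    probably_ok, needs_review = [], []
    for edit in edits:
        if edit["damaging_probability"] < threshold:
            probably_ok.append(edit)   # probably OK, almost certainly not vandalism
        else:
            needs_review.append(edit)  # needs manual review, might be vandalism
    return probably_ok, needs_review

# Example with made-up scores.
edits = [
    {"rev_id": 1, "damaging_probability": 0.03},
    {"rev_id": 2, "damaging_probability": 0.91},
    {"rev_id": 3, "damaging_probability": 0.12},
]
ok, review = triage(edits)
print(len(ok), "probably OK;", len(review), "need manual review")
```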

Themes for validation

Welcoming newcomers

Attracting more newcomers is a major Wikimedia goal, and new spaces have been developed to support them. Quality control on Wikipedia is being designed with newcomer socialization in mind, so that newcomers (especially those who don't conform) are not marginalized and good-faith newcomers are retained. Although anonymous edits on Wikipedia are twice as likely to be vandalism, 90% of anonymous edits are good.

From this Slate article:

Most people first get involved with Wikipedia—one of the largest social movements in history—by making some minor corrections or starting a small article that is missing. If their contributions get deleted, especially if there is no sufficient explanation why, they are likely to quit. It is quite destructive to the community’s long-term survival, as Wikipedia has struggled for quite a while with editor retention.

Popular validation tools

There are around 20 volunteer-developed tools and 3 major Wikimedia product initiatives. Some popular ones are:

Basic web interface for ORES at https://ores.wikimedia.org/ui. Some of the features used to classify a revision as problematic or not are: whether the user is anonymous; the number of characters/words added, modified, and removed; and the number of repeated characters and bad words added. Prediction scores for a problematic revision look like the following:

https://ores.wmflabs.org/scores/enwiki/damaging/642215410

```json
{
  "642215410": {
    "prediction": true,
    "probability": {
      "false": 0.11271979528262599,
      "true": 0.887280204717374
    }
  }
}
```
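
For reference, a minimal sketch of fetching that score programmatically, assuming the `requests` library and a response shaped like the JSON above:

```python
# Sketch: fetch the damaging score shown above from ORES using requests.
# Assumes the response is shaped like the JSON above.
import requests

rev_id = "642215410"
url = "https://ores.wmflabs.org/scores/enwiki/damaging/" + rev_id

response = requests.get(url, timeout=30)
response.raise_for_status()
score = response.json()[rev_id]

print("prediction:", score["prediction"])            # True -> likely damaging
print("P(damaging):", score["probability"]["true"])  # e.g. 0.887...
```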

There has been quite a lot of research in this field, as is evident from the number of results on Google Scholar for Wikipedia vandalism detection.

Hyperlinks

Reading

Videos


cc: OpenStreetMap Community

planemad commented 7 years ago

Awesome research into vandalism detection on Wikipedia @bkowshik. The wiki community has a mature bot policy and encourages focused, effective mechanical editing, which has built a community-curated ecosystem of AI workers that has been highly effective at quickly fixing the most common problems. A large academic community is interested in the mechanics of this, and the associated research has further helped strengthen the defenses.

In comparison, the OSM Automated Edits Policy has not evolved much. Validation is a good angle for running some bots to catch simple issues like invalid capitalization in a tag, e.g. Highway=residential.
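
To illustrate the kind of rule such a bot could run, here is a small hypothetical check (not an existing gabbar function) that flags OSM tag keys which are not lowercase:

```python
# Hypothetical validation rule: flag OSM tag keys that are not lowercase,
# e.g. Highway=residential instead of highway=residential.
def find_badly_capitalized_keys(tags):
    return [key for key in tags if key != key.lower()]

# Example: one problematic key and one valid one.
tags = {"Highway": "residential", "name": "Baker Street"}
print(find_badly_capitalized_keys(tags))  # ['Highway']
```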

bkowshik commented 7 years ago

You can check a model's statistics by dropping the revision ID from the path.
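
For example, using the same endpoint as above without a revision ID (a sketch; the exact fields returned depend on the ORES release):

```python
# Sketch of the tip above: drop the revision ID from the path to get
# model-level information instead of a per-revision score.
import requests

model_url = "https://ores.wmflabs.org/scores/enwiki/damaging/"  # no revision ID
info = requests.get(model_url, timeout=30).json()
print(info)  # model version, test statistics, etc., depending on the ORES release
```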

bkowshik commented 7 years ago

Vandalism detection on OpenStreetMap is similar to vandalism detection on Wikidata, since both are structured datasets. With Wikipedia, things are different due to the more free-flowing nature of the text. I am curious to see how ORES, a machine-learning-as-a-service platform for vandalism detection and removal on Wikimedia projects, worked for Wikidata. The following is what I found.

There are 3 main models for Wikidata:

  1. Reverted - predicts whether an edit will eventually be reverted.
  2. Damaging - predicts whether or not an edit causes damage.
  3. Goodfaith - predicts whether an edit was saved in good faith.
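
A sketch of scoring one Wikidata revision against all three models, assuming the same per-model path structure as the enwiki example earlier in this thread (the revision ID here is a placeholder):

```python
# Sketch: score a single Wikidata revision against the three models above,
# assuming the same path structure as the enwiki example earlier.
import requests

rev_id = "123456789"  # placeholder Wikidata revision ID
for model in ("reverted", "damaging", "goodfaith"):
    url = "https://ores.wmflabs.org/scores/wikidatawiki/%s/%s" % (model, rev_id)
    score = requests.get(url, timeout=30).json().get(rev_id, {})
    print(model, score.get("prediction"), score.get("probability"))
```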

Datasets

It looks like there are 5,000 samples that are manually labelled and 20,000 samples that are auto-labelled.

Attributes

It looks like all 3 kinds of models - reverted, damaging, and goodfaith - make use of the same set of features. The list of attributes can be found at the link below:

A bigger list of attributes can be found at the link below:
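
As a purely illustrative sketch (not the actual attribute list from the links above), this is the idea of one shared feature-extraction step feeding all three models; every field name below is hypothetical, echoing features mentioned earlier in the thread:

```python
# Illustrative only: one shared feature extractor that all three models
# (reverted, damaging, goodfaith) could consume. Field names are hypothetical.
def extract_features(edit):
    return [
        1 if edit["user_is_anonymous"] else 0,  # anonymous edits are riskier on average
        edit["chars_added"],
        edit["chars_removed"],
        edit["bad_words_added"],
    ]

edit = {"user_is_anonymous": True, "chars_added": 120, "chars_removed": 4, "bad_words_added": 1}
print(extract_features(edit))  # [1, 120, 4, 1]
```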

Models

Model tuning reports:

Models for both Wikipedia and Wikidata are prepared together with a Makefile, which downloads the datasets, extracts the features, trains the models, and generates the reports.

Properties of the deployed model can be viewed at the link below:


This has been super-helpful. No next actions here. Closing.