lucyparsons / OpenOversight

Police oversight and accountability through public data 👮
https://openoversight.com
GNU General Public License v3.0
240 stars, 79 forks

Double data entry for the sorting task #187

Open redshiftzero opened 7 years ago

redshiftzero commented 7 years ago

I got the following suggestion after giving a short talk on OpenOversight: one way to automatically flag malicious users, as well as catch errors, is to use double data entry. In double data entry, as the name suggests, each image is reviewed by two different people. If a particular user has a very high error rate, meaning their entries frequently disagree with the other reviewer's, they can be automatically flagged and removed from the system.

This issue is to enable this for the sorting task, and then potentially at a later date we can do this for the tagging task. We need:

r4v5 commented 7 years ago

An example threshold would be users who have flagged more than 10 images, where more than 50% of their entries differ from those of the other person who tagged the same image.
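To make that threshold concrete, here is a rough sketch in Python. It is not OpenOversight code: the `(image_id, user_id, label)` tuples, the function name `flag_users`, and the parameter names are all assumptions made up for illustration, and it only counts images that have exactly two entries from two different users.

```python
from collections import defaultdict


def flag_users(entries, min_images=10, max_disagreement=0.5):
    """Return the set of users who exceed the example threshold.

    entries: iterable of (image_id, user_id, label) tuples (hypothetical
    shape, not the real OpenOversight schema).
    """
    # Group the entries by image so each pair of reviewers can be compared.
    by_image = defaultdict(list)
    for image_id, user_id, label in entries:
        by_image[image_id].append((user_id, label))

    reviewed = defaultdict(int)  # double-entered images per user
    differed = defaultdict(int)  # of those, how many disagreed

    for pairs in by_image.values():
        if len(pairs) != 2:
            continue  # only images with exactly two entries count
        (user_a, label_a), (user_b, label_b) = pairs
        if user_a == user_b:
            continue  # the same user twice is not double entry
        for user in (user_a, user_b):
            reviewed[user] += 1
            if label_a != label_b:
                differed[user] += 1

    # Flag users with more than min_images entries and a disagreement
    # rate above max_disagreement.
    return {
        user
        for user, count in reviewed.items()
        if count > min_images and differed[user] / count > max_disagreement
    }
```

Note that a disagreement is charged to both reviewers of an image, since with only two entries there is no way to tell which one was wrong.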

The problem with 2-entry consensus is that for every image that a malicious Mallory and a valid Victoria tag or sort, both Mallory and Victoria get this ding, so it's possible to be incorrectly flagged as malicious by virtue of being paired with a bunch of malicious actors. At the small scale we're operating at, against the kind of adversary we're expecting, I'd rather just have the same image tagged/sorted 3x by different users and surface tags that don't have consensus to admins for review.
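A minimal sketch of the 3x approach, again in Python and again with made-up names (`images_needing_review`, the `(image_id, user_id, label)` tuples), not anything from the codebase. It assumes "consensus" means all entries for an image agree; a majority rule would be an equally reasonable reading.

```python
from collections import defaultdict


def images_needing_review(entries, required=3):
    """Return sorted image ids whose entries lack consensus.

    entries: iterable of (image_id, user_id, label) tuples (hypothetical
    shape). Consensus is assumed to mean that every entry agrees.
    """
    # Keep one label per user per image, so a single user cannot
    # supply several of the required entries.
    by_image = defaultdict(dict)
    for image_id, user_id, label in entries:
        by_image[image_id][user_id] = label

    needs_review = []
    for image_id, labels_by_user in by_image.items():
        if len(labels_by_user) < required:
            continue  # not enough independent entries yet
        if len(set(labels_by_user.values())) > 1:
            needs_review.append(image_id)  # reviewers disagree
    return sorted(needs_review)
```

Nobody is flagged automatically here. Disputed images just end up in a list for an admin to look at, which avoids dinging Victoria for being paired with Mallory.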