Open azhe825 opened 6 years ago
Can we get their data? E.g., the TREC 4 AdHoc collection, and a dataset consisting of 401,960 email messages that were manually reviewed and classified by a single individual, Roger, in his official capacity as Senior State Records Archivist.
Sent an email requesting those datasets.
The point is not saving effort but pure precision and recall: human + machine is better than human alone. Paper: Navigating Imprecision in Relevance Assessments on the Road to Total Recall: Roger and Me
Hypothesis: a human assessor can achieve on the order of 70% recall and 70% precision (a better hypothesis than ours; we should change our experiments to use it).
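If we switch our experiments to this hypothesis, one way to simulate such an assessor is to turn the target recall and precision into per-document error rates. The sketch below is only an assumption about how we might do it: the function names `assessor_error_rates` and `simulate_assessor`, and the choice to flip labels independently at random, are hypothetical and do not come from the paper.

```python
import random


def assessor_error_rates(n_docs, n_relevant, recall=0.7, precision=0.7):
    """Return (false_negative_rate, false_positive_rate) for a simulated assessor.

    Assumption: errors are spread uniformly over documents, so the expected
    true positives are recall * n_relevant and the expected false positives
    follow from precision = TP / (TP + FP).
    """
    tp = recall * n_relevant
    fp = tp * (1.0 - precision) / precision
    n_nonrelevant = n_docs - n_relevant
    if fp > n_nonrelevant:
        raise ValueError("target precision is not reachable with this collection")
    fp_rate = fp / n_nonrelevant if n_nonrelevant else 0.0
    return 1.0 - recall, fp_rate


def simulate_assessor(true_labels, recall=0.7, precision=0.7, seed=0):
    """Return noisy judgments (list of bool) for a list of 0/1 true labels."""
    rng = random.Random(seed)
    fn_rate, fp_rate = assessor_error_rates(
        len(true_labels), sum(true_labels), recall, precision
    )
    judgments = []
    for is_relevant in true_labels:
        if is_relevant:
            # A relevant document is missed with probability fn_rate.
            judgments.append(rng.random() >= fn_rate)
        else:
            # A non-relevant document is wrongly accepted with probability fp_rate.
            judgments.append(rng.random() < fp_rate)
    return judgments
```

With 100 relevant documents out of 1,000, the 70%/70% target corresponds to missing about 30% of the relevant documents and wrongly accepting roughly 30 of the 900 non-relevant ones. Real assessor errors are probably not independent of document content, so this is only a baseline noise model.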
Cormack: This assumption motivates our choice to defer to a second assessor any document before the knee that is judged non-relevant by the user, and any document after the knee that is judged relevant by the user.
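The deferral rule in that quote can be sketched as a small predicate. This is a hedged illustration, not the authors' code: `needs_second_assessor`, the 0-based `rank` in review order, and treating the document at the knee position as "after the knee" are all assumptions made here.

```python
def needs_second_assessor(rank, knee, judged_relevant):
    """Decide whether a user's judgment should be deferred to a second assessor.

    rank: 0-based position of the document in the review order (assumption).
    knee: position of the knee; ranks below it count as "before the knee".
    judged_relevant: the first assessor's (user's) judgment.
    """
    before_knee = rank < knee
    if before_knee:
        # Before the knee most documents are expected to be relevant,
        # so a "non-relevant" judgment is the surprising one.
        return not judged_relevant
    # After the knee most documents are expected to be non-relevant,
    # so a "relevant" judgment is the surprising one.
    return judged_relevant


def documents_to_defer(judgments, knee):
    """Return the ranks whose judgments would go to the second assessor."""
    return [
        rank
        for rank, judged_relevant in enumerate(judgments)
        if needs_second_assessor(rank, knee, judged_relevant)
    ]
```

For example, with judgments `[True, False, True, False, True]` and a knee at position 3, the ranks deferred are 1 (non-relevant before the knee) and 4 (relevant after the knee).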