pwyf / 2014-technical-consultation

Consultation for the 2014 Aid Transparency Index test
MIT License

Sampling documents #39

Open markbrough opened 10 years ago

markbrough commented 10 years ago

Issues

In the 2013 Index, it was possible to tag activities with documents which didn't contain the information being sought in two ways:

  1. providing a link that did not point to the specific document being sought (for example, a link to a general website that may or may not contain a document for the specific activity);
  2. providing a link to a document specific to the activity, but where the document did not contain the information for the indicator (e.g. a general project document, tagged under contract, but not containing any contract information).

Questions

  1. How can we accurately measure whether donors are correctly tagging documents, and providing the information requested for specific activities?
  2. Should we sample documents to test whether they contain the stated information?

By sampling we mean: randomly selecting a sample of documents for manual checks.

2014 Index

We are proposing to sample documents to manually check whether they contain the information requested. We are considering how this would be incorporated into the scoring methodology.

ErinCoppin commented 10 years ago

Yes, I agree that document tags should be sampled to check whether donors are correctly tagging documents and whether the documents contain the looked-for information. PWYF is monitoring IATI data quality as part of its raison d'être, and this must be considered a priority area for testing.

I would suggest that every organisation-level document is tested for all donors, and that for the largest recipient, 3 activities have their activity-level document tags checked. Errors should be communicated to the donor in question.

In terms of scoring on the Index, some of the indicators are almost wholly about ensuring that a document has been correctly tagged in IATI and contains the correct information. If all 3 sampled activities pass, the full score is awarded; if 2 pass, a 2/3 score; if 1 passes, a 1/3 score; if none pass, no score for the indicator. Documents that are not part of the indicator (such as associated project documents included for completeness) should not affect the indicator score, but the donor should be notified.
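The pass-fraction scoring described above could be sketched as follows. This is a minimal illustration, not PWYF's actual scoring code; the function name and interface are assumptions.

```python
def indicator_score(results, max_score=1.0):
    """Score an indicator from sampled document checks.

    `results` is a list of booleans: True if a sampled document was
    correctly tagged and contained the looked-for information.
    Out-of-scope documents (e.g. associated project documents included
    for completeness) should be excluded before calling this; per the
    proposal they trigger a notification, not a penalty.
    """
    if not results:
        return 0.0
    return max_score * sum(results) / len(results)


# Three sampled activities: two correctly tagged, one not -> 2/3 score.
print(indicator_score([True, True, False]))
```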

bill-anderson commented 10 years ago

Activity-level document links should relate to documents that are about that specific activity, or contain a clearly identified section dealing with the specific activity. Using activity-level links for country- or agency-level documents is generally not acceptable.

johnadamsDFID commented 10 years ago

We support point sampling of documents. Please make sure the sample size is large enough to avoid anomalous results.

akshaysinha commented 10 years ago

Hi, one general issue with the tester regarding documents is the quality testing of a document's content and format. I have outlined the points below:

  1. Although the format and hyperlink are required fields, any content can be pushed through to get a validated XML and earn points. For example, "http://organizationhomepage.html" format="html" can be copied over and over within or across XML files and still manage to get 100% of the points. It would be worthwhile to restrict reuse of the same link within an XML file or across files; a distinct-count check would be a simple test.
  2. The more difficult test, which I am still thinking about, is how to assess the pure quality of a document, and whether it is acceptable to use generic homepages or project pages (html format) for some document types. At times this is genuinely good: a project page might contain the objectives/purpose of the activity in its body, or highlight the project budget, beneficiaries, and project results and evaluation information. At other times, however, the page might not contain much information and might simply be a landing page for downloading documents about the project, in which case it is not the link that should be provided in the XML. This probably requires some manual testing and/or further thinking about how to improve the automated tester's assessment of document-link quality.
  3. Also, the document-link format attribute is somewhat redundant in my opinion (though others might disagree), since it can be derived from the hyperlink address: for example, document.html is html, document.doc is msword, etc.
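The distinct-count check from point 1 could be sketched like this, assuming IATI activity XML with `document-link` elements carrying a `url` attribute. The sample snippet and function name are illustrative only:

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_document_links(xml_text):
    """Count how often each document-link URL appears in an IATI
    activities file, and return only the URLs that repeat. A URL
    reused across many activities suggests a boilerplate link
    being copied to pass automated checks."""
    root = ET.fromstring(xml_text)
    urls = [el.get("url") for el in root.iter("document-link") if el.get("url")]
    return {url: n for url, n in Counter(urls).items() if n > 1}


# Hypothetical minimal snippet: the same homepage link tagged on two activities.
sample = """<iati-activities>
  <iati-activity>
    <document-link url="http://example.org/home.html" format="text/html"/>
  </iati-activity>
  <iati-activity>
    <document-link url="http://example.org/home.html" format="text/html"/>
  </iati-activity>
</iati-activities>"""

print(duplicate_document_links(sample))  # {'http://example.org/home.html': 2}
```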

Thank you, Akshay

markbrough commented 10 years ago

Thanks for all the comments.

For 2014

A total of 14 indicators refer to documents. These documents are manually checked to verify that they contain the required information to score for the indicator. For IATI publishers, the documents may be located via links in their XML files.

10 documents will be randomly sampled from each organisation's IATI files, with a minimum of five documents needing to meet the criteria for the indicator to score.

For organisation level documents where only a single document is expected, the document will be checked to see if it contains the required information to score on the indicator.

We will also be sampling data on results, sub-national location and conditions.