We have two kinds of test data. The ‘corpus’ data is large but randomly chosen. The regression data is diverse, specific but out of date. Locate metadata that covers a selection of content types to give a reliable cross-section of our features.
Definition of done:
All of the work types enumerated in documentation. e.g. about 15 types.
All of the supporting input types.
For each, there is a good number (e.g. 100) of works that implement relevant features including any quirks.
Tests parse, with manually verified that they are correctly represented in regression test suite.
We have two kinds of test data. The ‘corpus’ data is large but randomly chosen. The regression data is diverse, specific but out of date. Locate metadata that covers a selection of content types to give a reliable cross-section of our features.
Definition of done: