hortongn opened 1 year ago
## Goal

## Corpus
- Is the corpus large enough? Is the training set large enough?
- What are the start and end dates for the data in the corpus? Does this matter?
- Who chose the corpus, when was it chosen, and for what purpose? Details of the corpus, like the data behind a research article, should be publicly stated and accessible.
- What is the corpus bias?
- Is the tool likely to raise diversity, equality and/or inclusion issues?
- Is personal data captured and reused?
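To make the size and date-range questions above answerable, the corpus stats should be computed and documented up front. A minimal sketch, assuming a hypothetical corpus of `(document_id, publication_date, text)` records (the field names and values are made up for illustration, not any real dataset):

```python
from datetime import date

# Hypothetical corpus records: (document_id, publication_date, text).
# These values are stand-ins for illustration only.
corpus = [
    ("doc-001", date(2015, 3, 1), "…"),
    ("doc-002", date(2019, 7, 12), "…"),
    ("doc-003", date(2021, 11, 30), "…"),
]

def summarize(records):
    """Report corpus size and date range so the checklist answers can be documented."""
    dates = [pub_date for _, pub_date, _ in records]
    return {
        "n_documents": len(records),
        "start_date": min(dates).isoformat(),
        "end_date": max(dates).isoformat(),
    }

print(summarize(corpus))
```

Publishing a summary like this alongside the tool would cover the "publicly stated and accessible" point for the basic facts, even when the full corpus cannot be shared.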
## Algorithm

## Evaluation and Metrics
- Have I measured the current process before introducing any change, for example the time taken and the number of errors?
- Who should evaluate the tool: end users, subject-matter experts, or both? Internal or external evaluators?
- What metrics will be used to evaluate the tool? The F1 score, if used, must be interpreted in context.
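As a reminder of how F1 is computed, here is a short sketch with made-up counts (not project data): F1 is the harmonic mean of precision and recall.

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall, from raw counts."""
    precision = tp / (tp + fp)   # of everything flagged, how much was right
    recall = tp / (tp + fn)      # of everything that should be flagged, how much was found
    return 2 * precision * recall / (precision + recall)

# Illustrative counts only: 80 true positives, 20 false positives, 40 false negatives.
# precision = 0.8, recall ≈ 0.667, so F1 ≈ 0.727
print(round(f1_score(80, 20, 40), 3))
```

The same F1 value can mask very different precision/recall trade-offs, which is one reason the score has to be read in context rather than quoted on its own.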
## Sanity check

- Common sense: have the developers built in 'common-sense' limitations to prevent the algorithm from being applied too widely? Am I asking a meaningful question? Is this a feasible exercise?
- Does the tool provide feedback when a question is out of scope?
- Does the tool provide feedback when a question is out of scope?
- Based on the checks above, is the tool fit for purpose?
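One common way to implement the out-of-scope feedback asked about above is a confidence threshold: if no candidate answer scores highly enough, the tool says so instead of answering. A minimal sketch (the threshold, labels, and scores are assumptions, not any specific library's API):

```python
# Hypothetical guardrail: decline to answer when the best confidence score
# from some upstream model falls below a threshold. Values are illustrative.
OUT_OF_SCOPE_THRESHOLD = 0.5

def answer(question, scored_labels):
    """scored_labels: list of (label, confidence) pairs from an assumed model."""
    label, confidence = max(scored_labels, key=lambda pair: pair[1])
    if confidence < OUT_OF_SCOPE_THRESHOLD:
        return "Sorry, this question appears to be outside the tool's scope."
    return f"{label} (confidence {confidence:.2f})"

# Low-confidence case: every score is below the threshold, so the tool declines.
print(answer("What is the boiling point of feldspar?",
             [("geology", 0.31), ("cooking", 0.12)]))
```

A tool that silently returns its best guess regardless of confidence would fail this checklist item.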
## Dissemination

- Is there easy-to-read documentation and guidance for new users that explains, in simple terms, how to use the tool and how it improves on current processes?
## Feedback

- Does the tool provide a feedback loop so it can be improved over time?
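A feedback loop can be as simple as capturing user corrections in a structured log that later feeds review or retraining. A minimal sketch with an assumed schema (not any particular tool's API):

```python
import json
from datetime import datetime, timezone

# Assumed feedback record schema: timestamp, the query, what the tool said,
# and what the user says it should have said.
feedback_log = []

def record_feedback(query, tool_output, user_correction):
    """Append one user correction so it can be reviewed or used for retraining."""
    feedback_log.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "query": query,
        "tool_output": tool_output,
        "user_correction": user_correction,
    })

record_feedback("classify: 'maple leaf'", "animal", "plant")
print(json.dumps(feedback_log[-1], indent=2))
```

Whether the tool exposes anything like this, and who reviews the log, is exactly what the checklist question is probing.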
Now that we have some potential tools and solutions, try them out to get a feel for how they work and whether they might suit our project.
Tools we should try (add more):
Post your findings in the comments.