uclibs / AI-Project

Planning for App Dev AI project

Test and play with some of the example models and services we've found #11

Open hortongn opened 1 year ago

hortongn commented 1 year ago

Now that we have some potential tools and solutions, try them out to get a feel for how they work and whether they might suit our project.

Tools we should try (add more):

Post your findings in the comments

scherztc commented 1 year ago

Goal

  1. What is a realistic goal? Expecting perfection from an AI utility is unrealistic: tools based on a training set cannot reach 100% accuracy. Nonetheless, their accuracy should be considerably greater than that of humans performing the same task.

Corpus

  1. Is the corpus large enough? Is the training set large enough?

  2. What are the start and end dates for the data in the corpus? Does this matter?

  3. Who chose the corpus, when was it chosen, and for what purpose? As with the data behind a research article, details of the corpus should be publicly stated and accessible.

  4. What is the corpus bias?

  5. Is the tool likely to raise diversity, equality and/or inclusion issues?

  6. Is personal data captured and reused?
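
Questions 1 and 2 can be answered with a quick summary of the corpus before any tool is tried. The following is a minimal sketch, assuming the corpus can be loaded as `(date, text)` pairs; the function name `summarize_corpus` and that record shape are hypothetical, not something any of the candidate tools provide.

```python
from collections import Counter
from datetime import date


def summarize_corpus(records):
    """Report size and date coverage for a corpus.

    `records` is assumed to be an iterable of (document_date, text) pairs.
    This shape is an assumption for illustration only.
    """
    dates = [document_date for document_date, _text in records]
    if not dates:
        return {"documents": 0, "start": None, "end": None, "per_year": {}}
    # Counting documents per year makes gaps in coverage easy to spot.
    per_year = dict(sorted(Counter(d.year for d in dates).items()))
    return {
        "documents": len(dates),
        "start": min(dates),
        "end": max(dates),
        "per_year": per_year,
    }


if __name__ == "__main__":
    sample = [
        (date(2019, 5, 1), "first document"),
        (date(2021, 1, 15), "second document"),
        (date(2019, 7, 3), "third document"),
    ]
    print(summarize_corpus(sample))
```

A year that is missing from `per_year` (2020 in the sample) is a prompt to ask whether the gap matters for our use case.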

Algorithm

  1. Have the developers provided a single-sentence summary of the methodology behind the algorithm?

Evaluation and Metrics

  1. Have I measured the current process (for example, time taken and number of errors) before introducing any change?

  2. Who should evaluate the tool: end users, subject-matter experts, or both? Internal or external?

  3. What metrics will be used to evaluate the tool? The F1 score, if used, must be interpreted in context.
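
To illustrate why the F1 score needs context, here is a minimal sketch that computes precision, recall, and F1 from two lists of labels. The function name `precision_recall_f1` and the sample data are made up for illustration; they are not taken from any tool under review.

```python
def precision_recall_f1(expected, predicted, positive=True):
    """Compute precision, recall, and F1 for one positive class.

    `expected` holds the correct labels and `predicted` holds the tool's
    labels, in the same order. Both names are illustrative.
    """
    pairs = list(zip(expected, predicted))
    true_pos = sum(1 for e, p in pairs if p == positive and e == positive)
    false_pos = sum(1 for e, p in pairs if p == positive and e != positive)
    false_neg = sum(1 for e, p in pairs if p != positive and e == positive)

    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Imbalanced sample: 9 positive items and 1 negative item.
    expected = [True] * 9 + [False]
    # A trivial "tool" that labels everything as positive.
    predicted = [True] * 10
    p, r, f1 = precision_recall_f1(expected, predicted)
    print(f"precision={p:.2f} recall={r:.2f} f1={f1:.3f}")
    # prints "precision=0.90 recall=1.00 f1=0.947"
```

In this made-up sample a tool that labels every item as positive still scores an F1 of about 0.95, so a high F1 alone says little unless it is compared against the class balance and the baseline measured in question 1.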

Sanity check

  1. Have the developers built in ‘common-sense’ limitations to prevent the algorithm from being applied too widely? Am I asking a meaningful question? Is this a feasible exercise?

  2. Does the tool provide feedback when a question is out of scope?

  3. Based on the checks above, is the tool fit for purpose?

Dissemination

Is there easy-to-read documentation and guidance for new users that explains in simple terms how to use the tool and how it improves on current processes?

Feedback

Does the tool provide a feedback loop so it can be improved over time?