coronasafe / ayushma

Empowering Nurses with Multilingual ICU Protocols. Leveraging the rapid advancements in AI technology, created multilingual interfaces that assist nurses in rapidly upgrading their knowledge about ICU protocols.
https://ayushma-api.ohc.network
MIT License

Implement a testing framework #114

Closed skks1212 closed 10 months ago

skks1212 commented 11 months ago

We need to automate our testing, which is currently done manually through Google Sheets.

An admin can create a TestSuite and TestQuestions under it. Each TestSuite can be configured with its own temperature and topk.

A TestQuestion will contain a question that will be asked to Ayushma, and a human-entered reference answer for that question. We need to test whether Ayushma's response is similar to the reference answer, and how similar.

Once the admin triggers a test run, a new TestRun instance will be created, linked to a Project and the TestSuite. The suite will run asynchronously through Celery. The run will ask each associated TestQuestion and create a TestResult for it. The TestResult will hold the question and human_answer copied from the TestQuestion (do not link the models, because the questions can change), the answer that was returned by Ayushma, the timetaken, cosine_sim and bleu_score.
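The run loop described above could be sketched roughly as follows. This is a framework-free illustration, not the project's code: `run_suite`, the `ask` callable and the dict-based results are hypothetical stand-ins for the Celery task, the call to Ayushma and the TestResult model.

```python
import time
from typing import Callable


def run_suite(questions: list[dict], ask: Callable[[str], str]) -> list[dict]:
    """Ask every question and return one result dict per question.

    `questions` holds dicts with "question" and "human_answer" keys.
    `ask` is a hypothetical callable that sends a question to Ayushma
    and returns its answer as text.
    """
    results = []
    for q in questions:
        started = time.perf_counter()
        answer = ask(q["question"])
        elapsed = time.perf_counter() - started
        results.append({
            # Copy the text instead of linking to the TestQuestion,
            # because the question may be edited after the run.
            "question": q["question"],
            "human_answer": q["human_answer"],
            "answer": answer,
            "timetaken": elapsed,
        })
    return results
```

In the real implementation this body would presumably live inside a Celery task, and each dict would become a TestResult row with the similarity scores filled in.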

Once all questions have been answered, we need to calculate the cosine similarity and BLEU score for the result on a scale of 0 to 1. After this, the test will be over and the admin can see the results.
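For reference, both metrics can be computed on a 0 to 1 scale along these lines. This is a minimal sketch under stated assumptions: it uses a naive whitespace tokenizer, compares bag-of-words count vectors for the cosine similarity (the real implementation may well compare embeddings instead), and implements a plain unsmoothed BLEU against a single reference. The function names are illustrative only.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Naive lowercase whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()


def cosine_sim(a: str, b: str) -> float:
    """Cosine similarity of bag-of-words count vectors, in [0, 1]."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    # Clamp to guard against floating-point overshoot just above 1.
    return min(1.0, dot / (norm_a * norm_b))


def ngrams(tokens: list[str], n: int) -> Counter:
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu_score(reference: str, candidate: str, max_n: int = 4) -> float:
    """Unsmoothed single-reference BLEU with brevity penalty, in [0, 1]."""
    ref, cand = tokenize(reference), tokenize(candidate)
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams, ref_ngrams = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_ngrams.values())
        # Clipped overlap: each candidate n-gram counts at most as often
        # as it appears in the reference.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        if total == 0 or overlap == 0:
            return 0.0  # without smoothing, any zero precision gives 0
        log_precisions.append(math.log(overlap / total))
    # Penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)
```

Note that unsmoothed BLEU drops to 0 whenever the answer shares no 4-gram with the reference, which is common for short answers, so a smoothed variant or a smaller `max_n` may be worth considering.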

Now, update the user model to have an is_reviewer field. If a user has is_reviewer set to true, they can access the TestRuns and their TestResults and add feedback to the results. They will be able to create a new Feedback for a TestResult by entering a rating (Excellent, Good, Satisfactory, Unsatisfactory, Wrong or Hallucinating) and a note.

A reviewer can only see their own feedback, and cannot edit it later. Only admins can see the feedback of all reviewers.
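The visibility rule above amounts to a simple filter. The sketch below uses plain dataclasses instead of the actual models, and the field names other than is_reviewer (`username`, `is_admin`, `reviewer`, `test_result_id`) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    username: str
    is_admin: bool = False      # hypothetical flag for admins
    is_reviewer: bool = False


@dataclass(frozen=True)  # frozen: feedback cannot be edited once created
class Feedback:
    reviewer: str               # username of the author
    test_result_id: int
    rating: str
    note: str


def visible_feedback(user: User, all_feedback: list[Feedback]) -> list[Feedback]:
    """Return the feedback entries this user is allowed to see."""
    if user.is_admin:
        return list(all_feedback)
    if user.is_reviewer:
        return [f for f in all_feedback if f.reviewer == user.username]
    return []
```

In a Django REST Framework view, the same rule would typically be applied when building the queryset rather than by filtering a list in memory.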

In the end, your new models should look like this (all models will extend the base model class):

TestSuite

TestQuestion

TestRun

TestResult

Feedback

cc. @bodhish

skks1212 commented 11 months ago

Update: For now, we don't want the review UI. Just make the test output downloadable as CSV.
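A CSV export of a run could look something like this. It is a sketch using only the standard library; the column list mirrors the TestResult fields named in the issue, while `results_to_csv` and the dict-based input are hypothetical.

```python
import csv
import io

# Columns taken from the TestResult fields described in the issue.
FIELDS = ["question", "human_answer", "answer", "timetaken", "cosine_sim", "bleu_score"]


def results_to_csv(results: list[dict]) -> str:
    """Serialize result dicts to CSV text with a header row."""
    buf = io.StringIO()
    # extrasaction="ignore" drops keys that are not listed in FIELDS;
    # missing keys are written as empty cells.
    writer = csv.DictWriter(buf, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(results)
    return buf.getvalue()
```

The returned string could then be sent from an API endpoint with a `text/csv` content type so the admin's browser downloads it as a file.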