Empowering Nurses with Multilingual ICU Protocols. Leveraging rapid advancements in AI technology, we created multilingual interfaces that help nurses quickly update their knowledge of ICU protocols.
We need to automate our testing, which currently happens through Google Sheets.
An admin can create a TestSuite and TestQuestions under it. Each TestSuite can be configured with its own temperature and topk values.
A TestQuestion contains a question that will be asked to Ayushma and a human-entered answer to that question (human_answer). We need to test whether Ayushma's response is similar to that answer, and how similar.
Once the admin triggers a test run, a new TestRun instance will be created, linked to a Project and the TestSuite. The suite will run asynchronously through Celery.
The run will ask each associated TestQuestion and create a TestResult for it. The TestResult holds the question and human_answer copied from the TestQuestion (do not link the models, because the questions can change), the answer that was returned by Ayushma, the timetaken, cosine_sim and bleu_score.
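The per-question loop described above could look roughly like the following. This is a minimal sketch in plain Python, not Django or Celery code: the dataclasses only stand in for the real models, and `run_suite` and the `ask` callable (a placeholder for whatever sends a question to Ayushma) are hypothetical names. In the actual project this loop would presumably live inside a Celery task.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class TestQuestion:
    question: str
    human_answer: str


@dataclass
class TestResult:
    # question and human_answer are copied as plain values, so later edits
    # to a TestQuestion do not change results that were already recorded.
    question: str
    human_answer: str
    answer: str
    timetaken: float  # seconds spent waiting for the answer
    cosine_sim: Optional[float] = None  # filled in after all answers arrive
    bleu_score: Optional[float] = None


def run_suite(
    questions: List[TestQuestion], ask: Callable[[str], str]
) -> List[TestResult]:
    """Ask every question once and record the answer and the time it took."""
    results = []
    for q in questions:
        start = time.perf_counter()
        answer = ask(q.question)  # hypothetical call into Ayushma
        elapsed = time.perf_counter() - start
        results.append(TestResult(q.question, q.human_answer, answer, elapsed))
    return results
```

Copying the strings instead of holding a reference is what keeps an old TestResult stable when a TestQuestion is edited afterwards.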
Once all questions have been answered, we need to calculate the cosine similarity and BLEU score for each result on a scale of 0 to 1. After this, the test will be over and the admin can see the results.
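To make the two scores concrete, here is a small self-contained sketch. It rests on assumptions the issue does not state: the cosine similarity is computed over simple word-count vectors (the real implementation may well compare embeddings instead), and the BLEU function is a simplified single-reference, unsmoothed variant. The names `text_cosine` and `bleu` are hypothetical; an existing library implementation, such as NLTK's `sentence_bleu`, could be used instead.

```python
import math
from collections import Counter


def text_cosine(a: str, b: str) -> float:
    """Cosine similarity of word-count vectors, clamped to [0, 1]."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return max(0.0, min(1.0, dot / (norm_a * norm_b)))


def _ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Simplified BLEU: clipped n-gram precision with a brevity penalty.

    No smoothing is applied, so the score is 0 whenever any n-gram order
    has no overlap (including candidates shorter than max_n tokens).
    """
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand or not ref:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = _ngrams(cand, n), _ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or overlap == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Penalise candidates that are shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)
```

Both functions return values between 0 and 1, matching the scale asked for above. Note that cosine similarity over embeddings can be negative, so an embedding-based version would need its own clamping or rescaling.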
Now, update the user model to have an is_reviewer field. If a user has is_reviewer set to true, they can access the TestRuns and their TestResults and add feedback to the results. They can create a new Feedback for a TestResult by entering a rating (Excellent, Good, Satisfactory, Unsatisfactory, Wrong or Hallucinating) and a note.
A reviewer can only see their own feedback and cannot edit it later.
Only admins can see the Feedback of all reviewers.
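The visibility rules above can be sketched as a small filter. This is an illustration under assumptions, not the project's permission code: `User`, `Feedback` and `visible_feedback` are simplified stand-ins, the rating list is read as six separate choices, and the integer values assigned to them are made up for the example.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Iterable, List


class Rating(IntEnum):
    # Integer values are illustrative; the issue only asks for an
    # integer choice field and does not fix the numbering.
    HALLUCINATING = 1
    WRONG = 2
    UNSATISFACTORY = 3
    SATISFACTORY = 4
    GOOD = 5
    EXCELLENT = 6


@dataclass(frozen=True)
class User:
    username: str
    is_admin: bool = False
    is_reviewer: bool = False


@dataclass(frozen=True)  # frozen: feedback cannot be edited once created
class Feedback:
    reviewer: str
    test_result_id: int
    rating: Rating
    notes: str


def visible_feedback(user: User, all_feedback: Iterable[Feedback]) -> List[Feedback]:
    """Admins see everything; reviewers see only what they wrote."""
    if user.is_admin:
        return list(all_feedback)
    if user.is_reviewer:
        return [f for f in all_feedback if f.reviewer == user.username]
    return []
```

In a Django REST setup the same rule would more likely be expressed by filtering the queryset per request and exposing only list and create actions for reviewers.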
In the end, your new models should look like this (models will extend the base model class):
TestSuite
- name
- temperature
- topk

TestQuestion
- test_suite (fk)
- question
- human_answer

TestRun
- test_suite (fk)
- project
- complete (default false)

TestResult
- test_run (fk)
- test_question (fk)
- question
- human_answer
- answer
- cosine_sim
- bleu_score

Feedback
- test_result (fk)
- rating (integer choice field)
- notes

cc. @bodhish