Support for dspy programs

dkshjn commented 8 months ago

closes #3

sutyum commented 8 months ago

A CLI script that uses benchmark.py as an import (not as a CLI).

Inside the repo with a dspy program (such as medprompt):

pip install git+https://github.com/Technoculture/med-llm-autoeval

# benchmark_medprompt.py

import argparse
from benchmark import test
from medprompt import MedpromptModule

# class MedpromptModule(dspy.Module):
#   def __init__(self):
#     ...
#
#   def forward(self, ...):
#     ...

if __name__ == "__main__":
  # ... some argparse code ...

  results = test(
    dspy_module=MedpromptModule,
    benchmark="openllm",
  )

  print(results)

Both a prompt-testing script (such as medprompt.py) and benchmark.py can be run as CLI scripts, thanks to the code in their if __name__ == ... blocks, and can also be imported as modules by other files.
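To make the dual-use idea concrete, here is a minimal sketch of one file that works both as an importable module and as a CLI script. Everything in it is hypothetical and for illustration only: run_benchmark stands in for whatever function benchmark.py ends up exposing, EchoModule stands in for a real dspy module such as MedpromptModule, and the --benchmark flag is an assumed argument, not the actual interface of this repo.

```python
import argparse


def run_benchmark(module_cls, benchmark="openllm"):
    """Hypothetical stand-in for an importable evaluation entry point."""
    module = module_cls()  # instantiate the prompt module under test
    # A real implementation would run the benchmark; this only reports its inputs.
    return {"benchmark": benchmark, "module": type(module).__name__}


class EchoModule:
    """Hypothetical placeholder for a prompt module such as MedpromptModule."""

    def forward(self, question):
        return question


def main(argv=None):
    # argv=None makes argparse read sys.argv; passing a list eases testing.
    parser = argparse.ArgumentParser(description="Run a benchmark on a module.")
    parser.add_argument("--benchmark", default="openllm")
    args = parser.parse_args(argv)
    results = run_benchmark(EchoModule, benchmark=args.benchmark)
    print(results)
    return results


# Runs only when the file is executed directly, not when it is imported.
if __name__ == "__main__":
    main()
```

Importing this file gives access to run_benchmark and main without triggering a run, while executing it directly goes through the argparse path. Keeping the CLI logic in main(argv) rather than inline in the guard also lets other scripts call it programmatically.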

sutyum commented 8 months ago

[x] poetry to set up a project
[x] benchmark.py - the main module in this library
[ ] evaluate.py - python function to conduct an evaluation run
[ ] scripts/example.py <-- how to use this library to write a script for testing a prompt module
[x] Use ruff to format, lint and type check your code

sutyum commented 8 months ago

Share a Loom video showing the script working

dkshjn commented 8 months ago

The following video explains the process. https://www.loom.com/share/6c7e4eed82764d31b4bf4a6a859ac295?sid=160b0889-417f-4604-a758-5488df2b10e1

Support for dspy programs #4