blip-solutions / promptwatch-client

Python client for PromptWatch.io - LLM tracking platform
https://www.promptwatch.io

Unit tests #16

Closed dr-taisya-it closed 1 year ago

dr-taisya-it commented 1 year ago

Hi. I want to use a .json file for tests, as shown in your documentation:

with UnitTest("the_name_of_the_test").for_test_cases_in_file(r"..\test_1.json") as test:
    for test_case in test.test_cases():
        test_case.evaluate(llmChain.run)

In what format should the test cases be presented in the .json file?

ju-bezdek commented 1 year ago

Hi,

here is a code example showing how to generate the file, as well as an example of the JSON...

The JSON file has to contain one JSON object per line (not an array)!

Code example

from promptwatch import PromptWatch, register_prompt_template
from promptwatch.unit_tests import UnitTest
from promptwatch.unit_tests.schema import LabeledOutput, TestCase
from langchain import PromptTemplate, LLMChain
from langchain.chat_models import ChatOpenAI

# Build a test case: the chain inputs plus labeled expected outputs
tc = TestCase(
    for_template_name="my_template_name",
    for_tracking_project="test_project",
    inputs={"sentence": "I like to eat"},
    outputs=[
        LabeledOutput(label=0, value="I like to eat"),
        LabeledOutput(label=1, value="I like to eat pizza"),
        LabeledOutput(label=1, value="I like to eat pizza and drink beer"),
    ]
)

# Write the test case as a single JSON object on its own line
with open("test.json", "w") as f:
    f.write(tc.json())

demo_template = register_prompt_template(
    template_name="my_template_name",  # same as tc.for_template_name
    prompt_template=PromptTemplate.from_template("Finish this sentence: {sentence}")
)
llmChain = LLMChain(llm=ChatOpenAI(), prompt=demo_template)

# Run the chain against every test case loaded from the file
with UnitTest("the_name_of_the_test").for_test_cases_in_file("test.json") as test:
    for test_case in test.test_cases():
        test_case.evaluate(llmChain.run)

JSON Example

{
    "id": null,
    "for_template_name": "my_template_name", // name of the template under test (connects the test case with the template)
    "for_tracking_project": "test_project", // name of your source code project
    "inputs": {
        "sentence": "I like to eat" // inputs for the chain
    },
    "outputs": [
        {
            "label": 0,
            "value": "I like to eat" // expected output (can be also dict, depending on the chain implementation / type)
        },
        {
            "label": 1,
            "value": "I like to eat pizza"
        },
        {
            "label": 1,
            "value": "I like to eat pizza and drink beer"
        }
    ]
}
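Since the file format is one JSON object per line (JSON Lines) rather than a JSON array, writing multiple test cases looks like this. A minimal sketch using only the standard library, with plain dicts standing in for real `TestCase` objects (with promptwatch you would serialize each one via `tc.json()` instead of `json.dumps`):

```python
import json

# Hypothetical test cases as plain dicts, mirroring the schema above
cases = [
    {"for_template_name": "my_template_name",
     "for_tracking_project": "test_project",
     "inputs": {"sentence": "I like to eat"},
     "outputs": [{"label": 1, "value": "I like to eat pizza"}]},
    {"for_template_name": "my_template_name",
     "for_tracking_project": "test_project",
     "inputs": {"sentence": "The weather today is"},
     "outputs": [{"label": 1, "value": "The weather today is sunny"}]},
]

# One JSON object per line -- NOT a single JSON array
with open("test.json", "w") as f:
    for case in cases:
        f.write(json.dumps(case) + "\n")

# Reading it back: parse each line independently
with open("test.json") as f:
    loaded = [json.loads(line) for line in f if line.strip()]
```

The file itself cannot contain the `//` comments shown in the example above; those are only annotations for this thread.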

Important note! There was a small bug in the file-loading code, which I noticed when I ran the code above before sending it to you. Please upgrade to v0.2.8 before trying to load tests from a file.

dr-taisya-it commented 1 year ago

I used your version of the code, but I'm getting an error. Could you help me, please?

Traceback (most recent call last):
  File "C:\PROJECTS\bots\ai-bot\promptwatch_test.py", line 30, in <module>
    for test_case in test.test_cases():
  File "C:\Users\drtai\AppData\Local\Programs\Python\Python310\lib\site-packages\promptwatch\unit_tests\unit_tests.py", line 243, in test_cases
    for example in tqdm(self._test_cases_generator, desc=f"Running test {self.test_name}"):
  File "C:\Users\drtai\AppData\Local\Programs\Python\Python310\lib\site-packages\tqdm\std.py", line 1178, in __iter__
    for obj in iterable:
  File "C:\Users\drtai\AppData\Local\Programs\Python\Python310\lib\site-packages\promptwatch\unit_tests\unit_tests.py", line 52, in iterate
    test_case = TestCase(data)
  File "pydantic\main.py", line 332, in pydantic.main.BaseModel.__init__
TypeError: __init__() takes exactly 1 positional argument (2 given)

ju-bezdek commented 1 year ago

Yes, that is the bug I've mentioned in the note:

Please upgrade to v0.2.8 before trying to load Tests from file

dr-taisya-it commented 1 year ago

Thank you! It's working.