preset-io / promptimize

Promptimize is a prompt engineering evaluation and testing toolkit.
Apache License 2.0

Minor possible fixes to requirements and response eval #4

Closed: greenmtnboy closed this 1 year ago

greenmtnboy commented 1 year ago

Noticed two issues when using this for the first time:

  1. There's a pandas import that isn't gated, but pandas isn't in requirements.txt and it didn't appear to be a transitive dependency. Added it to requirements.txt, though the import could also be gated in the report section (rough sketch after this list).

  2. A prompt was passing `self` to the evaluator block (`result = evaluator(self)`), but the evaluators appear to expect a string; `self.response` seemed like the more appropriate argument?
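
For reference, "gated" in point 1 means something like the following (a minimal sketch; `to_dataframe` and the surrounding report code are hypothetical, not promptimize's actual layout):

```python
# Minimal sketch of gating an optional pandas dependency in the report code.
# `to_dataframe` is a hypothetical helper, not an actual promptimize function.
try:
    import pandas as pd
except ImportError:
    pd = None


def to_dataframe(rows):
    if pd is None:
        raise ImportError("pandas is required for report output: pip install pandas")
    return pd.DataFrame(rows)
```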

Would add test cases, but not sure what the preferred setup is for those.

mistercrunch commented 1 year ago

Hey, I get your point on passing only the response, but you may need a handle to other things for more complex use cases, say how long the call took, the number of tokens in the response, or anything you'd like to capture in pre- or post-processing of the prompt.
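
For illustration, an evaluator that takes the whole case object could look roughly like this (a sketch only; `duration` and `response_tokens` are assumed attribute names, not necessarily what PromptCase actually exposes):

```python
# Sketch of an evaluator that receives the whole prompt case rather than just
# the response string; `duration` and `response_tokens` are assumed attributes.
def concise_greeting(case) -> float:
    has_greeting = any(w in case.response.lower() for w in ("hi", "hello"))
    fast_enough = getattr(case, "duration", 0.0) < 2.0        # seconds (assumed)
    short_enough = getattr(case, "response_tokens", 0) < 50   # token count (assumed)
    return 1.0 if (has_greeting and fast_enough and short_enough) else 0.0
```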

mistercrunch commented 1 year ago

Also, I addressed the pandas dep in another commit. Thanks for the PR though.

mistercrunch commented 1 year ago

Oh, FYI: this example has more complex post-processing, where we sandbox the Python function and then reference it -> https://github.com/preset-io/promptimize/blob/main/examples/python_examples.py

I figure it's better for the eval function to have access to the whole object.

greenmtnboy commented 1 year ago

Totally agree in principle; the tactical question is that the eval functions from the README seem to expect a string, so an alternative approach here would be to alter them to handle the full object (rough sketch of that after the snippet below). I might have been misusing these, though!

# Evaluator signature from promptimize.evals (expects a string):
def all_words(response: str, words: List[str], case_sensitive: bool = False) -> int: ...

# README usage: the lambda passes a string through to the evaluator:
PromptCase("hello there!", lambda x: evals.any_word(x, ["hi", "hello"])),