Error in experiment.evaluate() in introductory example OpenAIChatExperiment.ipynb

hegelai / prompttools

Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chroma, Weaviate, LanceDB).

http://prompttools.readthedocs.io

Apache License 2.0

2.56k stars, 216 forks

Error in experiment.evaluate() in introductory example OpenAIChatExperiment.ipynb #76

Closed (davidtan-tw closed 11 months ago)

davidtan-tw commented 11 months ago

Hi folks, thanks for creating this tool.

I'm trying out prompttools and was following the introductory example (OpenAIChatExperiment.ipynb) listed on the quickstart page when I encountered this error. I can reproduce it both locally and on the provided Colab notebook.

🐛 Describe the bug

This is the line that raises an error:

experiment.evaluate("similar_to_expected", similarity.evaluate, expected="George Washington")

And this is the error: TypeError: evaluate() missing 2 required positional arguments: 'response' and 'metadata'
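The message suggests that the function passed to experiment.evaluate() was called with fewer positional arguments than its signature requires. Below is a minimal, self-contained sketch of that failure mode. Both functions are hypothetical stand-ins, not the actual prompttools code, and the assumption that the harness passes each row as a single positional argument is inferred only from the error text.

```python
def evaluate(prompt, response, metadata, expected=None):
    """Hypothetical metric with a three-positional-argument signature."""
    return 1.0 if response == expected else 0.0


def run_metric(eval_fn, row, **kwargs):
    """Hypothetical harness (an assumption): passes each row as ONE positional argument."""
    return eval_fn(row, **kwargs)


def reproduce_error():
    """Return the TypeError message produced by the signature mismatch."""
    try:
        run_metric(
            evaluate,
            {"response": "George Washington"},
            expected="George Washington",
        )
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(reproduce_error())
```

In this sketch, only `prompt` is filled positionally and `expected` arrives as a keyword, so Python reports `response` and `metadata` as missing, which matches the shape of the error above.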

davidtan-tw commented 11 months ago

I managed to fix the error by importing the right function (semantic_similarity) from the right package (prompttools.utils).

Would it be right to say that the OpenAIChatExperiment.ipynb demo was importing the wrong function (prompttools.utils.similarity.evaluate)?

davidtan-tw commented 11 months ago

And as an aside, why does the semantic similarity between "George Washington" and "George Washinton" range from 0.14 to 0.35? I would have expected something like 0.99 or even 1.0.

davidtan-tw commented 11 months ago

Alright, so it turns out that was because I should have passed in a list of expected results (expected=["George Washington"] * 4) instead of a string.
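One plausible reason a plain string fails silently, rather than raising, is that Python strings are indexable just like lists. The sketch below assumes the expected values are looked up per row by index; that is not verified against the prompttools source, and `expected_for_row` is a hypothetical helper used only for illustration.

```python
def expected_for_row(expected, row_index):
    """Hypothetical per-row lookup (an assumption, not the prompttools code)."""
    return expected[row_index]


as_string = "George Washington"
as_list = ["George Washington"] * 4

# Indexing a str yields single characters; indexing a list yields full strings.
string_lookups = [expected_for_row(as_string, i) for i in range(4)]
list_lookups = [expected_for_row(as_list, i) for i in range(4)]

if __name__ == "__main__":
    print(string_lookups)  # ['G', 'e', 'o', 'r']
    print(list_lookups)
```

Under that assumption, each response would be compared against a single letter, which would be consistent with low but non-zero similarity scores.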

I think it would be better if the function threw an error due to type mismatch, telling the user that it was expecting List[str] instead of str, rather than failing silently with random semantic similarity scores. What do you think?
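A minimal sketch of the kind of up-front check proposed here. `validate_expected` is a hypothetical helper, not part of prompttools, and the length check against the number of rows is an extra assumption about what callers would want.

```python
from typing import List


def validate_expected(expected, num_rows: int) -> List[str]:
    """Hypothetical guard: fail loudly unless `expected` is a list of str, one per row."""
    # A bare str is indexable, so reject it explicitly before anything else.
    if isinstance(expected, str):
        raise TypeError(
            "expected must be List[str], got str; "
            "wrap it in a list, e.g. [expected] * num_rows"
        )
    if not isinstance(expected, (list, tuple)):
        raise TypeError(
            f"expected must be List[str], got {type(expected).__name__}"
        )
    if not all(isinstance(item, str) for item in expected):
        raise TypeError("every item in expected must be a str")
    # Assumption: one expected value per result row.
    if len(expected) != num_rows:
        raise ValueError(
            f"expected has {len(expected)} items but there are {num_rows} rows"
        )
    return list(expected)


if __name__ == "__main__":
    print(validate_expected(["George Washington"] * 4, num_rows=4))
    try:
        validate_expected("George Washington", num_rows=4)
    except TypeError as exc:
        print(f"TypeError: {exc}")
```

Checking for `str` first matters because a string would otherwise pass any generic "is it a sequence" test.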

(As an aside, the reference code in the repo (https://github.com/hegelai/prompttools/blob/main/examples/notebooks/OpenAIChatExperiment.ipynb) works as expected, but the Colab notebook is outdated and will throw an error.)

steventkrawczyk commented 11 months ago

Hey David, thanks for trying prompttools and mentioning these issues. I believe everything's up to date for that example on the HEAD of main:

https://github.com/hegelai/prompttools/blob/main/examples/notebooks/OpenAIChatExperiment.ipynb

We'll make the error handling better.