Closed — HashemAlsaket closed this 1 year ago
Added starter code for getting the best response from a set of LLMs and parameters. Example:
```python
from prompttools.experiment.llms.compare_responses import MaxLLM

hf_model_repo_ids = [
    "google/flan-t5-xxl",
    "databricks/dolly-v2-3b",
    "bigscience/bloom",
]
temperatures = [0.01, 1.0]
max_lengths = [17, 32]

LLMs = MaxLLM(
    hf_repo_ids=hf_model_repo_ids,
    temperatures=temperatures,
    max_lengths=max_lengths,
    question="Who was the first president of the USA?",
    expected="George Washington",
)
LLMs.run()
```
Output:

```python
>>> LLMs.best_response()
Response(repo_id='google/flan-t5-xxl', temperature=0.01, max_length=17, score=1.0000001192092896, response='george washington')

>>> LLMs.top_n_responses(n=9)
[Response(repo_id='google/flan-t5-xxl', temperature=0.01, max_length=17, score=1.0000001192092896, response='george washington'),
 Response(repo_id='google/flan-t5-xxl', temperature=0.01, max_length=32, score=1.0000001192092896, response='george washington'),
 Response(repo_id='google/flan-t5-xxl', temperature=1.0, max_length=17, score=1.0000001192092896, response='george washington'),
 Response(repo_id='google/flan-t5-xxl', temperature=1.0, max_length=32, score=1.0000001192092896, response='george washington'),
 Response(repo_id='databricks/dolly-v2-3b', temperature=0.01, max_length=32, score=0.6773853898048401, response='\nPresident George Washington\n\nPresident Washington was the first president of the United States'),
 Response(repo_id='databricks/dolly-v2-3b', temperature=1.0, max_length=32, score=0.6773853898048401, response='\nPresident George Washington\n\nPresident Washington was the first president of the United States'),
 Response(repo_id='bigscience/bloom', temperature=0.01, max_length=17, score=0.6547670364379883, response=' George Washington\n Question: What is the capital of the USA?\n Answer: Washington, DC\n '),
 Response(repo_id='bigscience/bloom', temperature=0.01, max_length=32, score=0.2622990906238556, response=' George Washington\n """\n self.assertEqual(self.parser.parse(question), [(\''),
 Response(repo_id='bigscience/bloom', temperature=1.0, max_length=17, score=0.2622990906238556, response=' George Washington\n """\n self.assertEqual(self.parser.parse(question), [(\'')]
```
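For anyone skimming this thread, here is a minimal sketch of the idea behind `MaxLLM`: expand the parameter grid (repo IDs × temperatures × max lengths), score each response against the expected answer, and rank by score. This is not the prompttools implementation — the `Response` dataclass and `rank_responses` helper below are hypothetical, and `difflib.SequenceMatcher` stands in for whatever similarity metric the real code uses (the scores above look like an embedding-based similarity, given values slightly above 1.0).

```python
# Hypothetical sketch of MaxLLM-style ranking, NOT the actual prompttools code.
import itertools
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Response:
    repo_id: str
    temperature: float
    max_length: int
    score: float
    response: str


def score(expected: str, response: str) -> float:
    # Case-insensitive string similarity in [0, 1]; a stand-in for the
    # real scoring function.
    return SequenceMatcher(None, expected.lower(), response.lower()).ratio()


def rank_responses(outputs, expected, repo_ids, temperatures, max_lengths):
    # `outputs` maps (repo_id, temperature, max_length) -> model output text.
    # Expand the full parameter grid, score each output, sort best-first.
    scored = [
        Response(r, t, m, score(expected, outputs[(r, t, m)]), outputs[(r, t, m)])
        for r, t, m in itertools.product(repo_ids, temperatures, max_lengths)
    ]
    return sorted(scored, key=lambda resp: resp.score, reverse=True)


# Usage with canned outputs in place of real model calls:
outputs = {
    ("google/flan-t5-xxl", 0.01, 17): "george washington",
    ("bigscience/bloom", 0.01, 17): " George Washington\n Question: What is the capital of the USA?",
}
ranked = rank_responses(
    outputs, "George Washington",
    ["google/flan-t5-xxl", "bigscience/bloom"], [0.01], [17],
)
```

With this stand-in metric, the exact-match `flan-t5-xxl` output scores 1.0 and ranks first, mirroring the ordering in the real output above.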
@steventkrawczyk Ready for review. Needed a few iterations to get acclimated to the code. Good now.
Output looks good, too.
Looks great! Would you be able to fill out the CLA? Then I can merge this change in later today 🚀
Sounds good. I think I can use a similar template to include anthropic, azure, etc. I'll try to write up some issues for them today.
Starting as a draft to get an idea of how #10 should mature. I started with unit-test-style testing.
@steventkrawczyk @NivekT