Are we able to feed HELM a table of LLM input/output instead of connecting to a model?

mzahorec commented 3 months ago

Hello, I have spreadsheets of input/output from language models. Is there any relatively straightforward way to run HELM metrics on data in this format? (That is, instead of connecting HELM to a language model and grabbing the output via API.) I was not able to find any mentions of this sort of task in any of the documentation or previous issues. Thank you.

yifanmai commented 3 months ago

This use case is currently not officially supported, unfortunately.

You could consider trying one of these approaches, which are not officially supported:

1. Write a Python script that imports your spreadsheet inputs and outputs into the results cache, and then run the full HELM pipeline, which should serve the results from the cache. There is an old script you can refer to that does something similar.
2. Write a Python script that runs a partial HELM pipeline: construct a list of RequestState objects and a ScenarioState from the inputs and outputs in your spreadsheets, and then run the rest of the metric pipeline. You can use the HELM pipeline code as a reference.
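
As a rough illustration of the first step both approaches share (getting spreadsheet rows into Python objects before handing them to HELM), here is a minimal sketch that uses only the standard library. It does not call HELM at all. The `Record` class, the `load_records` and `exact_match` helpers, and the column names `input`, `output`, and `reference` are all hypothetical; the exact-match score is a simplified stand-in for a metric, not HELM's own implementation. Mapping each `Record` onto a real RequestState would still need to follow the HELM pipeline code for the version you have installed.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for the data a HELM RequestState would carry."""
    prompt: str      # text that was sent to the model
    completion: str  # text the model returned
    reference: str   # expected answer, empty if the spreadsheet has none


def load_records(csv_text: str) -> List[Record]:
    """Parse CSV text with assumed columns: input, output, reference."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        Record(
            prompt=row["input"],
            completion=row["output"],
            reference=row.get("reference") or "",
        )
        for row in reader
    ]


def exact_match(records: List[Record]) -> float:
    """Fraction of completions equal to their reference (illustrative only)."""
    if not records:
        return 0.0
    hits = sum(r.completion.strip() == r.reference.strip() for r in records)
    return hits / len(records)


# Tiny in-memory example; a real script would read the exported spreadsheet file.
SAMPLE_CSV = """input,output,reference
What is 2+2?,4,4
Capital of France?,Paris,Paris
Largest planet?,Saturn,Jupiter
"""

records = load_records(SAMPLE_CSV)
score = exact_match(records)
print(f"{len(records)} records, exact match = {score:.3f}")
```

Keeping the rows in a small intermediate structure like this means the same loader can feed either approach: writing entries into the results cache, or building the request states directly.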

mzahorec commented 3 months ago

I will try out these approaches. Thank you for your help!! @yifanmai

stanford-crfm / helm

Are we able to feed HELM a table of LLM input/output instead of connecting to a model? #2779