saulpw / aipl

Array-Inspired Pipeline Language
MIT License
119 stars, 7 forks

chains/benchmarks, other LLMs #30

Closed (dovinmu closed this 1 year ago)

dovinmu commented 1 year ago

(closing #28 in favor of this)

At minimum before we merge this branch:

for the binary classification script:

These are nice-to-haves; basically they'd let us use the BIG-bench tasks more as intended:

If #27 is resolved, we could compute multiple accuracy metrics. We could also optionally explore adding another script that runs the entire slate of tasks on a set of models.
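For concreteness, here is a rough sketch of what "multiple accuracy metrics" could mean for the binary classification script. It is plain Python rather than AIPL, and the function name `binary_metrics`, the `positive` parameter, and the `"yes"`/`"no"` labels are all assumptions for illustration, not anything in this branch:

```python
from collections import Counter


def binary_metrics(gold, predicted, positive="yes"):
    """Return accuracy, precision, recall and F1 for two parallel label lists.

    Hypothetical helper: `positive` names the label treated as the
    positive class; every other label counts as negative.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    # Tally the four confusion-matrix cells.
    counts = Counter()
    for g, p in zip(gold, predicted):
        if p == positive:
            counts["tp" if g == positive else "fp"] += 1
        else:
            counts["fn" if g == positive else "tn"] += 1

    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    total = tp + fp + fn + tn

    # Guard every division so empty or one-sided inputs yield 0.0.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    gold = ["yes", "no", "yes", "no"]
    predicted = ["yes", "yes", "no", "no"]
    print(binary_metrics(gold, predicted))
```

Returning all four numbers from one pass keeps the script's output shape stable if more metrics get added later.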

dovinmu commented 1 year ago

I ended up writing a few more ops for this PR, some of which seem obvious (!csv-parse) and some of which I expect will change (the !metrics- ones). I'm especially interested in getting the changes to llm.py merged so we have a way of calling LLMs from different sources.
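To illustrate the general idea of calling LLMs from different sources behind one entry point, here is a minimal registry sketch. It is not the actual llm.py code: `register_backend`, `complete`, the `source:model` spec format, and the two stub backends are all hypothetical, and the stubs return canned strings instead of calling any real service:

```python
from typing import Callable, Dict

# A backend takes (model_name, prompt) and returns the completion text.
Backend = Callable[[str, str], str]

_BACKENDS: Dict[str, Backend] = {}


def register_backend(source: str):
    """Decorator that files a backend function under a source name."""
    def decorator(fn: Backend) -> Backend:
        _BACKENDS[source] = fn
        return fn
    return decorator


# Stub backends standing in for real providers (assumptions, no network).
@register_backend("echo")
def _echo(model: str, prompt: str) -> str:
    return f"[{model}] {prompt}"


@register_backend("upper")
def _upper(model: str, prompt: str) -> str:
    return prompt.upper()


def complete(model_spec: str, prompt: str) -> str:
    """Dispatch a prompt to the backend named before the first colon."""
    source, sep, model = model_spec.partition(":")
    if not sep:
        raise ValueError("model_spec must look like 'source:model'")
    try:
        backend = _BACKENDS[source]
    except KeyError:
        raise ValueError(f"unknown LLM source: {source!r}") from None
    return backend(model, prompt)


if __name__ == "__main__":
    print(complete("echo:demo-model", "Is this sentence sarcastic?"))
```

With this shape, a benchmark script only passes a spec string per model, and adding a new provider means registering one more function.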