UKGovernmentBEIS / inspect_ai

Inspect: A framework for large language model evaluations
https://UKGovernmentBEIS.github.io/inspect_ai/
MIT License

First party support for Inspect on Langtrace #45

Closed karthikscale3 closed 2 weeks ago

karthikscale3 commented 3 weeks ago

Hi all,

It's me again - core maintainer of Langtrace (https://github.com/Scale3-Labs/langtrace). I wanted to share that we have added first-party support for Inspect on Langtrace. What does this mean?

  1. Langtrace users can now run evals using Inspect against their traced and curated datasets from Langtrace.
  2. The eval results can be reported back to Langtrace for storage and viewing. We have built some nice reporting features, such as comparison mode, and have a few more things in the hopper.

How did we do this? As suggested in https://github.com/UKGovernmentBEIS/inspect_ai/issues/39#issuecomment-2156812620, we implemented an fsspec filesystem that takes care of reading datasets from, and writing logs to, our backend.
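
For readers unfamiliar with the approach, here is a minimal sketch of what a custom fsspec filesystem can look like. It is not Langtrace's actual code: the class names, the `demobackend` protocol, and the in-memory dict standing in for a remote backend are all hypothetical, and a real implementation would make network calls where the comments indicate.

```python
import io

import fsspec
from fsspec.spec import AbstractFileSystem


class _StoreOnClose(io.BytesIO):
    """Write buffer that hands its bytes to the backing store when closed."""

    def __init__(self, blobs, path):
        super().__init__()
        self._blobs = blobs
        self._path = path

    def close(self):
        if not self.closed:
            # A real backend would upload the bytes here (e.g. an HTTP call).
            self._blobs[self._path] = self.getvalue()
        super().close()


class DemoBackendFS(AbstractFileSystem):
    """Hypothetical filesystem; a dict stands in for the remote backend."""

    protocol = "demobackend"
    _blobs = {}  # path -> bytes, shared by all instances

    def _open(self, path, mode="rb", **kwargs):
        path = self._strip_protocol(path)
        if mode == "rb":
            if path not in self._blobs:
                raise FileNotFoundError(path)
            return io.BytesIO(self._blobs[path])
        if mode == "wb":
            return _StoreOnClose(self._blobs, path)
        raise NotImplementedError(f"mode {mode!r} is not supported in this sketch")

    def info(self, path, **kwargs):
        path = self._strip_protocol(path)
        if path not in self._blobs:
            raise FileNotFoundError(path)
        return {"name": path, "size": len(self._blobs[path]), "type": "file"}

    def ls(self, path, detail=True, **kwargs):
        # Flat listing of every stored file under the prefix (no directory entries).
        prefix = self._strip_protocol(path)
        entries = [
            self.info(name)
            for name in sorted(self._blobs)
            if not prefix or name == prefix or name.startswith(prefix + "/")
        ]
        return entries if detail else [entry["name"] for entry in entries]


# Make "demobackend://" URLs resolvable through fsspec.
fsspec.register_implementation("demobackend", DemoBackendFS, clobber=True)

fs = fsspec.filesystem("demobackend")
with fs.open("demobackend://logs/run.json", "wb") as f:
    f.write(b'{"status": "success"}')

with fs.open("demobackend://logs/run.json", "rb") as f:
    print(f.read())
```

Once a protocol is registered this way, any tool that resolves paths through fsspec can read and write `demobackend://` URLs without knowing what sits behind them, which is the property the integration relies on.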

You can find our docs here:

Attaching some screenshots: [two images, not reproduced here]

Thanks again for building this tool and also giving us a path forward for integrating it. We would love to hear your feedback/thoughts.

jjallaire commented 2 weeks ago

That's fantastic! Thanks so much for sharing.

One thing you might want to add to your docs is using Inspect's support for .env files to set both LANGTRACE_API_KEY and INSPECT_LOG_DIR (then users don't need to specify them explicitly in the CLI for either eval or view). More docs on this here: https://ukgovernmentbeis.github.io/inspect_ai/workflow.html#sec-workflow-configuration
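
As a rough illustration of the idea, the sketch below shows what such a .env file might contain and how simple KEY=VALUE lines map to environment variables. Inspect does this loading itself, so the parser is only a stand-in for illustration. Both values are placeholders, and the `langtracefs://` URL form is an assumption rather than something confirmed in this thread.

```python
import os

# Hypothetical .env contents; both values are placeholders to fill in.
ENV_FILE_TEXT = """\
# Langtrace credentials and log location
LANGTRACE_API_KEY=<your-api-key>
INSPECT_LOG_DIR=langtracefs://<dataset-id>
"""


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


settings = parse_env(ENV_FILE_TEXT)

# In this sketch, variables already set in the environment take precedence.
for key, value in settings.items():
    os.environ.setdefault(key, value)

print(sorted(settings))
```

With both variables defined in one place, the eval and view commands can be run without repeating the key or the log directory on the command line.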

karthikscale3 commented 2 weeks ago

Thanks @jjallaire. That's great feedback. I will update our docs accordingly.

aisi-inspect commented 2 weeks ago

Thanks again for letting us know about this. I'll close the issue, as I think you've gotten the feedback we had for now.