UKGovernmentBEIS / inspect_ai

Inspect: A framework for large language model evaluations
https://UKGovernmentBEIS.github.io/inspect_ai/
MIT License
385 stars, 41 forks

Task metadata #50

Closed (tekumara closed 1 week ago)

tekumara commented 2 weeks ago

It would be nice to be able to store arbitrary task metadata and have it surfaced somewhere, for example under the task information: [screenshot: CleanShot 2024-06-17 at 22 19 53@2x]

In our case we'd store the version numbers of different packages, who ran the eval, and a hostname.
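
For illustration, here is a rough sketch of how those values could be gathered with only the Python standard library. The function name `collect_run_metadata` and the dictionary keys are made up for this example, and nothing here is Inspect API; how the resulting dict would be attached to a task is exactly the open question.

```python
import getpass
import socket
from importlib import metadata


def collect_run_metadata(packages):
    """Gather package versions, the current user, and the hostname.

    Hypothetical helper: the name and the returned keys are assumptions
    for illustration, not part of inspect_ai.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages explicitly instead of failing.
            versions[name] = None

    try:
        run_by = getpass.getuser()
    except Exception:
        # getuser() can fail in minimal environments (e.g. no passwd entry).
        run_by = None

    return {
        "package_versions": versions,
        "run_by": run_by,
        "hostname": socket.gethostname(),
    }


if __name__ == "__main__":
    print(collect_run_metadata(["inspect_ai", "openai"]))
```

The result is a plain dict of simple values, so it could be serialised alongside an eval log however the metadata ends up being stored.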

aisi-inspect commented 2 weeks ago

Interestingly, we have a metadata field in the EvalSpec (which is what this is drawing from) but no way to populate it! I wonder if we should add a metadata field to Task and forward it here? (@dragonstyle LMK what you think)
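
A rough sketch of the forwarding idea, using stand-in dataclasses rather than the real Inspect classes. The `Task` and `EvalSpec` below are hypothetical simplifications with only the fields needed for the example, and `build_eval_spec` is an invented helper, so treat this as an illustration of the shape of the change and not the actual implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Task:
    # Hypothetical stand-in for the real Task; only the fields needed here.
    name: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class EvalSpec:
    # Hypothetical stand-in for the real EvalSpec.
    task: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def build_eval_spec(task: Task) -> EvalSpec:
    """Forward task-level metadata into the spec that gets logged."""
    # Shallow-copy so later changes to the task do not alter the logged spec.
    return EvalSpec(task=task.name, metadata=dict(task.metadata))


if __name__ == "__main__":
    task = Task(name="demo", metadata={"hostname": "eval-box-01"})
    print(build_eval_spec(task))
```

Copying the dict when forwarding keeps the logged spec stable even if the task's metadata is modified after the spec is built.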

tekumara commented 2 weeks ago

If what were displayed here were the task args, that would be swell and would probably work well! They are passed to the solver and define what runs, so they are worth surfacing in and of themselves. We could also use them to capture additional metadata.

jjallaire commented 2 weeks ago

@dragonstyle Don't we at least try to display the task_args? If we don't, then we absolutely should (even if they are also displayed in config above).

dragonstyle commented 2 weeks ago

We are displaying task_args on the 'Info' tab (we render this metadata, so it should render simple values, but also things like dictionaries or arrays). Are you not seeing a 'Task Args' section (if you aren't, I should try to reproduce and figure out why!)?

tekumara commented 2 weeks ago

Yeah, unfortunately I'm not seeing a Task Args section at all on the Info tab.

dragonstyle commented 1 week ago

Sorry to have let this languish! That section definitely should be present...

Thanks and sorry for the issue...

tekumara commented 1 week ago

Huh, I can't seem to reproduce this anymore. Happy to close this, sorry for the bother!

dragonstyle commented 1 week ago

Ok thanks for the followup and glad it appears to be working again!