UKGovernmentBEIS/inspect_ai
Inspect: A framework for large language model evaluations
https://UKGovernmentBEIS.github.io/inspect_ai/
MIT License
379 stars
41 forks
issues
Required Docker Version
#69
kaifronsdal
opened
1 day ago
0
[Question] Display dataset record(Sample) metadata in the viewer main page
#68
us2547
opened
1 day ago
0
[Snyk] Security upgrade anyio from 3.7.1 to 4.4.0
#67
lorenzoAtBEIS
closed
1 day ago
1
Incorrect targets in theory of mind dataset
#66
jrhender
closed
1 day ago
3
docs for `Sample`'s `target` parameter not consistent with code
#65
Lovkush-A
closed
1 day ago
2
include 'MemoryDataset' in the docs
#64
Lovkush-A
closed
1 day ago
2
CoT isn't saved in logs for multiple choice questions
#63
js-d
opened
4 days ago
0
Fix Markdown warnings
#62
bact
closed
2 days ago
2
Remove extra newlines / Sort requirements.txt
#61
bact
closed
2 days ago
0
Creating too many docker containers when running intercode-ctf example throws error
#60
kaifronsdal
closed
1 day ago
7
[Snyk] Security upgrade anyio from 3.7.1 to 4.4.0
#59
lorenzoAtBEIS
closed
2 days ago
0
fix(HuggingFaceAPI): send inputs to model device
#58
connorsmith256
closed
1 week ago
0
Typo in _generate_config.py
#57
Xodarap
closed
1 week ago
2
Tool environment link is broken
#56
Xodarap
closed
1 week ago
2
Add optional tool param "properties" to adhere to OpenAI spec
#55
bjoernpl
closed
1 week ago
3
Fix spacing in eval tool environment example
#54
tbroadley
closed
1 week ago
0
Support retrieval as a first-class element
#53
tekumara
opened
2 weeks ago
1
[Bug] Inspect view errors out when port forwarding to laptop's web browser
#52
alex-treebeard
closed
1 week ago
3
[Bug] Eval fails with Permission denied: '/run/user' when running from within a devcontainer
#51
alex-treebeard
opened
2 weeks ago
3
Task metadata
#50
tekumara
closed
4 days ago
8
HTML tags stripped out in the sample viewer
#49
Suhas-13
closed
2 weeks ago
2
Support for evals within a module
#48
adrianlyjak
opened
2 weeks ago
7
Non-scrollable content when only one test case
#47
adrianlyjak
closed
2 weeks ago
2
Start example
#46
canyon289
closed
1 week ago
1
First party support for Inspect on Langtrace
#45
karthikscale3
closed
2 weeks ago
3
log viewer: format target using markdown
#44
vsiva
closed
2 weeks ago
3
Better error messages
#43
lukasberglund
closed
1 day ago
5
Error when getting logprobs for EOF token from OpenAI API
#42
lukasberglund
closed
2 weeks ago
2
no attribute '__mutable_keys__' on python 3.11 when running the CLI
#41
adrianlyjak
closed
3 weeks ago
1
Would more sophisticated request scheduling mechanisms be valuable?
#40
schmatz
opened
3 weeks ago
7
Is there a way to send eval logs to an API endpoint?
#39
karthikscale3
closed
3 weeks ago
5
Inspect view is missing column headers
#38
tekumara
closed
2 weeks ago
3
Inspect view does not show input
#37
tekumara
closed
3 weeks ago
2
inspect eval with task name - IndexError: list index out of range
#36
tekumara
closed
3 weeks ago
2
Support bedrock/meta.llama3-70b-instruct-v1
#35
tekumara
closed
3 weeks ago
1
Is there any way to run this with model from Azure OpenAI
#34
Sandy8103
closed
3 weeks ago
1
Is there a recommended way to run a plan outside an eval?
#33
voberoi
closed
3 weeks ago
3
Add skip marker for failing test
#32
canyon289
closed
4 weeks ago
3
remove test package
#31
aisi-inspect
closed
1 month ago
0
explicitly opt in to include_package_data
#30
aisi-inspect
closed
1 month ago
0
try previous version
#29
aisi-inspect
closed
1 month ago
0
troubleshooting missing data file
#28
aisi-inspect
closed
1 month ago
0
Re-scoring without re-running solver
#27
sohaibimran7
closed
1 month ago
1
multi epoch runs do not report per epoch metrics
#26
sohaibimran7
opened
1 month ago
1
toy example for multi-turn dialogues using inspect
#25
Lovkush-A
closed
3 weeks ago
5
Fixed issue #23: Added the @scorer decorator to the multi_scorer function.
#24
sohaibimran7
closed
1 month ago
0
Using multi_scorer raises AttributeError
#23
sohaibimran7
closed
1 month ago
4
understanding `input_text` versus `user_prompt`
#22
Lovkush-A
closed
3 weeks ago
4
minor typos in dataset docs
#21
Lovkush-A
closed
1 month ago
2
Typo in @scorer decorator
#20
novatc
closed
1 month ago
2