getappmap/navie-benchmark
Navie benchmarks
MIT License
issues
#98 feat: Document architecture (kgilpin, opened 1 week ago, 0 comments)
#97 Fix test environments (dividedmind, closed 1 week ago, 0 comments)
#96 fix: Whitespace adjuster (kgilpin, closed 2 weeks ago, 0 comments)
#95 fix: Retry file choosing (dividedmind, closed 1 week ago, 0 comments)
#94 Claude produces mangled change output (kgilpin, opened 3 weeks ago, 1 comment)
#93 Rebase our benchmark to the current main of Princeton NLP (kgilpin, opened 3 weeks ago, 1 comment)
#92 fix: Malformed test error (kgilpin, closed 3 weeks ago, 0 comments)
#91 fix: Strip backticks from around listed files (kgilpin, closed 3 weeks ago, 0 comments)
#90 fix: Use more robust test file/dir patterns (kgilpin, closed 3 weeks ago, 0 comments)
#89 include and exclude patterns should be more specific (kgilpin, closed 3 weeks ago, 1 comment)
#88 Test error in prompt is malformed (kgilpin, closed 3 weeks ago, 1 comment)
#87 Increase temperature on retries of test file and code file searches (kgilpin, opened 3 weeks ago, 4 comments)
#86 Strip fences from test and code list output (kgilpin, closed 3 weeks ago, 2 comments)
#85 Sonnet output mixes up path prefixes (kgilpin, opened 3 weeks ago, 2 comments)
#84 Create a concise report of 0-scoring instances (kgilpin, opened 3 weeks ago, 1 comment)
#83 LLM output mis-indents the first line of the patch (kgilpin, closed 3 weeks ago, 2 comments)
#82 Ordinal not in range (kgilpin, opened 3 weeks ago, 1 comment)
#81 Recommended code patch file is missed / not understood (kgilpin, closed 3 weeks ago, 3 comments)
#80 Report tests that fail or error during initial enumeration (kgilpin, closed 3 weeks ago, 4 comments)
#79 Unrecognized arguments: --observe_synthetic_tests (kgilpin, closed 3 weeks ago, 1 comment)
#78 fix: Test file name may be a directory (kgilpin, closed 3 weeks ago, 0 comments)
#77 fix: Extract file names from mixed content (kgilpin, closed 3 weeks ago, 0 comments)
#76 Handle file name suggestions in mixed content (kgilpin, closed 3 weeks ago, 1 comment)
#75 Retry lint repair (dividedmind, closed 3 weeks ago, 0 comments)
#74 feat: Upgrade search (kgilpin, closed 3 weeks ago, 0 comments)
#73 feat: claude-3-5-sonnet-20241022 (kgilpin, closed 3 weeks ago, 0 comments)
#72 Gemini sometimes generates straight diff (dividedmind, opened 1 month ago, 4 comments)
#71 Solver errors if the generated test name is a directory (kgilpin, closed 3 weeks ago, 1 comment)
#70 Gemini emits backticks instead of the end of a CDATA section (kgilpin, closed 1 month ago, 1 comment)
#69 feat: Gemini (kgilpin, closed 3 weeks ago, 0 comments)
#68 Gemini outputs an HTML comment within Python block (kgilpin, closed 3 weeks ago, 1 comment)
#67 chore: Update appmap-js to `main` (e5ced70) (dustinbyrne, closed 1 month ago, 1 comment)
#66 data: Phase IV (kgilpin, closed 2 months ago, 0 comments)
#65 solve_loop can stop using import_test_data (kgilpin, opened 2 months ago, 3 comments)
#64 feat: Report recursive (kgilpin, closed 2 months ago, 0 comments)
#63 Test errors in `generate_code` are malformed (dustinbyrne, opened 2 months ago, 2 comments)
#62 feat: Official workflow for a full solver (kgilpin, closed 2 months ago, 0 comments)
#61 feat: naive-editor Trim single backticks from single-line content (kgilpin, closed 2 months ago, 0 comments)
#60 Set APPMAP_NAVIE_MINI_MODEL as appropriate (kgilpin, opened 2 months ago, 1 comment)
#59 feat: Update appmap-js to sonnet fixes (kgilpin, closed 1 month ago, 2 comments)
#58 Propagate test failure messages from previously generated test case attempts to the code generator (kgilpin, opened 2 months ago, 1 comment)
#57 Accept TestStatus.ERROR for inverted test case if the marker error message is observed (kgilpin, opened 2 months ago, 2 comments)
#56 feat: o1 (kgilpin, closed 1 month ago, 0 comments)
#55 fix: appmap-js fix/retry-claude-overload (kgilpin, closed 2 months ago, 0 comments)
#54 compare: gpt-4o (kgilpin, closed 2 months ago, 0 comments)
#53 Connection error (kgilpin, closed 1 month ago, 4 comments)
#52 nomerge: LLM compare (kgilpin, opened 2 months ago, 0 comments)
#51 ci: Update solve.yml (kgilpin, closed 2 months ago, 0 comments)
#50 Write a script to prepare our submission (kgilpin, opened 2 months ago, 1 comment)
#49 feat: Assertion hints (kgilpin, closed 2 months ago, 1 comment)