princeton-nlp/SWE-bench
[ICLR 2024] SWE-Bench: Can Language Models Resolve Real-world Github Issues?
https://www.swebench.com
MIT License
1.44k stars
238 forks
issues
Newest
Handle failures because of None/empty patches
#169
klieret
opened
5 hours ago
1
Update reporting and skip empty model patch predictions
#168
carlosejimenez
closed
8 hours ago
1
Failing benchmark instances
#167
aorwall
opened
9 hours ago
2
Fix newline outputs for django's log parser
#166
xingyaoww
closed
8 hours ago
1
Passed test case count as failure?
#165
xingyaoww
closed
8 hours ago
0
Fix so it doesn't crash when no env imgs to build
#164
JunShern
closed
12 hours ago
1
Missing `validation.ipynb`?
#163
xingyaoww
closed
6 hours ago
1
Fix evaluation hanging issue and improve patch apply
#162
xingyaoww
closed
1 day ago
2
Fix path to image in docs
#161
klieret
closed
1 day ago
0
`exec_run_with_timeout` does not actually kill long-running thread
#160
klieret
opened
3 days ago
1
Add timeout for overall execution of instance
#159
klieret
closed
3 days ago
1
swe-bench can get badly stuck in `future.result()`
#158
klieret
closed
1 day ago
2
docker evaluation gets stuck
#157
crhf
opened
3 days ago
3
Which Python version to use?
#156
anupamme
opened
4 days ago
2
Fix link to collection tutorial
#155
klieret
closed
1 day ago
0
It seems that current evaluation does not handle the apply failure case?
#154
Hodge931
opened
4 days ago
1
Various nitpicks
#153
klieret
closed
1 day ago
1
Add very simple CI
#152
klieret
closed
7 hours ago
2
Fix: Support JSON datasets (avoid loading json twice)
#151
klieret
closed
1 day ago
0
Cannot load dataset from JSON file
#150
klieret
closed
1 day ago
0
Interface fix: run_id is required
#149
klieret
closed
1 day ago
0
Add a `schema_version: 2` field to evaluation output files
#148
klieret
opened
4 days ago
2
Missing metric/report.py
#147
donggrame
closed
11 hours ago
2
inference part of project installs in plain packages list
#146
AnikinNN
opened
4 days ago
0
Get error "error: corrupt patch at line 40" when using the gold patch of "django__django-15202"
#145
BoxiYu
opened
4 days ago
3
Running into errors during evaluation
#144
ivan4722
opened
5 days ago
5
Can't test installation in setup, I get error
#143
ivan4722
closed
5 days ago
2
Containerize SWE-bench evaluation
#142
carlosejimenez
closed
5 days ago
1
Problem with conda install gxx_linux-64 gcc_linux-64 make -y on OSX
#141
ivan4722
closed
5 days ago
0
Test cases
#140
Hodge931
closed
1 week ago
1
Skipped test cases
#139
Hodge931
closed
1 week ago
1
Distinguish Between Verified and Unverified Solutions
#138
thisdotmatt
closed
1 week ago
3
Problem in sympy__sympy-13773
#137
Hodge931
closed
5 days ago
2
Reproducing the tests using run_evaluation.py
#136
nasr020
closed
5 days ago
1
Where can I find `swe-bench.json`?
#135
yorhaha
closed
2 weeks ago
2
Add pytest to pydicom requirements
#134
Danila89
closed
2 weeks ago
1
When and how should `hints_text` be used?
#133
atinylittleshell
closed
2 weeks ago
1
Inference and evaluate on SWE-Bench faster by reusing previous built env
#132
Yuzz1020
closed
3 weeks ago
1
Consider the presence of downstream information in `problem_statement`
#131
dustinbyrne
closed
3 weeks ago
0
Update astropy pre_install for only 4.0> versions
#130
carlosejimenez
closed
4 weeks ago
0
Fix astropy installation after setuptools updated to 70.0.0
#129
carlosejimenez
closed
1 month ago
0
Clarification Needed on Removal of Instances with Error Message Checks in SWE-bench Lite Dataset
#128
ramsey-coding
closed
2 weeks ago
2
Install failed on instances from astropy__astropy
#127
JiyangZhang
closed
1 month ago
3
`model_name_or_path` is None when running models without adapters, causing an error in `run_evaluation.py`
#126
rucnyz
closed
2 weeks ago
2
what's the difference between environment_setup_commit and base_commit?
#125
ramsey-coding
closed
1 month ago
6
how to download one task instance from SWE-bench dataset?
#124
ramsey-coding
closed
1 month ago
1
What's the best way to browse the SWE-bench dataset?
#123
ramsey-coding
closed
1 month ago
2
Add error handling for repo cloning
#122
ALiersEL
closed
1 month ago
1
How can one participate in the SWE-bench leaderboard?
#121
yakami129
closed
1 month ago
3
Using `uv pip` instead of `pip` for significant speedup
#120
klieret
opened
1 month ago
1