Open enyst opened 4 days ago
Trigger by: Manual Trigger: test without PR
Commit: 6ddae01acb9cee32527f2579eb24d05742d76254

Integration Tests Report (Haiku)

Haiku LLM Test Results: Success rate: 100.00% (6/6)

| instance_id | success | reason |
|---|---|---|
| t01_fix_simple_typo | True | |
| t02_add_bash_hello | True | |
| t03_jupyter_write_file | True | |
| t05_simple_browsing | True | |
| t04_git_staging | True | |
| t06_github_pr_browsing | True | |

Integration Tests Report (DeepSeek)

DeepSeek LLM Test Results: Success rate: 66.67% (4/6)

| instance_id | success | reason |
|---|---|---|
| t05_simple_browsing | True | |
| t03_jupyter_write_file | True | |
| t02_add_bash_hello | True | |
| t01_fix_simple_typo | True | |
| t06_github_pr_browsing | False | The answer is not found in any message. Total messages: 4. |
| t04_git_staging | False | Failed to check for "nothing to commit, working tree clean": On branch initial-commit<br>Untracked files:<br>(use "git add<br>push.log<br>nothing added to commit but untracked files present (use "git add" to track). |

Download evaluation outputs (includes both Haiku and DeepSeek results): Download
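
The `t04_git_staging` failure above reports that the expected string "nothing to commit, working tree clean" was missing from the `git status` output because an untracked `push.log` was present. As a rough illustration only, a check of that kind could look like the sketch below. The function name `check_clean` and the sample outputs are hypothetical and are not taken from the actual test suite; the dirty sample is abridged from the report.

```python
# Hypothetical sketch of a "working tree clean" check; not the real test code.
EXPECTED = "nothing to commit, working tree clean"


def check_clean(status_output: str) -> tuple[bool, str]:
    """Return (success, reason) for a captured `git status` output."""
    if EXPECTED in status_output:
        return True, ""
    # Mirror the shape of the failure reason shown in the report.
    return False, f'Failed to check for "{EXPECTED}": {status_output.strip()}'


# Abridged sample resembling the failing DeepSeek run (assumed, for illustration).
dirty = (
    "On branch initial-commit\n"
    "Untracked files:\n"
    "\tpush.log\n"
    'nothing added to commit but untracked files present (use "git add" to track)\n'
)
clean = "On branch initial-commit\nnothing to commit, working tree clean\n"

print(check_clean(clean))
print(check_clean(dirty)[0])
```

Under this kind of exact-substring check, any leftover untracked file such as `push.log` is enough to fail the test, even if the staging and commit steps themselves succeeded.
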
Trigger by: Nightly Scheduled Run
Commit: 6ddae01acb9cee32527f2579eb24d05742d76254

Integration Tests Report (Haiku)

Haiku LLM Test Results: Success rate: 100.00% (6/6)

| instance_id | success | reason |
|---|---|---|
| t03_jupyter_write_file | True | |
| t02_add_bash_hello | True | |
| t01_fix_simple_typo | True | |
| t05_simple_browsing | True | |
| t06_github_pr_browsing | True | |
| t04_git_staging | True | |

Integration Tests Report (DeepSeek)

DeepSeek LLM Test Results: Success rate: 83.33% (5/6)

| instance_id | success | reason |
|---|---|---|
| t03_jupyter_write_file | True | |
| t05_simple_browsing | True | |
| t04_git_staging | True | |
| t02_add_bash_hello | True | |
| t01_fix_simple_typo | True | |
| t06_github_pr_browsing | False | The answer is not found in any message. Total messages: 4. |

Download evaluation outputs (includes both Haiku and DeepSeek results): Download
Trigger by: Nightly Scheduled Run
Commit: 71fc115c344abbd38d7fe21a2a5c8119cfb3a1c6

Integration Tests Report (Haiku)

Haiku LLM Test Results: Success rate: 100.00% (6/6)

| instance_id | success | reason |
|---|---|---|
| t02_add_bash_hello | True | |
| t03_jupyter_write_file | True | |
| t01_fix_simple_typo | True | |
| t05_simple_browsing | True | |
| t06_github_pr_browsing | True | |
| t04_git_staging | True | |

Integration Tests Report (DeepSeek)

DeepSeek LLM Test Results: Success rate: 83.33% (5/6)

| instance_id | success | reason |
|---|---|---|
| t05_simple_browsing | True | |
| t03_jupyter_write_file | True | |
| t04_git_staging | True | |
| t02_add_bash_hello | True | |
| t01_fix_simple_typo | True | |
| t06_github_pr_browsing | False | The answer is not found in any message. Total messages: 4. |

Download evaluation outputs (includes both Haiku and DeepSeek results): Download
Trigger by: Nightly Scheduled Run
Commit: 91135d7e76168506545861c2b0b7faae41e25e3f

Integration Tests Report (Haiku)

Haiku LLM Test Results: Success rate: 100.00% (6/6)

| instance_id | success | reason |
|---|---|---|
| t03_jupyter_write_file | True | |
| t02_add_bash_hello | True | |
| t01_fix_simple_typo | True | |
| t05_simple_browsing | True | |
| t04_git_staging | True | |
| t06_github_pr_browsing | True | |

Integration Tests Report (DeepSeek)

DeepSeek LLM Test Results: Success rate: 83.33% (5/6)

| instance_id | success | reason |
|---|---|---|
| t03_jupyter_write_file | True | |
| t05_simple_browsing | True | |
| t04_git_staging | True | |
| t02_add_bash_hello | True | |
| t01_fix_simple_typo | True | |
| t06_github_pr_browsing | False | The answer is not found in any message. Total messages: 4. |

Download evaluation outputs (includes both Haiku and DeepSeek results): Download
Summary
This is a placeholder issue for nightly integration test results.
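
The "Success rate" line in each report follows directly from the per-test `success` column. The sketch below shows the arithmetic for the manually triggered DeepSeek run (4 of 6 passing). The helper name `format_success_rate` and the dictionary layout are assumptions made for illustration and do not come from the reporting tool itself.

```python
# Hypothetical helper that reproduces the "Success rate" summary line.
def format_success_rate(results: dict[str, bool]) -> str:
    """Summarize pass/fail results as 'Success rate: NN.NN% (passed/total)'."""
    total = len(results)
    passed = sum(1 for ok in results.values() if ok)
    # Guard against an empty result set to avoid dividing by zero.
    rate = 100.0 * passed / total if total else 0.0
    return f"Success rate: {rate:.2f}% ({passed}/{total})"


# Per-test outcomes copied from the manually triggered DeepSeek report.
deepseek_manual = {
    "t05_simple_browsing": True,
    "t03_jupyter_write_file": True,
    "t02_add_bash_hello": True,
    "t01_fix_simple_typo": True,
    "t06_github_pr_browsing": False,
    "t04_git_staging": False,
}

print(format_success_rate(deepseek_manual))  # Success rate: 66.67% (4/6)
```

The same helper gives 83.33% (5/6) for the nightly DeepSeek runs, where only `t06_github_pr_browsing` failed.
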