-
A single issue to track progress until the next release and to collaborate on creating the right issues. Feel free to edit this issue or comment on changes.
Scope of Next Release:
- [ ] For each paper cr…
-
Title.
Benchmarks:
Summarization
- [x] G-Eval
- [ ] SummHay - https://arxiv.org/abs/2407.01370v1 & https://github.com/salesforce/summary-of-a-haystack
- https://arxiv.org/html/2403.19889v1
R…
-
(All these tasks will probably require model-specific prompt engineering. Consider evaluating the results, either through external metrics or human validation.)
**Number of examples per dataset: cap at 5…
rbroc updated
2 weeks ago
-
Currently, when searching using `allow_evaluation_errors` and all of the programs error out on evaluation (`interpret` errors out), the return value of `synth` is marked as a `suboptimal_program`. Thi…
-
# Evaluation & Datasets — State of Open Source AI Book
[https://book.premai.io/state-of-open-source-ai/eval-datasets/](https://book.premai.io/state-of-open-source-ai/eval-datasets/)
-
### Start Date
9/3/2024
### Implementation PR
_No response_
### Reference Issues
_No response_
### Summary
When using vLLM to optimally utilize GPU space for faste…
-
### Have you completed your first issue?
- [X] I have completed my first issue
### Guidelines
- [X] I have read the guidelines
- [X] I have the link to my latest merged PR
### Latest Merged PR Lin…
-
**python3 evaluation.py --test_dir testdata/ --model_path ckpt/D2Former_epoch_77_0.055 --save_dir saved_tracks_best**
audio_path: testdata/noisy/p287_001_noisy.wav
Traceback (most recent call last):…
-
# Project Evaluation Checklist
## Minimal Technical Requirements
- [x] Ensure the frontend is developed using pure vanilla JavaScript (unless overridden by a module).
- [x] Make the website a Single…
-