benchmark-dataset Search Results

zxzm-zak/AlignBot #2

Will the benchmark dataset be public?

Hi. I really like your work and am trying to reimplement it using your code. However, I am a little confused about the general pipeline of your code. After building the environment, I should have the LLaVA …

peppersaltyy updated 4 days ago

KevinMusgrave/pytorch-metric-learning #722

Addition of popular benchmark datasets

Hi, I find that it's nice to have a few benchmark datasets integrated into libraries for easier research. My feature request boils down to the implementation of a few image retrieval datasets, name…

ir2718 updated 1 month ago

Arize-ai/phoenix #5266

Benchmark existing models on hallucinations dataset

Jgilhuly updated 2 weeks ago

OpShin/uplc #38

Proposal: Cross-Implementation Benchmarking Dataset for Plut…

I'm working on a [C++ implementation of Plutus](https://github.com/sierkov/daedalus-turbo/tree/main/lib/dt/plutus) aimed at optimizing batch synchronization. We'd like to benchmark our implementation …

sierkov updated 1 week ago

ttsds/ttsds #4

Why benchmark using different evaluation dataset for differe…

Hi, thank you for your amazing work! I noticed that in `examples/tts_arena/result.csv`, different noise datasets and reference datasets are used for different models: ```csv Hubert Token,2,pheme.tar…

Mira1sen updated 1 hour ago

HeliosLang/uplc #3

Proposal: Cross-Implementation Benchmarking Dataset for Plut…

sierkov updated 1 week ago

open-spaced-repetition/srs-benchmark #129

Discussing the new dataset and benchmark

https://github.com/ankitects/anki/pull/3511#issuecomment-2444087066 I have a few questions regarding that: 1) Will we keep using the default parameters based on the old dataset or on the new one? I…

Expertium updated 2 weeks ago

IntersectMBO/plutus #6626

Proposal: Cross-Implementation Benchmarking Dataset for Plut…

### Describe the feature you'd like I'm working on a [C++ implementation of Plutus](https://github.com/sierkov/daedalus-turbo/tree/main/lib/dt/plutus) aimed at optimizing batch synchronization. We'…

sierkov updated 5 hours ago

tingofurro/summac #22

Hallucination | benchmarking on RAG truth dataset

I tried running both models (ZS and Conv) on the RAGTruth dataset (https://github.com/ParticleMedia/RAGTruth). I filtered the RAGTruth dataset down to the summary tasks and fed them in th…

karrtikiyer updated 1 month ago

ScandEval/ScandEval #257

[BENCHMARK DATASET REQUEST] Högskoleprovet

### Dataset name hogskoleprovet ### Dataset link https://www.hogskoleprovet.nu/gamla-hogskoleprov/ ### Dataset languages - [ ] Danish - [X] Swedish - [ ] Norwegian (Bokmål or Nynorsk)…

saattrupdan updated 1 week ago

1000+ results for benchmark-dataset
