multilingual-evaluation Search Results

SEACrowd/seacrowd-datahub #730

Create dataset loader for META MMLU (Thai)

## Adding a Dataset - **Name:** *multilingual_mmlu_th* - **Dataset Description:** *Thai mmlu from META* - **Dataset URL:** *[original URL of the dataset](https://huggingface.co/datasets/meta-llama/…

wannaphong updated 1 month ago

embeddings-benchmark/mteb #1445

"KeyError: 'document' not found and no similar keys were fou…

With [HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v1](https://huggingface.co/HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v1), I have the following error: ``` Loader not specified for m…

LeMoussel updated 4 days ago

embeddings-benchmark/mteb #1467

Issues with benchmarks.py

- Shouldn't this be Touche2020: https://github.com/embeddings-benchmark/mteb/blob/a988fef10cb73e2a35238f14f5c59a6615bbdaeb/mteb/benchmarks/benchmarks.py#L189 not the new one like here https://github.c…

Muennighoff updated 1 hour ago

tjunlp-lab/Awesome-LLMs-Evaluation-Papers #7

SeaEval: Multilingual LLM Evaluation

Please note our paper on evaluation, which could be an important building block for multilingual evaluation and cultural understanding. [SeaEval for Multilingual Foundation Models: From Cross-Lingu…

BinWang28 updated 1 year ago

LiveCodeBench/LiveCodeBench #19

Supports the evaluation of multilingual data sets

Hello, the LiveCodeBench dataset is Python-only. Would you consider supporting evaluation on multi-language datasets, especially Java? Thanks.

kartikzheng updated 5 months ago

rmusser01/tldw #195

RAG Feature: Allow for Evaluation of RAG implementation

This issue tracks evaluation of RAG implementations. Frameworks: - RAGEval - https://github.com/OpenBMB/RAGEval - https://arxiv.org/pdf/2408.01262 - AutoRAG - https://github.com/Marker-Inc-K…

rmusser01 updated 1 week ago

huggingface/lighteval #373

[EVAL]: Add more African Benchmarks

## Evaluation short description - Why is this evaluation interesting? This focuses on 16 African languages, evaluated on three knowledge QA and reasoning tasks such as AfriMMLU, AfriMGSM and AfriXNL…

dadelani updated 1 week ago

dynamic-superb/dynamic-superb #151

[Task] Multilingual Speech to Speech Translation

# Task Name Multilingual Speech to Speech Translation (s2st): converting speech from one language directly into speech in another language. This task requires the model to have strong multilingual …

wanchichen updated 4 months ago

andrewyng/translation-agent #18

Exploring the Potential of High-Quality Synthetic Datasets U…

**Background** In the field of multilingual large models, especially for non-English corpora, there is often a problem of insufficient data quantity and poor quality. High-quality training data is cr…

universea updated 4 months ago

castorini/ura-projects #36

Repro: Synthetic multilingual retrieval models on MIRACL

We have a new project involving multilingual retrieval and reproduction and we are looking for 2 URA students to work together. Feel free to reach out on Slack or email us at nandant@gmail.com, xzh…

thakur-nandan updated 7 months ago

743 results for multilingual-evaluation
