KennethEnevoldsen / scandinavian-embedding-benchmark

A Scandinavian Benchmark for sentence embeddings
https://kennethenevoldsen.github.io/scandinavian-embedding-benchmark/
MIT License

Scandinavian Embedding Benchmark

A benchmark for evaluating sentence and document embedding models on the Scandinavian languages.

Installation

You can install the Scandinavian Embedding Benchmark (seb) via pip from PyPI:

pip install seb
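
To confirm that the package landed in the active environment, a minimal sketch using only the Python standard library is shown here. It assumes the distribution is registered under the name `seb`, as in the pip command above; the helper `installed_version` is a hypothetical name for illustration and is not part of seb.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "seb" is the distribution name used in the pip command above.
    found = installed_version("seb")
    if found is None:
        print("seb is not installed in this environment")
    else:
        print(f"seb {found} is installed")
```

Querying the package metadata rather than importing the package keeps the check fast and avoids loading any model dependencies.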

For usage examples, see the documentation.

📖 Documentation

Documentation
🔧 Installation: Instructions for installing the package
👩‍💻 Usage: An introduction to using the package
📖 Documentation: Minimal documentation, still under development

💬 Where to ask questions

Type
🚨 Bug Reports: GitHub Issue Tracker
🎁 Feature Requests & Ideas: GitHub Issue Tracker
👩‍💻 Usage Questions: GitHub Discussions
🗯 General Discussion: GitHub Discussions

Citation

To cite this work, please refer to the following article:

Enevoldsen, K., Kardos, M., Muennighoff, N., & Nielbo, K. (2024). The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding. https://openreview.net/forum?id=pJl_i7HIA72

or use the following BibTeX:

@misc{enevoldsen2024scandinavian,
      title={The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding}, 
      author={Kenneth Enevoldsen and Márton Kardos and Niklas Muennighoff and Kristoffer Laigaard Nielbo},
      year={2024},
      eprint={2406.02396},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}