Closed KennethEnevoldsen closed 7 months ago
This update encompasses a variety of improvements and dataset updates across different tasks, focusing on sentiment analysis, language identification, and retrieval tasks. Key changes include the introduction of bootstrapping for average rank computation, updates to task versions and timestamps, and enhancements in evaluation metrics. Task descriptions and metadata have also been refined for better clarity and precision.
Files | Summary |
---|---|
docs/update_benchmark_tables.py | Added bootstrapping for average rank computation; updated compute_avg_rank. |
src/seb/cache/.../all-MiniLM-L6-v2/*.json, src/seb/cache/.../sentence-transformers__all-MiniLM-L6-v2/*.json | Added new datasets, updated evaluation scores, task versions, and timestamps. |
src/seb/interfaces/model.py | Simplified exception handling in encode_queries and encode_corpus. |
src/seb/mteb_tasks/retrieval/norquad.py, src/seb/registered_tasks/.../*.py | Updated task attributes, added custom subclasses for improved descriptions. |
makefile, pyproject.toml | Adjusted linting commands and updated dependencies in pyproject.toml. |
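
For readers unfamiliar with the technique named in the first row, bootstrapping an average rank can be sketched roughly as follows. This is a minimal illustration, not the code in docs/update_benchmark_tables.py: the function names `rank_models` and `bootstrap_avg_rank`, the nested-dict input layout, and the 95% percentile interval are all assumptions made for the example, and the actual `compute_avg_rank` may differ.

```python
import random
from statistics import mean


def rank_models(task_scores):
    """Rank models on one task (rank 1 = highest score).

    Hypothetical helper; ties are broken by dict order.
    """
    ordered = sorted(task_scores, key=task_scores.get, reverse=True)
    return {model: pos for pos, model in enumerate(ordered, start=1)}


def bootstrap_avg_rank(scores, n_boot=1000, seed=0):
    """Bootstrap the average rank of each model over tasks.

    scores: {task: {model: score}}; assumes every task scores the same models.
    Returns {model: (mean_avg_rank, lower_bound, upper_bound)}.
    """
    rng = random.Random(seed)
    tasks = list(scores)
    ranks = {task: rank_models(scores[task]) for task in tasks}
    models = list(ranks[tasks[0]])

    samples = {model: [] for model in models}
    for _ in range(n_boot):
        # Resample tasks with replacement, then average each model's ranks.
        resampled = rng.choices(tasks, k=len(tasks))
        for model in models:
            samples[model].append(mean(ranks[t][model] for t in resampled))

    result = {}
    for model in models:
        ordered = sorted(samples[model])
        # Approximate 95% percentile interval from the bootstrap samples.
        lower = ordered[int(0.025 * (n_boot - 1))]
        upper = ordered[int(0.975 * (n_boot - 1))]
        result[model] = (mean(ordered), lower, upper)
    return result
```

Resampling tasks rather than models is one common choice: it shows how sensitive a model's average rank is to which tasks happen to be in the benchmark.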
In the code's garden, under the moon's gleam,
Changes sprout like flowers, in a coder's dream.
With each line refined, and each dataset new,
Our AI's mind grows, learning more for you.
So hop along, friends, through this digital spree,
Where code meets poetry, in harmony.
Objective | Addressed | Explanation |
---|---|---|
Extend the dataset to include other Scandinavian languages | ❌ | The focus was on dataset updates and enhancements, not on adding new languages. |
Check the availability of translations for specific resources | ❌ | Translation availability wasn't explicitly mentioned in the changes. |
Integrate resources for Greenlandic, Icelandic, and Faroese | ❌ | The update primarily focused on existing tasks and datasets. |
Potentially include Finnish datasets | ❌ | The changes did not involve adding Finnish datasets. |
Fixing broken links from ScandEval. These are temporary fixes, @x-tabdeveloping, as I imagine we will update the Scandinavian Embedding Benchmark to the latest version of MTEB soon (once it is settled).
Fixes: #173
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Refactor
hf_hub_name attribute in the NorQuadRetrieval class for clearer identification.