BirgerMoell / swedish-medical-benchmark

🇸🇪 Swedish Medical Benchmark 🏥💻

Welcome to the official repository for the Swedish Medical Benchmark! This project aims to revolutionize how we assess and develop AI models in the medical domain, specifically tailored for the Swedish language. With your help, we can create a more inclusive, accurate, and impactful AI in healthcare. Let's make AI work for everyone!

Goals 🎯

This project focuses on three primary goals:

Translate Benchmarks to Swedish 📚➡️🇸🇪

Making existing benchmarks accessible to the Swedish-speaking medical community is crucial. This step involves:

- Identifying key medical benchmarks in AI.
- Translating these benchmarks into Swedish.
- Ensuring the translations maintain the clinical integrity of the original benchmarks.

Create New Benchmark for Swedish 🛠️🆕

Benchmarks

For more information about the implemented benchmarks, see the Benchmarks readme file.

Develop benchmarks specifically for the Swedish context, incorporating

Compare Model Performance on the Benchmark 📊🔍

| Metric | Eir | Swe-PubMedQA-100 |
|---|---|---|
| Total Questions | 100 📋 | 100 📋 |
| Correct Answers | 50 ✅ | - |
| Incorrect Answers | 50 ❌ | - |
| Malformed Answers | 0 🚫 | - |
| Accuracy | 50% 🎯 | - |
| Number of yes | 56 ✔️ | 60 ✔️ |
| Number of no | 29 ❎ | 30 ❎ |
| Number of maybe | 15 ➖ | 10 ➖ |
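As a rough illustration of how the metrics in the table above relate to each other, here is a minimal Python sketch. It is not the repository's evaluation code: the function names `normalize` and `summarize`, the normalization rule, and the assumption that answers are plain `yes` / `no` / `maybe` strings are all hypothetical and chosen only for this example.

```python
from collections import Counter
from typing import Dict, List, Optional

# Assumed label set for PubMedQA-style questions (hypothetical for this sketch).
VALID_LABELS = ("yes", "no", "maybe")


def normalize(answer: str) -> Optional[str]:
    """Map a raw model answer to a valid label, or None if it is malformed."""
    cleaned = answer.strip().lower().rstrip(".")
    return cleaned if cleaned in VALID_LABELS else None


def summarize(predictions: List[str], ground_truths: List[str]) -> Dict[str, float]:
    """Compute table-style metrics for one model run."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have equal length")

    normalized = [normalize(p) for p in predictions]
    total = len(ground_truths)
    malformed = sum(1 for n in normalized if n is None)
    correct = sum(1 for n, t in zip(normalized, ground_truths) if n == t)
    # Count how often each valid label was predicted (malformed answers excluded).
    counts = Counter(n for n in normalized if n is not None)

    return {
        "total": total,
        "correct": correct,
        "incorrect": total - correct - malformed,
        "malformed": malformed,
        "accuracy": correct / total if total else 0.0,
        "yes": counts["yes"],
        "no": counts["no"],
        "maybe": counts["maybe"],
    }


if __name__ == "__main__":
    # Toy data, not taken from the benchmark.
    preds = ["yes", "No.", "maybe", "I think so"]
    truth = ["yes", "no", "no", "yes"]
    print(summarize(preds, truth))
```

In this sketch a malformed answer counts against accuracy but is reported separately from incorrect answers, so correct, incorrect, and malformed always sum to the total, as they do in the Eir column.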

Evaluating AI models on these benchmarks to understand their effectiveness and areas for improvement:

- Implementing a standardized evaluation framework.
- Encouraging the submission of AI models for testing.
- Publishing results to foster transparency and continuous improvement.

Contributing 🤝

Your expertise and enthusiasm can drive this project forward. Here's how you can contribute:

Get Started 🚀

- Fork this repository to your account.
- Pick a task from the issues tab that resonates with your skills and interests.
- Follow the contribution guidelines in the CONTRIBUTING.md file for detailed instructions on how to make your contributions count.

Stay Connected 💬

Join our community on Discord for discussions, updates, and collaboration opportunities. Together, we can make a difference in healthcare AI!

https://discord.gg/AgDx34t2

Usage 🛠

Note: Make sure that you have Python 3.10 or higher installed on your machine.

First, you need to install the requirements:

pip install -r requirements.txt

Then you can run the file associated with the LLM model. Make sure to adjust the configuration in the file to your needs. For instance:

python run_llm/huggingface.py

For more detailed metrics, run the evaluation script:

python evaluate_performance.py

Note: The scripts have to be run from the root directory of the project.