Welcome to the official repository for the Swedish Medical Benchmark! This project aims to revolutionize how we assess and develop AI models in the medical domain, specifically tailored for the Swedish language. With your help, we can create a more inclusive, accurate, and impactful AI in healthcare. Let's make AI work for everyone!
This project focuses on three primary goals:
Making existing benchmarks accessible to the Swedish-speaking medical community is crucial. This step involves:
- Identifying key medical benchmarks in AI.
- Translating these benchmarks into Swedish.
- Ensuring the translations maintain the clinical integrity of the original benchmarks.

## Create New Benchmark for Swedish
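As a rough sketch of what one benchmark entry might look like, here is a PubMedQA-style item with a Swedish question, Swedish context, and a yes/no/maybe label. The field names and schema are illustrative assumptions, not the project's actual data format:

```python
def make_item(item_id, question_sv, context_sv, label):
    """Build one benchmark entry with a yes/no/maybe label.

    Schema is an assumption for illustration; the real project format may differ.
    """
    # Labels mirror PubMedQA's decision set and stay in English
    assert label in {"yes", "no", "maybe"}, "label must be yes/no/maybe"
    return {
        "id": item_id,
        "question": question_sv,   # Swedish question text
        "context": context_sv,     # list of Swedish context passages
        "final_decision": label,
    }

item = make_item(
    "swe-pubmedqa-0001",
    "Minskar regelbunden motion risken för typ 2-diabetes?",
    ["Sammanfattning av studien på svenska ..."],
    "yes",
)
```

Keeping the label field in English makes it trivial to compare model output against the original English benchmark during validation.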
See the Benchmarks README file for more information about the implemented benchmarks.
| Metric | Eir | Swe-PubMedQA-100 |
|---|---|---|
| Total Questions | 100 | 100 |
| Correct Answers | 50 | - |
| Incorrect Answers | 50 | - |
| Malformed Answers | 0 | - |
| Accuracy | 50% | - |
| Number of "yes" | 56 | 60 |
| Number of "no" | 29 | 30 |
| Number of "maybe" | 15 | 10 |
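As a sketch of how metrics like those in the table can be derived from raw model answers, the snippet below tallies correct, incorrect, and malformed answers plus the yes/no/maybe distribution. The function and field names are illustrative, not the project's actual evaluation code:

```python
from collections import Counter

VALID = {"yes", "no", "maybe"}

def score(predictions, gold):
    """Compare model answers against gold labels and tally basic metrics."""
    # Anything outside the yes/no/maybe label set counts as malformed
    correct = sum(p == g for p, g in zip(predictions, gold) if p in VALID)
    malformed = sum(p not in VALID for p in predictions)
    incorrect = len(predictions) - correct - malformed
    return {
        "total": len(predictions),
        "correct": correct,
        "incorrect": incorrect,
        "malformed": malformed,
        "accuracy": correct / len(predictions),
        "distribution": Counter(p for p in predictions if p in VALID),
    }

preds = ["yes", "no", "banana", "maybe"]
gold  = ["yes", "yes", "no", "maybe"]
metrics = score(preds, gold)
# metrics["correct"] == 2, metrics["malformed"] == 1, metrics["accuracy"] == 0.5
```

Counting malformed answers separately, as the table does, keeps accuracy from silently penalizing parsing failures the same way as genuinely wrong answers.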
- Implementing a standardized evaluation framework.
- Encouraging the submission of AI models for testing.
- Publishing results to foster transparency and continuous improvement.

## Contributing

Your expertise and enthusiasm can drive this project forward. Here's how you can contribute:
- Fork this repository to your account.
- Pick a task from the issues tab that resonates with your skills and interests.
- Follow the contribution guidelines in the CONTRIBUTING.md file for detailed instructions on how to make your contributions count.

## Stay Connected

Join our community on Discord for discussions, updates, and collaboration opportunities. Together, we can make a difference in healthcare AI! https://discord.gg/AgDx34t2
Note: Make sure that you have Python 3.10 or higher installed on your machine.
First, you need to install the requirements:
pip install -r requirements.txt
Then run the script associated with the LLM you want to test; make sure to adjust the configuration in the file to your needs. For instance:
python run_llm/huggingface.py
For more detailed metrics, run the evaluation script:
python evaluate_performance.py
Note: The scripts have to be run from the root directory of the project.