BirgerMoell/swedish-medical-benchmark
8 stars, 0 forks
issues
Fixes the benchmark setup
#11
northern-64bit
closed
1 month ago
1
Adds LLM runs under results folder
#10
northern-64bit
closed
1 month ago
0
Add the remaining PubMedQA-L-SWE translations
#9
northern-64bit
closed
2 months ago
1
Adds API-accessible LLMs (like GPT-4) with LiteLLM + fixes
#8
northern-64bit
closed
2 months ago
1
Refactor code structure + detailed metrics
#7
northern-64bit
closed
2 months ago
1
Refactor eval code
#6
LinusJohaansson
closed
2 months ago
0
Benchmark against GPT-4
#5
BirgerMoell
opened
2 months ago
0
Keep the benchmark balanced
#4
BirgerMoell
opened
2 months ago
0
Chatbot Arena style human evaluation
#3
BirgerMoell
opened
3 months ago
0
Adding more items to PubMedQA-L-SWE
#2
northern-64bit
closed
3 months ago
1
Evaluate benchmarks from BioMistral to decide which ones are good to translate and use
#1
BirgerMoell
opened
3 months ago
5