bigscience-workshop/biomedical
Tools for curating biomedical training data for large-scale language modeling
Create loader for CARDIO:DE
#880
Closed
nachollorca closed 1 year ago
nachollorca commented 1 year ago
Adding a Dataset
Name: CARDIO:DE
Description: The first freely available and distributable large German clinical corpus from the cardiovascular domain.
Task: NER
Paper: https://heidata.uni-heidelberg.de/dataset.xhtml?persistentId=doi:10.11588/data/AFYQDY
Data: https://heidata.uni-heidelberg.de/dataset.xhtml?persistentId=doi:10.11588/data/AFYQDY
License: Accessible under a Data User Agreement
Motivation: As the second major dataset of real German clinical data, CARDIO:DE offers unique opportunities for collaborative and reproducible research on natural language processing models for German clinical texts.