bigscience-workshop/biomedical
Tools for curating biomedical training data for large-scale language modeling
Add implementation for the CPI dataset
#843
Closed by mariosaenger 1 year ago
mariosaenger commented 1 year ago
Adding a Dataset
Name: CPI
Description: The compound-protein relationship (CPI) dataset consists of 2,613 sentences drawn from abstracts and annotated with proteins, small molecules, and the relationships between them.
Task: NER, RE, NEN
Paper: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0220925
Data: https://github.com/KerstenDoering/CPI-Pipeline
License: ISC
Motivation: High-quality NER and RE annotations