Saber is a deep-learning-based tool for information extraction in the biomedical domain. Pull requests are welcome! Note: this is a work in progress. Many things are broken, and the codebase is not stable.
Saber.load_dataset() should be able to pull from pubannotation.org given a project's URL. For example, calling it with such a URL should download the dataset to ~/saber/datasets, convert it to the CoNLL 2003 format, and load it into a Dataset object. Furthermore, if this URL is ever supplied again, load_dataset() should use the cached version of the dataset in ~/saber/datasets.

Since pubannotation.org hosts most of the popular BioNLP datasets, this would nearly eliminate the need to maintain datasets locally.
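
To make the proposal more concrete, here is a rough sketch of the download-once, convert, then reuse-from-cache flow. It is not Saber's actual code: the standalone load_dataset() function, the cache_path_for() and to_conll() helpers, the fetch callable, and the train.tsv file name are all hypothetical. It assumes each downloaded document is a dict with a "text" string and a "denotations" list whose entries carry a character "span" (with "begin" and "end") and a label under "obj". Tokenization is plain whitespace splitting, which is cruder than what a real converter would need, and the result is returned as text rather than a Dataset object.

```python
import hashlib
import re
from pathlib import Path

# Cache root named in the issue; everything below it is an assumption.
CACHE_ROOT = Path.home() / "saber" / "datasets"


def cache_path_for(url, cache_root=CACHE_ROOT):
    """Map a project URL to a stable directory under the cache root."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    return Path(cache_root) / digest


def to_conll(doc):
    """Convert one annotated document to 'token<TAB>tag' lines with BIO tags.

    Assumes a dict with 'text' and 'denotations', where each denotation has
    a character 'span' ({'begin', 'end'}) and a label under 'obj'.
    """
    lines = []
    for match in re.finditer(r"\S+", doc["text"]):  # whitespace tokens only
        tag = "O"
        for den in doc.get("denotations", []):
            begin, end = den["span"]["begin"], den["span"]["end"]
            if match.start() >= begin and match.end() <= end:
                prefix = "B" if match.start() == begin else "I"
                tag = f"{prefix}-{den['obj']}"
                break
        lines.append(f"{match.group()}\t{tag}")
    return "\n".join(lines) + "\n"


def load_dataset(url, fetch, cache_root=CACHE_ROOT):
    """Return CoNLL-style text for `url`, downloading only on a cache miss.

    `fetch` is a caller-supplied (hypothetical) function that takes the URL
    and returns a list of annotated documents.
    """
    target = cache_path_for(url, cache_root) / "train.tsv"
    if not target.exists():
        docs = fetch(url)
        target.parent.mkdir(parents=True, exist_ok=True)
        # A blank line separates documents, as in CoNLL-style files.
        target.write_text("\n".join(to_conll(d) for d in docs), encoding="utf-8")
    return target.read_text(encoding="utf-8")
```

Keying the cache directory on a hash of the URL means a second call with the same URL never reaches fetch. Passing the downloader in as an argument keeps the caching logic testable without network access.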