Datasets - Githubissues

texttechnologylab / LSTMVoter

GNU Affero General Public License v3.0

Open Mahmedturk opened 5 years ago

Mahmedturk commented 5 years ago

Hi @hemati

Where can I download the data from?

In your paper you mention that the CEMP corpus has 30,000 patents; however, the original paper (https://pdfs.semanticscholar.org/388c/d26d2d70d9b2d166321daa7a15ae6f2bbb19.pdf) reports 21,000 patents. Am I missing something?

hemati commented 5 years ago

Dear @Mahmedturk, we also added the BioCreative IV corpus (https://biocreative.bioinformatics.udel.edu/resources/biocreative-iv/chemdner-corpus/), as the tasks were similar.

Mahmedturk commented 5 years ago

So, you have experimented with two datasets: bc4chemdner and bcv.5chemdner (combined with bc4chemdner)?