Open gabriben opened 3 years ago
Hi Gabriben,
Apologies, only just saw your message.
There are two folders, labels and text.
The "text" folder contains files with PubMed abstracts, split one sentence per line (already tokenized). The file names are the PubMed IDs.
The "labels" folder contains the corresponding labels for each text file (both files are named with the same PubMed ID). Each label file can hold multiple labels per sentence.
The sentence labels are separated by "<", and the multiple labels for each sentence are separated by "AND".
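In case it is useful, here is a rough Python sketch of reading that format. It is only an illustration under a few assumptions: the function names `parse_labels` and `load_example` are made up for this example, whitespace around the separators is assumed to be insignificant, and no label name is assumed to contain the substring "AND" itself.

```python
from pathlib import Path

SENTENCE_SEP = "<"   # separates the label groups of consecutive sentences
LABEL_SEP = "AND"    # separates the individual labels of one sentence


def parse_labels(raw):
    """Turn the raw contents of a label file into one list of labels per sentence."""
    groups = raw.strip().split(SENTENCE_SEP)
    # Strip whitespace around each label and drop empty strings, but keep an
    # (empty) list for every sentence so positions stay aligned with the text.
    return [
        [label.strip() for label in group.split(LABEL_SEP) if label.strip()]
        for group in groups
    ]


def load_example(text_path, label_path):
    """Read a text file and its label file (hypothetical helper; paths are up to you)."""
    sentences = Path(text_path).read_text(encoding="utf-8").splitlines()
    labels = parse_labels(Path(label_path).read_text(encoding="utf-8"))
    return sentences, labels
```

For example, `parse_labels("A AND B<C")` gives `[["A", "B"], ["C"]]`, where `A`, `B` and `C` are placeholder label names. After loading a pair of files it is probably worth checking that `len(sentences) == len(labels)` before relying on the alignment.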
Hope that helps. Let me know if anything is still unclear.
Hi,
I'd like to use your dataset to reproduce some results in the ML-NET paper, but I am having trouble understanding how the label text files should be read.
Thank you