JohnGiorgi / seq2rel-ds

This is a companion repository to seq2rel (https://github.com/JohnGiorgi/seq2rel) that aims to make it easy to generate training data.

Add tests for parse_pubtator #18

Open JohnGiorgi opened 3 years ago

JohnGiorgi commented 3 years ago

Certain corpora in the PubTator format (or a PubTator-like format) use a couple of strange formatting choices. Examples include:

It would be great to add an individual unit test for each of these cases (and any others we discover) to ensure that we are handling them properly.
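As a rough illustration of what one-test-per-case could look like, here is a minimal sketch. Everything in it is an assumption made for the example: `toy_parse_pubtator` and `Annotation` are hypothetical stand-ins and not the repository's `parse_pubtator` or its return type, and the multi-identifier case is only one plausible formatting quirk, not necessarily one of the cases this issue has in mind. The sketch assumes the common PubTator layout of `PMID|t|title` and `PMID|a|abstract` lines followed by tab-separated annotation lines, and it ignores relation lines.

```python
import re
from typing import Dict, List, NamedTuple


class Annotation(NamedTuple):
    """Hypothetical container for one entity annotation."""
    start: int
    end: int
    text: str
    label: str
    ids: List[str]


# Matches "PMID|t|..." (title) and "PMID|a|..." (abstract) lines.
_TEXT_LINE = re.compile(r"^(\d+)\|([ta])\|(.*)$")


def _empty_doc() -> dict:
    return {"title": "", "abstract": "", "annotations": []}


def toy_parse_pubtator(raw: str) -> Dict[str, dict]:
    """Toy stand-in parser for illustration only (NOT the real parse_pubtator)."""
    docs: Dict[str, dict] = {}
    for line in raw.splitlines():
        if not line.strip():
            continue  # blank lines separate documents
        match = _TEXT_LINE.match(line)
        if match:
            pmid, kind, text = match.groups()
            doc = docs.setdefault(pmid, _empty_doc())
            doc["title" if kind == "t" else "abstract"] = text
            continue
        fields = line.split("\t")
        if len(fields) >= 6:  # entity line; shorter lines (e.g. relations) are skipped
            pmid, start, end, mention, label, ids = fields[:6]
            doc = docs.setdefault(pmid, _empty_doc())
            doc["annotations"].append(
                Annotation(int(start), int(end), mention, label, ids.split("|"))
            )
    return docs


def test_standard_document():
    raw = (
        "1|t|Aspirin and headache\n"
        "1|a|Aspirin relieves headache.\n"
        "1\t0\t7\tAspirin\tChemical\tX1\n"
    )
    assert toy_parse_pubtator(raw) == {
        "1": {
            "title": "Aspirin and headache",
            "abstract": "Aspirin relieves headache.",
            "annotations": [Annotation(0, 7, "Aspirin", "Chemical", ["X1"])],
        }
    }


def test_multiple_ids_in_one_annotation():
    # Hypothetical quirk: several identifiers packed into one field with "|".
    raw = "2|t|Some title\n2|a|Some abstract.\n2\t5\t10\ttitle\tThing\tX1|X2\n"
    (annotation,) = toy_parse_pubtator(raw)["2"]["annotations"]
    assert annotation.ids == ["X1", "X2"]
```

Keeping each quirk in its own small `test_*` function (which pytest collects by name) means a failure points directly at the formatting case that broke, and newly discovered cases can be added without touching the existing tests.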