WormBase / ACKnowledge

Author Curation to Knowledgebases
MIT License

entities not extracted for 56554 #139

Closed draciti closed 5 years ago

draciti commented 5 years ago

@valearna can you pls check whether this paper also had an issue with the PDF to text extraction? thx

entities not extracted: pas-1, pbs-3, pbs-4

valearna commented 5 years ago

pas-1: first mention in text has a non-accepted character before the keyword, so it's discarded: Yes+pas-1(mg511) - the second mention is ok

pbs-3: first mention not ok: lostLvapbs-3(mg527) - second mention not ok: (C. elegans)pbs-3(mg527)

pbs-4: first mention not ok: lostLvapbs-4(mg539) - second mention not ok: (C. elegans)pbs-4(mg539)

The issue is caused by PDF to text conversion problems: whitespace is not recognized correctly, so the gene names end up fused to the preceding text.
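
For illustration, here is a minimal sketch of the kind of boundary check that would discard the mentions listed above. It is not the ACKnowledge extraction code: the function name `find_mentions` and the rule itself (a keyword only counts if it is preceded by whitespace or the start of the text, and not followed by a letter or digit) are assumptions made for this example.

```python
import re

def find_mentions(keyword, text):
    """Return start offsets of `keyword` in `text` under an assumed boundary rule.

    Assumption (hypothetical, not the pipeline's actual rule): the character
    before the keyword must be whitespace or the start of the text, and the
    character after it must not be a letter or digit.
    """
    pattern = re.compile(
        r"(?<!\S)"              # nothing but whitespace/start before the keyword
        + re.escape(keyword)    # the gene name, matched literally
        + r"(?![A-Za-z0-9])"    # not immediately followed by a letter or digit
    )
    return [m.start() for m in pattern.finditer(text)]

# Strings quoted in this thread: the keyword is fused to the preceding text.
print(find_mentions("pas-1", "Yes+pas-1(mg511)"))          # []
print(find_mentions("pbs-3", "lostLvapbs-3(mg527)"))       # []
print(find_mentions("pbs-3", "(C. elegans)pbs-3(mg527)"))  # []

# A made-up sentence with correct whitespace, for comparison.
print(find_mentions("pas-1", "the pas-1(mg511) allele"))   # [4]
```

Under a rule like this, any mention that loses its leading space during conversion is silently skipped, which matches what is seen for this paper.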

draciti commented 5 years ago

nothing we can do about this, closing