inukshuk / anystyle

Fast citation reference parsing
https://anystyle.io

One word got lost in the title #183

Closed dioubernardo closed 2 years ago

dioubernardo commented 2 years ago

In this PDF https://iase-web.org/documents/papers/icots5/Topic1m.pdf

image

inukshuk commented 2 years ago

That's because there's a missing space in the input. The parser splits tokens on spaces, so the 'The' ends up as part of the date segment (and is dropped during normalisation).
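
To illustrate the effect, here is a minimal sketch in Python. It is not AnyStyle's actual tokenizer (AnyStyle is written in Ruby); it only assumes that tokens are produced by splitting on whitespace, as described above. The reference string and the `tokenize` helper are hypothetical and are not taken from the PDF in question.

```python
# Hypothetical reference string: the space after "(1998)." is missing.
reference = "Smith, J. (1998).The role of context. Journal of Examples, 5, 1-10."


def tokenize(text):
    """Split on runs of whitespace (a simplified stand-in for the real tokenizer)."""
    return text.split()


tokens = tokenize(reference)
# The date and the first word of the title are fused into a single token,
# so a labeller that tags this token as "date" takes "The" along with it.
print(tokens[2])

# Restoring the missing space yields two separate tokens.
fixed = reference.replace(").The", "). The")
print(tokenize(fixed)[2:4])
```

With the space restored, the date and the first title word can be labelled independently, so fixing the input before parsing is the simplest workaround.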