metebalci / pdftitle

a utility to extract the title from a PDF file
GNU General Public License v3.0

Failed to identify title in a PDF #4

Closed (impredicative closed this issue 5 years ago)

impredicative commented 5 years ago

I was able to get the title of https://arxiv.org/pdf/1810.04805 but not of https://arxiv.org/pdf/1901.00713.pdf

This is a very low priority request.

metebalci commented 5 years ago

v0.2 now works for 1901, but not fully for 1810: the title is found, but the extracted text has no spaces between words. A new heuristic is needed to detect word boundaries.
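One possible shape for such a word-boundary heuristic is sketched below. This is not pdftitle's actual implementation. It assumes each extracted character comes with the left and right x-coordinates of its bounding box, and it inserts a space whenever the horizontal gap between two neighbouring characters exceeds a fraction of the average character width. The `Glyph` type, the `join_with_spaces` function and the `gap_ratio` default of 0.25 are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Glyph:
    """Hypothetical record for one extracted character (not a pdftitle type)."""
    char: str
    x0: float  # left edge of the character's bounding box
    x1: float  # right edge of the character's bounding box


def join_with_spaces(glyphs: List[Glyph], gap_ratio: float = 0.25) -> str:
    """Rebuild one line of text from positioned characters.

    A space is inserted wherever the horizontal gap between two consecutive
    characters is larger than gap_ratio times the average character width.
    The gap_ratio default is an assumed value, not a tuned one.
    """
    if not glyphs:
        return ""
    # Process characters left to right, regardless of extraction order.
    ordered = sorted(glyphs, key=lambda g: g.x0)
    avg_width = sum(g.x1 - g.x0 for g in ordered) / len(ordered)
    threshold = gap_ratio * avg_width

    parts = [ordered[0].char]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.x0 - prev.x1 > threshold:
            parts.append(" ")
        parts.append(cur.char)
    return "".join(parts)


# Usage: "a" and "b" touch each other, "c" starts 3 units after "b" ends.
line = [Glyph("a", 0, 5), Glyph("b", 5, 10), Glyph("c", 13, 18)]
print(join_with_spaces(line))  # prints "ab c"
```

A threshold relative to the average character width, rather than a fixed distance, keeps the sketch independent of font size. It would still need tuning against real PDFs, since kerning and justified text can produce gaps that look like word breaks.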