metebalci / pdftitle

a utility to extract the title from a PDF file
GNU General Public License v3.0
131 stars 21 forks

Remove duplicate spaces when returning title #9

Closed jakob1379 closed 4 years ago

jakob1379 commented 4 years ago

I found pdftitle, which has proven to be a true gem! From time to time it returns duplicate spaces in the title, which can easily be worked around in two ways:

  1. using a regular expression: just before returning the title in get_title_from_io, add

    title = re.sub(' +', ' ', title)

  2. using string manipulation

    title = ' '.join(title.split())

    Preferably a check for two consecutive spaces in title should be performed first, to decide whether the correction needs to be executed at all.
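The two approaches are not quite equivalent, which a small sketch can show. The function names and the sample string below are made up for illustration and are not part of pdftitle: the regular expression only collapses runs of the space character, while split and join also normalizes tabs and newlines and strips leading and trailing whitespace.

```python
import re


def collapse_with_regex(title: str) -> str:
    # Hypothetical helper: replaces each run of one or more spaces with a
    # single space. Tabs, newlines and leading/trailing spaces survive.
    return re.sub(' +', ' ', title)


def collapse_with_split(title: str) -> str:
    # Hypothetical helper: str.split() with no arguments splits on any
    # whitespace run and drops empty strings, so joining with a single
    # space also trims the ends and normalizes tabs and newlines.
    return ' '.join(title.split())


sample = '  Deep  Learning\tfor   PDFs '
print(repr(collapse_with_regex(sample)))  # ' Deep Learning\tfor PDFs '
print(repr(collapse_with_split(sample)))  # 'Deep Learning for PDFs'
```

For a title string, the stricter normalization of the second variant is usually what one wants, though that is a judgment call rather than something the library requires.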

metebalci commented 4 years ago

Would you like to send a patch? Otherwise I can fix it. I think there is no need for a check, and I would prefer the second method with split and join.
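A quick sketch of why a guard is likely unnecessary: split and join leaves an already clean title unchanged, and applying it twice gives the same result as applying it once. The helper name normalize_title is hypothetical and is not the actual function in pdftitle.

```python
def normalize_title(title: str) -> str:
    # Hypothetical helper, not pdftitle's real code: collapse every
    # whitespace run to a single space and trim both ends.
    return ' '.join(title.split())


clean = 'A Clean Title'
messy = 'A  messy   title'

# A title without duplicate whitespace comes back equal to the input.
assert normalize_title(clean) == clean

# The operation is idempotent, so running it unconditionally is safe.
once = normalize_title(messy)
assert normalize_title(once) == once
```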

jakob1379 commented 4 years ago

I just made a pull request :)

metebalci commented 4 years ago

Thanks. Merged and released with 0.4.