AttributeError: 'PDFObjRef' object has no attribute 'strip'

jdmonaco / pdf-title-rename

A script to batch rename PDF files based on metadata/XMP title and author


AttributeError: 'PDFObjRef' object has no attribute 'strip' #1

Closed by karkraeg 7 years ago

karkraeg commented 7 years ago

Hi! Thanks for making this script; it works nicely most of the time. Sometimes, though, I get an AttributeError: 'PDFObjRef' object has no attribute 'strip'. It usually happens when a PDF has no Title metadata, but in this case my PDF does have metadata, and it contains no special characters like ":" (which also cause this error). Can anything be done to prevent these errors? Thanks!

jdmonaco commented 7 years ago

Could you copy the full error traceback and paste it as a reply here? Also, it would be helpful if you could send a link to a PDF that has caused this error. It sounds like some piece of metadata is staying as a PDF object and not being converted to a string when the script expects one.

I'm glad the script has been at least sometimes useful to you!

karkraeg commented 7 years ago

That would be

Traceback (most recent call last):
  File "/usr/bin/pdf-title-rename.py", line 170, in <module>
    sys.exit(RenamePDFsByTitle(args).main())
  File "/usr/bin/pdf-title-rename.py", line 49, in main
    title, author = self._get_info(f)
  File "/usr/bin/pdf-title-rename.py", line 85, in _get_info
    if 'Title' in info and info['Title'].strip() != 'untitled':
AttributeError: 'PDFObjRef' object has no attribute 'strip'

as an example. The file that outputs this error can be downloaded here: https://transfer.sh/yRQ3s/314.pdf

jdmonaco commented 7 years ago

Great, thanks. As I suspected, parsing the PDF created PDFObjRef objects instead of strings. The script now checks for these objects and resolves them into the proper values.
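For illustration, here is a minimal sketch of that kind of check. It is not the script's actual code. It assumes the reference objects expose a resolve() method, as pdfminer's PDFObjRef does (pdfminer also ships a resolve1 helper in pdfminer.pdftypes for the same purpose). The names as_text and FakeObjRef are hypothetical, and FakeObjRef is only a stand-in so the example runs without pdfminer installed.

```python
def as_text(value, encoding="utf-8"):
    """Return a metadata value as a stripped str, resolving references first."""
    # Follow indirect references until a concrete value is reached.
    # The loop is bounded so a reference cycle cannot hang the script.
    for _ in range(16):
        if not hasattr(value, "resolve"):
            break
        value = value.resolve()
    if value is None:
        return ""
    # PDF parsers often hand back raw bytes rather than str.
    if isinstance(value, bytes):
        value = value.decode(encoding, errors="replace")
    return str(value).strip()


class FakeObjRef:
    """Hypothetical stand-in for a pdfminer PDFObjRef (illustration only)."""

    def __init__(self, target):
        self.target = target

    def resolve(self):
        return self.target


# The Title entry is an indirect reference instead of a plain string.
info = {"Title": FakeObjRef(b"  Bibliometrics and Citation Analysis ")}

title = as_text(info.get("Title"))
if title and title.lower() != "untitled":
    print(title)
```

Normalizing every metadata value through one helper before calling string methods means the 'untitled' comparison from the traceback never touches an unresolved object.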

I also took the opportunity to improve other parts of the script including the console output, so processing your file now looks like:

Processing "314.pdf":
 -- Renaming to "Bibliometrics and Citation Analysis.pdf"
Processed 1 files:
 - Renamed: 1
 - Missing metadata: 0
 - Errors: 0

This should close the issue. Let me know if I managed to break something else though.