MicheleCotrufo / pdf-renamer

A python tool to automatically rename the pdf files of scientific publications by looking up the publication metadata on the web.

The cross-references in pdf files can't be used after the file is renamed #14

Closed (js4561207 closed this issue 2 months ago)

js4561207 commented 1 year ago

Versions: Python 3.11, pdf-renamer 1.0rc9, pdf-tocgen 1.3.3, pdf2bib 1.1rc4, pdf2doi 1.5rc8, pdfminer.six 20221105, pdftitle 0.11

As shown in the video below, the cross-references in PDF files stop working after the file is renamed. When I click "fig.1", the viewer doesn't jump to the corresponding page (the same happens for citations). DOI links work well. Here is the video link: https://clipchamp.com/watch/B9s3MLxnqaM

MicheleCotrufo commented 4 months ago

While I'm not entirely sure why this happens, I just found a (possibly related) bug in pdf2doi. Can you try updating pdf2doi to version 1.6 and see if the problem persists? Btw, read also the warning at the beginning of the pdf2doi readme (https://github.com/MicheleCotrufo/pdf2doi) for more details on this bug and on how to fix any affected pdf files.
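For anyone unsure which versions are actually installed, here is a minimal sketch of a version check using only the Python standard library. The helper names (`numeric_version`, `installed_version`, `is_at_least`) are made up for this example and are not part of pdf2doi or pdf-renamer. The minimum versions are the ones mentioned in this thread, and the comparison deliberately ignores pre-release suffixes such as `rc8`, so treat the result as a rough hint only.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def numeric_version(version: str) -> Tuple[int, ...]:
    """Return the leading dotted-number part of a version string as a tuple.

    Pre-release suffixes are ignored ("1.5rc8" -> (1, 5)) and trailing
    zeros are dropped ("1.6.0" -> (1, 6)) so that tuples compare sensibly.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric version found in {version!r}")
    parts = [int(piece) for piece in match.group().split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_at_least(version: str, minimum: str) -> bool:
    """True if the numeric part of `version` is >= that of `minimum`."""
    return numeric_version(version) >= numeric_version(minimum)


if __name__ == "__main__":
    # Minimum versions taken from this thread (assumed, not official guidance).
    for package, minimum in (("pdf2doi", "1.6"), ("pdf-renamer", "1.2")):
        found = installed_version(package)
        if found is None:
            print(f"{package}: not installed")
        elif is_at_least(found, minimum):
            print(f"{package} {found}: at least {minimum}")
        else:
            print(f"{package} {found}: older than {minimum}, consider upgrading")
```

If a package turns out to be older, `pip install --upgrade pdf2doi` should bring it up to date.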

principejavier commented 2 months ago

I can confirm this bug using pdf-renamer 1.1 and pdf2doi 1.5.1.

I can also confirm that cross-references are correct using pdf-renamer 1.2 and pdf2doi 1.6. File sizes still differ, in some cases larger and in some cases smaller, but internal links work.

What I did not observe is the bug mentioned in the warning on the pdf2doi page when using pdf2doi 1.5.1. The pdf files I tried (some papers) were untouched after running pdf2doi (I was NOT using -nostore), and I inspected the metadata to check that the doi was not there.
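One way to make the "untouched" check reproducible is to compare a checksum of the file before and after running the tool. This is only a sketch using the standard library; `sha256_of` and `is_unchanged` are hypothetical helper names, and the file path in the usage note is a placeholder.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: str, reference_digest: str) -> bool:
    """True if the file's current digest equals a previously recorded one."""
    return sha256_of(path) == reference_digest
```

Usage would be: record `before = sha256_of("paper.pdf")`, run pdf2doi on the file, then call `is_unchanged("paper.pdf", before)`. A `False` result means the bytes on disk changed, but it does not say what changed.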

BTW, many thanks for the software, very useful!!

MicheleCotrufo commented 2 months ago

Thanks for the feedback @principejavier !