Xunius / Menotexport

Python solution to export annotations from your Mendeley library.
GNU General Public License v3.0

Export loses outlines #22

Open ghost opened 6 years ago

ghost commented 6 years ago

Hi, thanks for this great tool!

I'm trying to export my Mendeley library and noticing that my outlines/bookmarks are missing from the output. These are useful for me to navigate around my documents. Is it possible to retain these in the export? Thanks for your help.

Xunius commented 6 years ago

Hi apenewberry,

I didn't know about the outlines/bookmarks feature of Mendeley. I just updated my version to 1.19, which should be the latest, but I still don't quite get it. Can you give me some more hints about the outlines you are referring to? What version are you using, and is it on Windows, Mac or Linux?

ghost commented 6 years ago

I'm using Mendeley Desktop 1.19 on Ubuntu 17.10.

I think they're called 'outlines' or 'bookmarks' in the PDF format, but Mendeley displays them in the Contents tab of the sidebar on the right. Screenshot below.

I've been looking at some resources, but haven't figured out how to copy the outlines in:

[image: screenshot of the Contents tab in the Mendeley sidebar]

Xunius commented 6 years ago

I see. So these are embedded in the PDFs, aren't they? And they appear in more recent publications, not in older ones. So how do you want them to be exported? Because the PDF files are merely copied to the target folder, if a PDF has these, the exported copy should too.

ghost commented 6 years ago

Ok I think I must've made a mistake. I'll look into this further and update.

ghost commented 6 years ago

It looks like the script copies the bookmarks for most PDFs but misses them for at least one. I'm not sure what distinguishes the PDFs where it misses them. I think I'll just deal with the few missed cases manually as they come up. Thanks for the tool and for responding so quickly as well -- cheers!

Xunius commented 6 years ago

Hmm, it's strange that it fails for certain PDFs. Could it have something to do with the PDF viewer software?

ghost commented 6 years ago

I'm not sure. Apparently there are multiple ways to embed bookmarks in PDFs, which can make reading them complicated, but it's getting over my head at that point.
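For anyone trying to narrow down which PDFs are affected: in the PDF format, the outline tree hangs off the `/Outlines` entry of the document catalog, while outline items may also point at named destinations stored under `/Dests` or `/Names`. The sketch below only checks which of these keys a catalog carries. It uses plain dicts as stand-ins for a real catalog, and the function name `navigation_entries` is made up for illustration. With PyPDF2, one would presumably pass `inpdf.trailer["/Root"]` instead.

```python
# Catalog keys related to outlines (bookmarks) and the destinations
# that outline items may refer to.
NAVIGATION_KEYS = ("/Outlines", "/Dests", "/Names")


def navigation_entries(catalog):
    """Return the navigation-related keys present in a catalog-like mapping."""
    return [key for key in NAVIGATION_KEYS if key in catalog]


# Stand-in catalogs (hypothetical data, not read from real files).
with_bookmarks = {
    "/Type": "/Catalog",
    "/Pages": "<page tree>",
    "/Outlines": "<outline root>",
    "/Names": "<name dictionary>",
}
without_bookmarks = {"/Type": "/Catalog", "/Pages": "<page tree>"}

print(navigation_entries(with_bookmarks))
print(navigation_entries(without_bookmarks))
```

Comparing the result for a source PDF and its exported copy should show whether the export dropped any of these entries.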

syu-id commented 4 years ago

Hi, this is my quick solution which seems to work without problems:

# Copy the root (document catalog) except for /Pages
# PDF Reference, Sixth Edition, version 1.7, p.137
# https://www.adobe.com/devnet/pdf/pdf_reference_archive.html
for k,v in inpdf.trailer["/Root"].items():
    if k.getObject() != "/Pages":
        outpdf._root_object.update({k: v})

diff --git a/lib/exportpdf.py b/lib/exportpdf.py
index 5072681..aaffe23 100644
--- a/lib/exportpdf.py
+++ b/lib/exportpdf.py
@@ -154,6 +154,14 @@ def exportPdf(fin,outdir,annotations,verbose):

         outpdf.addPage(inpg)

+
+    # Copy the root (document catalog) except for /Pages
+    # PDF Reference, Sixth Edition, version 1.7, p.137
+    # https://www.adobe.com/devnet/pdf/pdf_reference_archive.html
+    for k,v in inpdf.trailer["/Root"].items():
+        if k.getObject() != "/Pages":
+            outpdf._root_object.update({k: v})
+
     #-----------------------Save-----------------------
     filename=annotations.filename
     if not os.path.isdir(outdir):

Xunius commented 4 years ago

@rongmu thanks for providing the fix. I've incorporated your code. As I'm not going to test it myself, I've put the snippet in a try block.
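A rough sketch of what guarding the catalog copy with a try block could look like is given here. It is not the committed code: plain dicts stand in for `inpdf.trailer["/Root"]` and `outpdf._root_object`, the function name `copy_catalog_entries` is hypothetical, and catching every exception is an assumption about how the guard behaves.

```python
def copy_catalog_entries(src_root, dst_root, skip=("/Pages",)):
    """Copy every catalog entry except those in `skip`.

    Returns True on success and False if anything goes wrong, so a
    failed copy does not abort the rest of the export.
    """
    try:
        for key, value in src_root.items():
            if key not in skip:
                dst_root[key] = value
    except Exception as exc:  # broad on purpose: the copy is best-effort
        print("Could not copy catalog entries:", exc)
        return False
    return True


# Stand-in catalogs (hypothetical data).
src = {
    "/Type": "/Catalog",
    "/Pages": "<source page tree>",
    "/Outlines": "<outline root>",
}
dst = {"/Type": "/Catalog", "/Pages": "<rebuilt page tree>"}

ok = copy_catalog_entries(src, dst)
print(ok, sorted(dst))
```

Skipping `/Pages` matters because the output file already has its own page tree built from the pages added one by one. Overwriting it with the source's entry would discard that rebuilt tree.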