akaalias / obsidian-extract-pdf-highlights

Extract highlights, underlines and annotations from your PDFs into Obsidian

Problem with two-column PDF #14

Open Chris-mik opened 2 years ago

Chris-mik commented 2 years ago

Thank you @akaalias for this amazing contribution! I have noticed an issue with the ordering of extracted highlights from PDFs whose text is arranged in two columns, where the text on each page flows from the first column to the second. Specifically, the plugin seems to extract highlights in the order they first appear in the PDF, but in a two-column PDF (which is often the case for research articles) this means the highlights in the extracted note no longer follow the reading order. For example, on the first page of a two-column PDF I highlight the text as follows:

  1. Column 1
    • lines 10 to 15
    • line 20
  2. Column 2
    • lines 2 to 5
    • line 18

In the extracted note, the ordering of the highlighted lines is shown as:

So, I am wondering whether there is any workaround for this issue. Thanks again!
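
One possible workaround is to re-sort the highlights by column before writing the note. The following is only a minimal sketch, not the plugin's actual code: the `Highlight` shape, its field names, the `sortTwoColumn` helper, and the rule of splitting columns at half the page width are all assumptions made for illustration. It also assumes `y` is measured from the top of the page.

```typescript
// Hypothetical shape of an extracted highlight. The field names are
// assumptions, not the plugin's real data model.
interface Highlight {
  page: number; // 1-based page number
  x: number;    // left edge of the highlight, in page units
  y: number;    // distance from the top of the page, in page units
  text: string;
}

// Return a copy sorted in two-column reading order:
// by page, then left column before right column, then top to bottom.
// Assumption: the column boundary sits at half the page width.
function sortTwoColumn(highlights: Highlight[], pageWidth: number): Highlight[] {
  const columnOf = (h: Highlight): number => (h.x < pageWidth / 2 ? 0 : 1);
  return [...highlights].sort(
    (a, b) =>
      a.page - b.page ||
      columnOf(a) - columnOf(b) ||
      a.y - b.y
  );
}

// The four highlights from the example above, given in arbitrary order.
// The coordinates are made up for illustration.
const example: Highlight[] = [
  { page: 1, x: 320, y: 40, text: "Column 2, lines 2 to 5" },
  { page: 1, x: 72, y: 200, text: "Column 1, lines 10 to 15" },
  { page: 1, x: 320, y: 360, text: "Column 2, line 18" },
  { page: 1, x: 72, y: 400, text: "Column 1, line 20" },
];

for (const h of sortTwoColumn(example, 612)) {
  console.log(h.text);
}
```

With these made-up coordinates, the sorted output lists both Column 1 highlights first and then both Column 2 highlights. A fixed midpoint split would not handle full-width elements such as titles or figures, so a real fix would likely need per-page column detection.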