akaalias / obsidian-extract-pdf-highlights

Extract highlights, underlines and annotations from your PDFs into Obsidian
212 stars, 10 forks

Allowing "screenshot" of the page inside a rectangle/square box to capture images/charts/diagrams #17

Open dummifiedme opened 2 years ago

dummifiedme commented 2 years ago

Benefits

Having a way to include a screenshot of the region inside a created square (or rectangle) would help in

Image naming and handling

The extracted images could be saved directly to the asset/image/attachment location in the vault, with names assigned incrementally, by page number, or at random, similar to what Obsidian does when pasting.
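As a rough illustration of the naming options above, here is a minimal Python sketch. The function names `next_image_name` and `embed_link`, the `<pdf>-p<page>-<n>.png` pattern, and passing the existing attachment names in as a set are all assumptions made for this example, not something the plugin currently does. The `![[...]]` form is Obsidian's embed syntax.

```python
def next_image_name(existing, pdf_stem, page, ext="png"):
    """Return the first unused name of the form '<pdf_stem>-p<page>-<n>.<ext>'.

    `existing` is the set of file names already in the attachment folder
    (hypothetical input; a real plugin would list the folder itself).
    """
    n = 1
    while f"{pdf_stem}-p{page}-{n}.{ext}" in existing:
        n += 1
    return f"{pdf_stem}-p{page}-{n}.{ext}"


def embed_link(file_name):
    """Build the Obsidian embed that would go into the extracted note."""
    return f"![[{file_name}]]"


if __name__ == "__main__":
    taken = {"paper-p3-1.png"}
    name = next_image_name(taken, "paper", 3)
    print(name, embed_link(name))
```

Combining the page number with a counter keeps names readable and avoids collisions when one page holds several boxed regions.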

Example

PDF sample

image

Output sample

image

I implemented this in a crude way using Python, but I am a newbie to programming. If you could work on something like this, it would be really helpful, as I have a lot of diagrams that I currently screenshot and paste manually. This could make life easier for everyone who needs to extract images or diagrams.
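To make the idea concrete, here is a small sketch of the coordinate step such a feature would likely need. This is not my original script. It assumes the square/rectangle annotation exposes its rectangle as `[x0, y0, x1, y1]` in PDF points with the origin at the bottom-left of the page, that the page is not rotated, and that the page has been rendered to an image at `zoom` pixels per point. The name `rect_to_pixel_box` is made up for this example.

```python
import math


def rect_to_pixel_box(rect, page_height_pt, zoom=2.0):
    """Convert a PDF-space rectangle to a (left, upper, right, lower) pixel box.

    PDF space has y growing upward from the bottom of the page, while
    rendered images have y growing downward from the top, so y is flipped.
    """
    x0, y0, x1, y1 = rect
    # Annotation corners may be given in any order; normalize them first.
    left, right = sorted((x0, x1))
    bottom, top = sorted((y0, y1))
    return (
        math.floor(left * zoom),
        math.floor((page_height_pt - top) * zoom),
        math.ceil(right * zoom),
        math.ceil((page_height_pt - bottom) * zoom),
    )


if __name__ == "__main__":
    # A box on a US Letter page (792 pt tall), rendered at 2 pixels per point.
    print(rect_to_pixel_box((72, 600, 300, 700), 792, zoom=2.0))
```

The resulting tuple has the same order as the box expected by Pillow's `Image.crop`, so the rendered page image could be cropped with it directly.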

Also:

  1. A post I wrote on the forum: https://forum.obsidian.md/t/discussion-extracting-annotations-from-pdfs/24411
  2. A pdfAnnotate library that I found, which might be useful: https://github.com/highkite/pdfAnnotate

huyz commented 1 year ago

Is this the same ask as #15?

avigilante commented 1 year ago

Also check this library: https://github.com/mgmeyers/pdfannots2json