Images, Links, Milestones and small steps towards a full org-pdf converter

theottm commented 4 years ago

Hello there,

This tool is really great! Thank you for continuing to maintain it.

I actually came here because I wanted a tool to store parts of a PDF as images in an org file. I am really looking forward to this feature! In your opinion, what are the milestones to reach it?

Maybe adding a way to store a region instead of just a point would be a good start? For now I use org-download and take screenshots; it looks like this: [image: Screenshot from 2020-01-28 22-26-00]. The problem with this is that I lose the text information. I tried playing around with tesseract, but it was not efficient enough. If one could somehow take a screenshot and select the text at point at the same time, and then put all of this as a link (or multiple links) in the org file, that would be great!

In my use case, my next step would be to link my headings using org-brain so that I can generate diagrams and navigate the network. The most beautiful thing would be to also transfer the links inside the PDF to links between org headings. I guess this goes in the direction of the long-term goal you set.

I don't have much experience programming in elisp, but I might be able to help here and there if there is something I can do. I wonder, for example, why org-pdf-get-link needs to use org-pdftools-root-dir. I don't think I would want to store all my PDFs in one place, or to copy a PDF into this directory each time I want to link it to an org file. Maybe I can fix that, if it makes sense?

Also, maybe the interface for storing a link with org-pdftools-store-link is a bit too long and should be divided into smaller functions. For example, if I just want to add a link to a page with no annotations, I have to say "no" many times. What do you think?

I am not sure that the way the links are formatted now is very efficient for heavy use of the package. I have this

[[pdftools:~/pdftools/test.pdf::20++0.00 Reminder: You haven’t performed a isearch!][pdftools:~/pdftools/test.pdf::20++0.00 Reminder: You haven’t performed a isearch!]]

which could be reduced to

[[pdftools:~/pdftools/test.pdf::20++0.00][pdftools:~/pdftools/test.pdf - Page 20]]

or even this

[[pdftools:~/pdftools/test.pdf::20++0.00][test.pdf - Page 20]]
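To make the idea concrete, here is a rough sketch of how the shorter form could be derived from the long one. It uses only built-in string functions and assumes the link target always looks like `pdftools:PATH::PAGE++HEIGHT` followed by optional search text, as in the example above. The `my/` functions are hypothetical names of mine, not part of org-pdftools.

```elisp
;; Hypothetical helpers, not part of org-pdftools.
;; Assumption: LINK has the form "pdftools:PATH::PAGE++HEIGHT[ search text]".

(defun my/pdftools-short-description (link)
  "Return \"FILE - Page N\" for LINK, or nil if LINK does not match."
  (when (string-match "\\`pdftools:\\(.+?\\)::\\([0-9]+\\)" link)
    (let ((path (match-string 1 link))
          (page (match-string 2 link)))
      ;; Keep only the file name, drop the directory part.
      (format "%s - Page %s" (file-name-nondirectory path) page))))

(defun my/pdftools-short-link (link)
  "Return an Org link for LINK without the search text, or nil if LINK does not match."
  (when (string-match "\\`\\(pdftools:.+?::[0-9]+\\+\\+[0-9.]+\\)" link)
    ;; Read the target before the helper runs its own `string-match'.
    (let ((target (match-string 1 link)))
      (format "[[%s][%s]]" target (my/pdftools-short-description link)))))
```

Calling `my/pdftools-short-link` on the long target above should give the last form shown, with the "Reminder" text dropped from both the target and the description.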

Do you agree?

I can open a separate issue for each specific task if that would be useful.

Cheers,

Théo

fuxialexander commented 4 years ago

I think you can make a rectangle annotation then create the link?

theottm commented 4 years ago

How do you make a rectangle annotation? Do you mean a markup annotation?

fuxialexander commented 4 years ago

Oh, actually it's not possible. You might instead look into pdf-view-extract-region-image. I'll try to integrate that later if I get time.

fuxialexander commented 4 years ago

That "Reminder: You haven’t performed a isearch!" is not the expected behavior though, I'll try to fix that.

theottm commented 4 years ago

Oh, thanks for the hint: I looked up the pdf-view-extract-region-image function. It is just what I need. Now I just need to use it together with pdf-view-active-region-text and put the extracted image, the link to the original PDF, and the text in the org file.

Maybe I can use a drawer for the text, a link for the image, and another link for the reference to the PDF. I'll try that out!
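Something like the following is what I have in mind for the org side. It is only a sketch: it formats the three pieces as text and does not call pdf-tools at all. I assume the image file, the link target, and the region text have already been obtained elsewhere (for example via the two functions mentioned above). The function name and the `TEXT` drawer name are my own choices.

```elisp
(require 'subr-x) ; for `string-trim' on older Emacs versions

;; Hypothetical helper, not part of org-pdftools or pdf-tools.
;; Assumption: IMAGE-FILE, PDF-LINK and TEXT were produced elsewhere.
(defun my/pdf-region-org-entry (image-file pdf-link text)
  "Return an Org snippet with IMAGE-FILE, PDF-LINK, and TEXT in a drawer."
  (concat (format "[[file:%s]]\n" image-file)    ; link to the extracted image
          (format "[[%s][source]]\n" pdf-link)   ; link back to the PDF location
          ":TEXT:\n"                             ; custom drawer holding the text
          (string-trim text) "\n"
          ":END:\n"))
```

The result could then be inserted into the org buffer with `insert`. Text that contains lines starting with `*` would need extra care, since Org would read those as headings.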

rdiaz02 commented 3 years ago

@theottm, maybe the code in https://github.com/weirdNox/org-noter/issues/81 would be of help here? You get the image and the precise location (i.e., the link) inserted in the org file. But not the text of the rectangle.

If you use org-attach-screenshot (https://github.com/dfeich/org-screenshot), which I mention in that thread, you could alternatively first select the lines you want (not an arbitrary region, though) and press "M-i" or "i", so that the selected text becomes the text of the note. Then (maybe in a subheading of that note), use org-attach-screenshot.
