tmalsburg / helm-bibtex

Search and manage bibliographies in Emacs
GNU General Public License v2.0

Open collection pdf containing article #274

Open mgttlinger opened 6 years ago

mgttlinger commented 6 years ago

If I have the PDF for a whole collection, it would be nice to link all entries in that collection so that they open this PDF at the right point. I can set the PDF with the file field, but then I have to search for the chapter/article manually. I can see that linking this automatically is probably out of scope, given that the right point in the PDF might not even be clear, depending on how the pages are numbered. Different viewers may also require different arguments to open a PDF at a given page. However, I think there is currently no way to set this even manually.

tmalsburg commented 6 years ago

Good idea. I think it will be difficult to add this as a standard feature because people use helm/ivy-bibtex with all kinds of PDF viewers, some of which may allow this and others not. However, I would be happy to add an example to the documentation showing how this could be implemented, e.g., with pdf-tools within Emacs. Shouldn't be too difficult.

mgttlinger commented 6 years ago

Given that I use pdftools this would be much appreciated.

tmalsburg commented 6 years ago

Oh, sorry, my message was ambiguous. I meant I would be happy to add some code in the documentation if someone would provide it to me. :) I currently don't have the time to work on this. Perhaps over the holidays. Work is too hectic at this time.

mgttlinger commented 6 years ago

Oh, OK yeah I misread that. I have a deadline on Monday. After that, I could look into that and see what I can come up with.

mgttlinger commented 5 years ago

Ok, so how would I go about implementing this?

My idea would be this:

  1. In bibtex-completion-open-pdf, check whether the BibTeX entry specifies a starting page, either through an additional key or through a page range.
  2. After calling bibtex-completion-pdf-open-function, check whether the buffer is in pdf-view-mode. (My understanding is that find-file switches to that buffer, allowing me to run commands in that buffer afterwards, right?)
  3. Call pdf-view-goto-label with the starting page.
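
A minimal sketch of the first step, extracting the starting page from a BibTeX `pages` value. It is written in Python purely to illustrate the logic; a real implementation would be in Emacs Lisp inside bibtex-completion. The helper name `start_page` is hypothetical, and the sketch assumes numeric page ranges such as `123--145`.

```python
import re


def start_page(pages_field):
    """Return the first page of a BibTeX `pages` value as an int, or None.

    Assumes arabic page numbers; values such as "xii--xx" yield None.
    """
    # Take the leading run of digits, ignoring surrounding whitespace.
    match = re.match(r"\s*(\d+)", pages_field or "")
    return int(match.group(1)) if match else None


print(start_page("123--145"))
```

The result would then be passed (as a string label) to pdf-view-goto-label in step 3.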

tmalsburg commented 5 years ago

Sounds like a good plan but I think you have to actively switch to the PDF buffer, since the code for opening the PDF will probably be running in some helm (or ivy) buffer.

One issue: the pages of the PDF and the printed page numbers often diverge since (for instance) title pages are counted by the PDF viewer but not for the purposes of the table of contents. This means that pdf-view-goto-label might land us on a page that's earlier than the actual article or abstract. Some PDF viewers are aware of the two separate ways to count pages, but I'm not sure Emacs' PDF viewer is among them.

mgttlinger commented 5 years ago

I think pdf-tools makes this distinction with pdf-view-goto-page vs pdf-view-goto-label. The latter often seemed to me to be the "logical" page number, whereas the page is always the "physical" page. Sometimes the label is also the "physical" page, though.

For correcting this offset my idea was to have an additional field startpage in the bib entry (similar to file) specifying the physical page where an article starts. In case the labels are incorrect, it would override the pages field for this purpose.
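
A rough sketch of that precedence rule, again in Python only to illustrate the logic (the real code would be Emacs Lisp). The function name `resolve_target` and the representation of an entry as a plain dictionary are assumptions; `startpage` is the field proposed above.

```python
import re


def resolve_target(entry):
    """Decide where to jump for a BibTeX entry given as a dict of fields.

    Returns ("physical", n) when a numeric `startpage` is present,
    ("label", s) when only `pages` is usable, and None otherwise.
    """
    startpage = (entry.get("startpage") or "").strip()
    if startpage.isdigit():
        # Manual override: a physical page, suitable for pdf-view-goto-page.
        return ("physical", int(startpage))
    # Fall back to the first page of the range, treated as a page label.
    match = re.match(r"\s*(\w+)", entry.get("pages") or "")
    if match:
        return ("label", match.group(1))
    return None


print(resolve_target({"pages": "123--145", "startpage": "137"}))
```

A "physical" result would map to pdf-view-goto-page and a "label" result to pdf-view-goto-label.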

tmalsburg commented 5 years ago

> For correcting this offset my idea was to have an additional field startpage in the bib entry

As an alternative to a manual offset, you could search for the title of the article in the PDF and jump to the hit that is closest to the page number from the BibTeX entry. Not sure how robust this would be, but perhaps worth a try.
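
The "closest hit" selection could look roughly like this. It is a Python sketch of the logic only; `closest_hit` is a hypothetical name, and the list of physical pages on which the title was found is assumed to come from whatever search facility the viewer provides.

```python
def closest_hit(hit_pages, expected_page):
    """Pick the search hit nearest to the page number from the BibTeX entry.

    hit_pages: physical pages on which the title was found.
    expected_page: first page taken from the entry's `pages` field.
    Ties are resolved in favour of the earlier page. Returns None if
    there are no hits.
    """
    if not hit_pages:
        return None
    return min(hit_pages, key=lambda page: (abs(page - expected_page), page))


# The title also appears in the table of contents (page 3) and in a
# later citation (page 240); the hit near page 125 wins.
print(closest_hit([3, 131, 240], 125))
```

Preferring the earlier page on ties is an arbitrary choice; with front matter shifting physical pages forward, preferring the later page may work just as well.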

mgttlinger commented 5 years ago

Ok, I have a preliminary implementation for this but the difficulty is to detect if a pdf is a journal pdf or not. My current idea is to compare the total number of pages with the start page, but that seems a bit fragile.
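
One way to read that heuristic, sketched in Python for illustration only (this is a guess at the idea, not the preliminary implementation mentioned above; `looks_like_collection` is a hypothetical name):

```python
def looks_like_collection(total_pages, start_page):
    """Guess whether a PDF holds the whole journal/collection.

    If the article's start page lies beyond the end of the PDF, the file
    cannot contain the full volume, so it is assumed to be a standalone
    article. Otherwise it might be the collection.
    """
    return start_page is not None and start_page <= total_pages


print(looks_like_collection(400, 123))
print(looks_like_collection(18, 123))
```

The fragility is easy to see: a standalone 20-page article whose entry says it starts on page 3 would be misclassified as a collection.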

Do you want to have the things I built in helm-bibtex or rather only document, e.g. in the README, how a user could implement that? The issue with the latter approach, imho, is that one needs to redefine helm-bibtex functions, which is likely to break with package updates.

mgttlinger commented 5 years ago

To clarify what I would want to include:

  1. After calling the user-specified function for opening PDFs, run the following steps in the buffer the PDF was opened in (at least in the case of find-file).
  2. Check if the buffer is in pdf-view-mode.
  3. If so, try to find the correct page and move to it.

That approach could be extended to other in-Emacs viewers.

tmalsburg commented 5 years ago

> the difficulty is to detect if a pdf is a journal pdf or not.

Perhaps I misunderstand, but doesn't the entry type (article or inproceedings or ...) give you this information?

> Do you want to have the things I built in helm-bibtex or rather only document

Not sure. Perhaps you could make a PR for bibtex-completion and then I can decide. If I decide to just add a section to the documentation, I could do the necessary work there myself.

mgttlinger commented 5 years ago

> Perhaps I misunderstand, but doesn't the entry type (article or inproceedings or ...) give you this information?

Well, that would be the case if I always had the whole journal, but most of the time I only have the single-article PDF from the author's page.